Accurate, scalable, pay-per-use speech-to-text app. Drop audio into S3, get transcripts back in seconds—no servers to babysit.
This project contains the code for Deepgram's technical guide on building serverless transcription apps: a Lambda function that transcribes audio uploaded to S3 using Deepgram, writing JSON + TXT transcripts back to S3.

The project wires Amazon S3 event notifications to an SQS queue that triggers an AWS Lambda function, which calls Deepgram's REST API to transcribe the audio. The Lambda writes two outputs to S3:
- `transcripts/<name>.json` – full Deepgram response
- `transcripts/<name>.txt` – best-guess transcript (plain text)
By default, Lambda submits a presigned S3 URL to Deepgram so your audio never leaves your bucket except via a signed, time-limited link.
## What's included

- Handler code (Python 3.11) ready for Lambda
- Reference IAM policies for least-privilege access
- Docs for console setup, optional SQS buffering, testing, logs, and alarms
- Troubleshooting playbook for the common gotchas (403s, timeouts, etc.)
## Prerequisites

- AWS account with permissions for S3, Lambda, and IAM (plus SQS if using the buffer)
- Deepgram account + API key (200 USD in credits to get started building: https://deepgram.com/product/speech-to-text)
- A small test audio file (.mp3, .wav, or .m4a)
- Lambda with internet egress (keep it out of a private VPC or add NAT)
## Repository layout

```
.
├─ lambda/
│  └─ lambda_function.py          # Lambda handler
├─ policies/                      # Reference IAM policies for the Lambda execution role
│  ├─ lambda-transcriber-s3.json  # S3 read/write for Lambda
│  └─ lambda-sqs-consumer.json    # SQS consumer perms
└─ README.md
```

Why `s3:GetObject` in the IAM policy on input? It's needed for presigned URL signing (S3 checks the signer's IAM). Why `s3:GetObject` on transcripts? For the idempotency `HeadObject` check.
## Configuration

Set in Lambda → Configuration → Environment variables:

- `DEEPGRAM_API_KEY` (or `DEEPGRAM_SECRET_NAME`)
- `INPUT_PREFIX` (default: `audio-incoming/`)
- `TRANSCRIPTS_PREFIX` (default: `transcripts/`)
- Optional: `DG_MODEL` (e.g., `nova-3`), `DG_LANGUAGE` (e.g., `en`)
- Optional (debug): `SKIP_HEAD_CHECK=true` while IAM is being finalized
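Inside the handler these can be collected with stdlib `os`; a sketch, where the helper name and the fallback values for the optional variables are our assumptions:

```python
import os

def load_config(env=os.environ):
    """Collect the documented environment variables, applying the defaults above."""
    return {
        "api_key": env.get("DEEPGRAM_API_KEY"),  # or resolve DEEPGRAM_SECRET_NAME
        "input_prefix": env.get("INPUT_PREFIX", "audio-incoming/"),
        "transcripts_prefix": env.get("TRANSCRIPTS_PREFIX", "transcripts/"),
        "model": env.get("DG_MODEL", "nova-3"),          # optional
        "language": env.get("DG_LANGUAGE", "en"),        # optional
        "skip_head_check": env.get("SKIP_HEAD_CHECK", "").lower() == "true",
    }
```

Passing `env` explicitly keeps the loader unit-testable without touching the real process environment.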
## Setup (console)

1. Create an S3 bucket with prefixes `audio-incoming/` and `transcripts/`.
2. Create a Lambda function (Python 3.11, arm64). Add a `requests` layer.
3. Set the environment variables above.
4. Attach the policy in `policies/lambda-transcriber-s3.json` to the Lambda execution role.
5. Add an S3 Event Notification: ObjectCreated on prefix `audio-incoming/` ➜ Lambda.
   - Or wire S3 ➜ SQS ➜ Lambda using `policies/lambda-sqs-consumer.json`.
6. Upload a small `.mp3`/`.wav` to `audio-incoming/` and check CloudWatch logs.
7. Verify outputs under `transcripts/` (`.json` + `.txt`).
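The event-notification step can also be done programmatically. A sketch of the configuration it creates (the helper name is ours; applying it additionally requires that S3 has been granted permission to invoke the function):

```python
def notification_config(function_arn, prefix="audio-incoming/"):
    """Build the ObjectCreated -> Lambda notification for the input prefix."""
    return {
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": function_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "prefix", "Value": prefix},
            ]}},
        }]
    }

# Apply with boto3:
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="YOUR_BUCKET",
#     NotificationConfiguration=notification_config("arn:aws:lambda:..."),
# )
```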
## Quick test

```bash
aws s3 cp ./sample.mp3 s3://YOUR_BUCKET/audio-incoming/sample.mp3
```

Then check:

- CloudWatch Logs → `/aws/lambda/<function-name>` → latest stream
- S3 → `transcripts/` should contain `sample.json` and `sample.txt`
## Test events

Direct S3 event:

```json
{ "Records": [{ "s3": { "bucket": { "name": "YOUR_BUCKET" }, "object": { "key": "audio-incoming/hello.mp3" } } }] }
```

SQS carrying an S3 event:

```json
{ "Records": [{ "messageId": "1", "body": "{\"Records\":[{\"s3\":{\"bucket\":{\"name\":\"YOUR_BUCKET\"},\"object\":{\"key\":\"audio-incoming/hello.mp3\"}}}]}" }] }
```
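The handler has to accept both shapes. One way to normalize them (a sketch; the helper name is ours, and keys are `unquote_plus`-decoded because S3 URL-encodes them in event payloads):

```python
import json
from urllib.parse import unquote_plus

def extract_s3_objects(event):
    """Yield (bucket, key) pairs from a direct S3 event or an SQS-wrapped one."""
    for record in event.get("Records", []):
        if "s3" in record:              # direct S3 notification
            s3 = record["s3"]
            yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])
        elif "body" in record:          # SQS message whose body is an S3 event
            inner = json.loads(record["body"])
            yield from extract_s3_objects(inner)
```

Recursing on the decoded body means both trigger paths share the rest of the pipeline unchanged.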
## Monitoring

Key metrics:

- Lambda: Invocations, Errors, Throttles, Duration (p95)
- Lambda (SQS): IteratorAge
- SQS: ApproximateNumberOfMessagesVisible, ApproximateAgeOfOldestMessage
- DLQ: Visible messages
CloudWatch Logs Insights query (latency):

```
fields @timestamp, @message
| parse @message /"dg_request_ms":\s*(\d+)/
| filter ispresent(@1)
| stats count() as requests, avg(@1) as avg_ms, pct(@1, 95) as p95_ms by bin(5m)
| sort @timestamp desc
```

Suggested alarms:
- Lambda Errors ≥ 1 for 2×5-min
- Lambda p95 Duration > 5s for 2×15-min
- (SQS) IteratorAge > 60s for 2×5-min
- (DLQ) Messages ≥ 1 (immediate)
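The first alarm above, expressed as `put_metric_alarm` parameters; a sketch in which the alarm name and SNS topic ARN are placeholders:

```python
def lambda_error_alarm(function_name, sns_topic_arn):
    """Alarm spec: Lambda Errors >= 1 summed over two consecutive 5-minute periods."""
    return dict(
        AlarmName=f"{function_name}-errors",
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        Statistic="Sum",
        Period=300,              # 5-minute period
        EvaluationPeriods=2,     # 2 x 5-min
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        TreatMissingData="notBreaching",
        AlarmActions=[sns_topic_arn],
    )

# Apply with:
# boto3.client("cloudwatch").put_metric_alarm(**lambda_error_alarm("transcriber", topic_arn))
```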
## Troubleshooting

**Deepgram 400 `REMOTE_CONTENT_ERROR` with a 403 in the message**

- Your presigned URL fetch was denied. Make sure the Lambda execution role has `s3:GetObject` on the input prefix (`audio-incoming/*`).
- Avoid bucket policies that Deny `GetObject` based on VPC endpoint or IP; these block external fetchers like Deepgram.
**`HeadObject` 403 during idempotency check**

- Add `s3:GetObject` on the `transcripts/` prefix. Temporarily set `SKIP_HEAD_CHECK=true` to unblock.
**`ModuleNotFoundError: requests`**

- Add a `requests` layer or vendor the dependency into your deployment package.
**Lambda not firing**

- Check the S3 event notification prefix/suffix and that the trigger is attached (or that the SQS path is wired).
**SQS trigger creation error: visibility timeout < function timeout**

- Increase the queue's visibility timeout (e.g., 120–180 s) or shorten the Lambda timeout.
**In a VPC without internet?**

- Add a NAT Gateway or keep the function outside the VPC so it can reach Deepgram.
## Cost notes

- Lambda: the submit path is sub-second, so cost is negligible vs. transcription time.
- Deepgram: billed by audio duration / model—dominant component.
- SQS (optional): per-request; inexpensive for typical workloads.
- Use URL fetch (this guide) so Lambda doesn’t read/stream full files. For very long files, consider Deepgram async + webhooks.
## Security notes

- No bucket policy is required for presigned URLs (Block Public Access can remain ON).
- IAM on the Lambda role is what authorizes presigned access.
- If you enable SSE-KMS on inputs/outputs, also grant the role KMS permissions (Decrypt/GenerateDataKey for reads, Encrypt/GenerateDataKey for writes).
- Do not log presigned URLs or secrets.
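For SSE-KMS, the extra statement on the execution role looks roughly like the following (the key ARN is a placeholder; scope it to your actual key):

```json
{
  "Effect": "Allow",
  "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey"],
  "Resource": "arn:aws:kms:REGION:ACCOUNT_ID:key/KEY_ID"
}
```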
## Next steps

- Switch to Deepgram async + webhook for multi-minute files or heavy batch jobs.
- Add S3 lifecycle (transition to IA/Glacier) and date partitioning for transcripts.
- Push transcripts and metadata into DynamoDB/OpenSearch for search/analytics.
- Package IaC (SAM/Terraform) and a CI pipeline.
- Add diarization, timestamps, and downstream summarization.
