An AWS Lambda function that processes AWS HealthOmics run state change events, archives them to S3, and forwards them to the NGS360 API server for workflow execution tracking.
This serverless application monitors AWS HealthOmics workflow runs and processes state change events in real time. When a HealthOmics run changes state (e.g., starts, completes, fails), this Lambda function:
- Captures the event from EventBridge
- Flattens the nested JSON structure for easier analysis
- Enriches the event with run tags and, for completed, failed, or cancelled runs, log URLs (plus an output file mapping for completed runs)
- Archives the event to an S3 data lake with server-side encryption
- Notifies the NGS360 API server via callback endpoint
AWS HealthOmics → EventBridge → Lambda Function → S3 Data Lake
                                      ↓
                              NGS360 API Server
The Lambda function is deployed in a VPC for secure communication with the NGS360 API server and includes:
- Dead Letter Queue (SNS) for failed executions
- CloudWatch Logs for monitoring
- IAM roles with least-privilege permissions
- ✅ Real-time event processing from AWS HealthOmics
- ✅ Event archival to S3 with AES-256 encryption
- ✅ Callback integration with NGS360 API
- ✅ JSON flattening for simplified event structure
- ✅ Event enrichment with run tags, output file mapping, and log URLs
- ✅ WES run ID extraction from run tags for API integration
- ✅ Configurable verbose logging
- ✅ VPC support for secure networking
- ✅ Dead letter queue for error handling
- ✅ Infrastructure as Code via CloudFormation
- AWS Account with appropriate permissions
- AWS CLI configured with credentials
- Python 3.12
- make (for build automation)
- An S3 bucket for the data lake
- NGS360 API Server URL
- VPC with security group and subnet configured
The stack requires the following parameters (typically defined in parameters.json):
| Parameter | Description | Required | Default |
|---|---|---|---|
| `DeadLetterEmail` | Email address for failed execution notifications | Yes | - |
| `ApiServer` | NGS360 API Server URL | Yes | - |
| `DataLakeBucket` | S3 bucket name for event storage | Yes | - |
| `BucketPrefix` | S3 prefix/folder for events | No | `omics-run-events` |
| `VerboseLogging` | Enable debug-level logging | No | `false` |
| `SecurityGroupId` | Security group ID for Lambda VPC | Yes | - |
| `SubnetId` | Subnet ID for Lambda VPC | Yes | - |
[
  {
    "ParameterKey": "DeadLetterEmail",
    "ParameterValue": "alerts@example.com"
  },
  {
    "ParameterKey": "ApiServer",
    "ParameterValue": "https://api.ngs360.example.com"
  },
  {
    "ParameterKey": "DataLakeBucket",
    "ParameterValue": "my-ngs360-data-lake"
  },
  {
    "ParameterKey": "SecurityGroupId",
    "ParameterValue": "sg-0123456789abcdef0"
  },
  {
    "ParameterKey": "SubnetId",
    "ParameterValue": "subnet-0123456789abcdef0"
  }
]

# Set required environment variables
export DATA_LAKE_BUCKET=your-bucket-name
export BUCKET_PREFIX=omics-run-events
# Build and deploy
make cf-create

This command will:
- Create the Lambda deployment package with dependencies
- Upload the package to S3
- Create the CloudFormation stack with all resources

To update an existing stack:

make cf-update

If you prefer to deploy without make:
# 1. Create deployment package
rm -rf lambda-package
mkdir lambda-package
cd lambda-package
cp ../lambda.py .
pip3 install -r ../requirements.txt -t .
zip -r ../lambda-package.zip .
cd ..
# 2. Upload to S3
aws s3 cp lambda-package.zip s3://your-bucket/omics-run-events/lambda-package.zip --sse
# 3. Deploy CloudFormation stack
aws cloudformation create-stack \
  --stack-name ngs360-omics-run-event-processor \
  --template-body file://ngs360-omics-run-event-processor.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters file://parameters.json

The Lambda function processes EventBridge events from AWS HealthOmics. To set up event routing:
- Create an EventBridge rule to trigger this Lambda function
- Configure the rule to filter for HealthOmics state change events
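
The two steps above could also be scripted. The following is a minimal sketch, assuming a boto3 EventBridge client (`boto3.client("events")`) is passed in; the rule name and target ID are placeholders invented for illustration, not names used by this project.

```python
import json

# Event pattern from this README; the rule name and target ID below are
# hypothetical placeholders.
EVENT_PATTERN = {
    "source": ["aws.omics"],
    "detail-type": ["Run Status Change"],
}


def create_rule(events_client, lambda_arn, rule_name="omics-run-status-change"):
    """Create (or update) the rule and attach the Lambda function as its target."""
    rule = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "omics-run-event-processor", "Arn": lambda_arn}],
    )
    return rule["RuleArn"]
```

EventBridge also needs permission to invoke the function (a resource-based policy on the Lambda for `events.amazonaws.com`); check whether the CloudFormation template already grants it before adding one manually.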
Example EventBridge rule pattern:
{
  "source": ["aws.omics"],
  "detail-type": ["Run Status Change"]
}

Events from HealthOmics contain nested structures with run information, status, and metadata.
The flatten() function processes the nested JSON into a single-level dictionary for easier querying and analysis in downstream systems.
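
As an illustration of the idea, here is a minimal flattening sketch. It is not the actual `flatten()` from `lambda.py`: the underscore separator, the handling of lists by index, and the sample event values are all assumptions.

```python
def flatten(obj, parent_key="", sep="_"):
    """Collapse nested dicts and lists into a single-level dict.

    Nested keys are joined with `sep`; list items use their index as the key.
    Empty dicts and lists produce no entries.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        items[parent_key] = obj
    return items


# Illustrative event only; real HealthOmics events carry more fields.
sample_event = {
    "source": "aws.omics",
    "detail-type": "Run Status Change",
    "resources": ["arn:aws:omics:us-east-1:123456789012:run/1234567"],
    "detail": {"status": "COMPLETED"},
}

flat = flatten(sample_event)
print(flat["detail_status"])
```

With this scheme, `detail.status` becomes the key `detail_status` and the first entry of `resources` becomes `resources_0`.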
For all events, the Lambda function:
- Retrieves Tags: Gets the run tags from AWS HealthOmics, including the `WESRunId` tag, which is used to link the run to the corresponding WES run ID
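
A tag lookup along these lines might look like the sketch below. It assumes the tags are read from the `GetRun` response through a boto3 HealthOmics client (`boto3.client("omics")`) passed in by the caller; the helper name is hypothetical and the real function may fetch tags differently.

```python
def get_run_tags(omics_client, run_id, tag_key="WESRunId"):
    """Return (tags, wes_run_id) for a run; wes_run_id is None if the tag is absent."""
    response = omics_client.get_run(id=run_id)
    tags = response.get("tags", {})
    return tags, tags.get(tag_key)
```

Returning `None` for a missing tag lets the caller decide whether an untagged run should still be forwarded to the API.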
For events with status COMPLETED, FAILED, or CANCELLED, the Lambda function additionally adds:
- Log URLs: Links to CloudWatch logs for the run, tasks, and manifest
- Output File Mapping: For COMPLETED events, a mapping of output names to S3 URIs
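
The status rules above can be summarized in a small sketch. The function and step names are hypothetical and only mirror the rules as described here, not the structure of `lambda.py`.

```python
# Statuses for which log URLs are added, per the description above.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def enrichment_steps(status):
    """Return the enrichment steps that apply to an event with the given status."""
    steps = ["tags"]  # tags are retrieved for every event
    if status in TERMINAL_STATUSES:
        steps.append("log_urls")
    if status == "COMPLETED":
        steps.append("output_mapping")
    return steps
```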
Events are stored in S3 as:
s3://{DATA_LAKE_BUCKET}/{S3_PREFIX}/event_{YYYYMMDD_HHMMSS}_{UUID}.json
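
A sketch of how such a key could be built and written with AES-256 server-side encryption is shown below. It assumes a UTC timestamp and a random UUID4, and takes a boto3 S3 client (`boto3.client("s3")`) as an argument; the function names are illustrative.

```python
import uuid
from datetime import datetime, timezone


def build_event_key(prefix, now=None, event_id=None):
    """Build '{prefix}/event_{YYYYMMDD_HHMMSS}_{UUID}.json'."""
    now = now or datetime.now(timezone.utc)  # assumption: timestamps are UTC
    event_id = event_id or uuid.uuid4()
    return f"{prefix}/event_{now.strftime('%Y%m%d_%H%M%S')}_{event_id}.json"


def archive_event(s3_client, bucket, prefix, body):
    """Write the event JSON to S3 with AES-256 server-side encryption."""
    key = build_event_key(prefix)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ContentType="application/json",
        ServerSideEncryption="AES256",
    )
    return key
```

The UUID suffix keeps keys unique when several events arrive within the same second.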
The function calls the NGS360 API endpoint:
POST {API_SERVER}/internal/callbacks/omics-state-change
Content-Type: application/json
The enhanced event JSON is sent as the request body with a 10-second timeout.
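
The callback could be sketched as follows. The helper is hypothetical: it takes the HTTP function as a parameter (for example `requests.post`) so it can be exercised without network access, and raising on HTTP errors is an assumption about the desired behavior.

```python
def notify_api(post, api_server, event, timeout=10):
    """POST the enriched event to the NGS360 callback endpoint."""
    url = f"{api_server.rstrip('/')}/internal/callbacks/omics-state-change"
    # Passing json= makes requests serialize the body and set
    # Content-Type: application/json.
    response = post(url, json=event, timeout=timeout)
    response.raise_for_status()  # assumption: HTTP errors should fail the invocation
    return response.status_code
```

In the Lambda this would be called as `notify_api(requests.post, os.environ["API_SERVER"], event)`.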
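import importlib
import json
import os

# Set required environment variables before loading the handler module
os.environ['API_SERVER'] = 'https://api.example.com'
os.environ['DATA_LAKE_BUCKET'] = 'test-bucket'

# `lambda` is a reserved word in Python, so `from lambda import ...` is a
# syntax error; load lambda.py by name instead
lambda_module = importlib.import_module('lambda')

# Load a sample event
with open('sample-event.json', 'r') as f:
    event = json.load(f)

# Invoke handler
response = lambda_module.lambda_handler(event, None)
print(response)

The Lambda function uses: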
- boto3 (AWS SDK) - Pre-installed in Lambda runtime
- requests - HTTP client for API calls (see `requirements.txt`)
Set VerboseLogging to true in stack parameters to enable DEBUG-level logging. By default, the function logs at INFO level.
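
Inside the function, the switch might be wired up roughly like this. The sketch assumes the stack parameter reaches the code as a string (for example through an environment variable such as `VERBOSE_LOGGING`, a name assumed here for illustration).

```python
import logging


def resolve_log_level(verbose_value):
    """Map the VerboseLogging setting to a logging level (INFO unless 'true')."""
    is_verbose = str(verbose_value).strip().lower() == "true"
    return logging.DEBUG if is_verbose else logging.INFO


def configure_logger(verbose_value, name="omics-run-event-processor"):
    """Return a logger set to DEBUG when verbose logging is enabled."""
    logger = logging.getLogger(name)
    logger.setLevel(resolve_log_level(verbose_value))
    return logger
```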
Logs are written to: /aws/lambda/ngs360/omics-run-event-processor
Failed executions are published to the SNS dead letter topic, which sends an email notification to the configured `DeadLetterEmail` address.
Metrics worth monitoring:
- Lambda invocation count
- Lambda error rate
- Lambda duration
- DLQ message count
- S3 PutObject success/failure
The Lambda execution role has permissions for:
- VPC access (AWSLambdaVPCAccessExecutionRole)
- SNS publish (dead letter queue)
- S3 PutObject and GetObject (data lake storage)
- AWS HealthOmics API access (GetRun, ListRuns)
- 🔒 Lambda runs in VPC for network isolation
- 🔒 S3 objects encrypted with AES-256
- 🔒 IAM roles follow least-privilege principle
- 🔒 No sensitive data in logs (non-verbose mode)
Default timeout is 900 seconds (15 minutes). Adjust in CloudFormation template if needed.
Ensure the security group allows outbound HTTPS traffic to:
- S3 (via VPC endpoint or NAT Gateway)
- AWS HealthOmics API
- NGS360 API Server
Check CloudWatch Logs for HTTP error responses. Verify API_SERVER URL and network connectivity.
For COMPLETED events, check if the output mapping file exists at the expected S3 path:
s3://{bucket}/{prefix}/{run_id}/logs/outputs.json
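
To check that file by hand, a short script along these lines can help. It is a sketch that assumes a boto3 S3 client (`boto3.client("s3")`) is passed in and that `bucket` and `prefix` are whatever location the run's outputs were written to; the helper name is hypothetical.

```python
import json


def load_output_mapping(s3_client, bucket, prefix, run_id):
    """Fetch and parse {prefix}/{run_id}/logs/outputs.json from S3.

    Any error from get_object (for example a missing key or denied access)
    propagates to the caller, which is usually what you want when debugging.
    """
    key = f"{prefix}/{run_id}/logs/outputs.json"
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(obj["Body"].read())
```

If the call fails with an access error rather than a missing key, compare the bucket against the S3 permissions of the Lambda execution role.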
.
├── lambda.py                               # Lambda function code
├── requirements.txt                        # Python dependencies
├── ngs360-omics-run-event-processor.yaml   # CloudFormation template
├── Makefile                                # Build and deployment automation
└── README.md                               # This file
Copyright © 2026 NGS360. All rights reserved.
For issues or questions, please contact the NGS360 development team or open an issue in this repository.