Terraform for Embryo Scoring and Reproductive Genomics
Provision reproductive-genomics ML infrastructure with Terraform: secure compute, data governance, ML pipelines, privacy controls, and regulated storage.
Provision HIPAA-aligned genomics infrastructure with Terraform: secure data lakes, AWS HealthOmics workflows, audit logging, and compliant compute.
Gene editing and personalized medicine are reshaping 2026 healthcare. Sequencing is cheap; compliant compute is the bottleneck. Hospitals and biotechs need HIPAA-aligned data lakes, AWS HealthOmics workflow runners, audited access, and isolated environments per study. Terraform turns those building blocks into a reproducible "genomics stack."
This guide shows how to provision a personalized-medicine genomics backend on AWS.
| Layer | AWS service |
|---|---|
| Patient sequence storage | HealthOmics Sequence Stores |
| Variant storage | HealthOmics Variant Stores |
| Workflows | HealthOmics Workflows |
| Annotation lake | S3 + Glue + Athena |
| PHI access | IAM + Lake Formation + CloudTrail |
| Compute | Batch with EFA / FSx |
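
The resources that follow reference a KMS key (aws_kms_key.phi) and a reference store (aws_omics_reference_store.grch38) that the article never defines. A minimal sketch of what those definitions might look like is shown here; the descriptions, rotation setting, and deletion window are assumptions, not values from the original.

```hcl
# Assumed definitions for names the article references but does not define.
resource "aws_kms_key" "phi" {
  description             = "Customer-managed key for PHI genomics data"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

# Reference store intended to hold the GRCh38 reference genome.
resource "aws_omics_reference_store" "grch38" {
  name        = "grch38"
  description = "GRCh38 reference genomes"

  sse_config {
    type    = "KMS"
    key_arn = aws_kms_key.phi.arn
  }
}
```

Keeping one customer-managed key for all PHI stores makes key policy and audit review simpler, at the cost of a single blast radius if the key is misconfigured.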
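resource "aws_omics_sequence_store" "patient_seq" {
  name        = "patient-sequences"
  description = "Primary patient FASTQ/BAM/CRAM"

  # "KMS" is the only accepted sse_config type. With no key_arn, HealthOmics
  # encrypts with an AWS-owned key; set key_arn = aws_kms_key.phi.arn to use
  # a customer-managed key instead.
  sse_config {
    type = "KMS"
  }
}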

resource "aws_omics_variant_store" "germline" {
  name = "germline-variants"

  # The nested attribute is named arn, not reference_arn.
  reference {
    arn = aws_omics_reference_store.grch38.arn
  }

  sse_config {
    type    = "KMS"
    key_arn = aws_kms_key.phi.arn
  }
}

resource "aws_omics_workflow" "secondary_analysis" {
  name             = "germline-secondary-analysis"
  description      = "BWA-MEM2 + DeepVariant"
  engine           = "WDL"
  storage_capacity = 1200 # GiB of run storage
  definition_uri   = "s3://${aws_s3_bucket.workflows.bucket}/germline.zip"

  parameter_template = jsonencode({
    sample_id  = { description = "Sample identifier", optional = false }
    fastq_uris = { description = "FASTQ files", optional = false }
    reference  = { description = "Reference genome", optional = false }
  })
}

resource "aws_s3_bucket" "annotations" {
  bucket = "acme-genomics-annotations"
}
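
The annotations bucket above is declared with a name only. For a HIPAA-aligned lake, it would typically also block public access and encrypt by default. The following is a sketch under that assumption, reusing the aws_kms_key.phi key referenced elsewhere in the article; the resource labels are illustrative.

```hcl
# Sketch: block all public access to the annotations bucket.
resource "aws_s3_bucket_public_access_block" "annotations" {
  bucket                  = aws_s3_bucket.annotations.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Sketch: default-encrypt new objects with the PHI KMS key.
resource "aws_s3_bucket_server_side_encryption_configuration" "annotations" {
  bucket = aws_s3_bucket.annotations.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.phi.arn
    }
    bucket_key_enabled = true # reduces KMS request volume
  }
}
```
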
resource "aws_lakeformation_resource" "annotations" {
  arn      = aws_s3_bucket.annotations.arn
  role_arn = aws_iam_role.lake_formation.arn
}

resource "aws_glue_catalog_database" "genomics" {
  name = "genomics"
}

resource "aws_lakeformation_permissions" "researcher_read" {
  for_each    = toset(var.researchers)
  principal   = each.value
  permissions = ["SELECT"]

  table_with_columns {
    database_name         = aws_glue_catalog_database.genomics.name
    name                  = "variants"
    wildcard              = true # required alongside excluded_column_names
    excluded_column_names = ["patient_id", "mrn", "dob"]
  }
}

The exclusion list is the trick: researchers can query every column of the variants table except the excluded PHI columns (patient_id, mrn, dob).
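
The Lake Formation grant iterates over var.researchers, and the architecture table lists Athena as the query layer, but neither is declared in the article. One possible shape for both is sketched here. The workgroup name and the athena-results/ prefix are hypothetical, and the sketch assumes query results may land in the annotations bucket.

```hcl
# Assumed input: IAM principal ARNs that receive column-filtered read access.
variable "researchers" {
  description = "IAM principal ARNs granted SELECT on non-PHI columns"
  type        = list(string)
}

# Sketch: a workgroup that forces encrypted query results for researchers.
resource "aws_athena_workgroup" "researchers" {
  name = "genomics-researchers" # hypothetical name

  configuration {
    # Prevent clients from overriding the result location or encryption.
    enforce_workgroup_configuration = true

    result_configuration {
      output_location = "s3://${aws_s3_bucket.annotations.bucket}/athena-results/"

      encryption_configuration {
        encryption_option = "SSE_KMS"
        kms_key_arn       = aws_kms_key.phi.arn
      }
    }
  }
}
```

Enforcing the workgroup configuration matters here because query results can contain the same rows the column filter was meant to protect downstream.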

resource "aws_cloudtrail_event_data_store" "phi" {
  name                           = "phi-audit"
  multi_region_enabled           = true
  retention_period               = 2557 # days (7 years)
  termination_protection_enabled = true

  advanced_event_selector {
    name = "PHI bucket data events"

    field_selector {
      field  = "eventCategory"
      equals = ["Data"]
    }

    field_selector {
      field  = "resources.type"
      equals = ["AWS::S3::Object"]
    }

    field_selector {
      field       = "resources.ARN"
      starts_with = ["${aws_s3_bucket.phi.arn}/"]
    }
  }
}

resource "aws_batch_compute_environment" "genomics" {
  compute_environment_name = "genomics-secondary"
  type                     = "MANAGED"
  service_role             = aws_iam_role.batch.arn

  compute_resources {
    type               = "FARGATE"
    max_vcpus          = 4096
    subnets            = var.private_subnet_ids # no NAT gateway, only VPC endpoints
    security_group_ids = [aws_security_group.batch_no_egress.id]
  }
}

The data-classification=phi tag is enforced via a service control policy (SCP).
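
The article does not show the SCP itself. One way such a policy might be written is sketched here: it denies creation of HealthOmics stores unless the request carries data-classification=phi. The variable phi_ou_id, the policy name, and the choice of actions are all assumptions, and the sketch presumes these actions honor the aws:RequestTag condition key, which should be confirmed in the Service Authorization Reference before use.

```hcl
# Hypothetical input: the organizational unit that holds PHI accounts.
variable "phi_ou_id" {
  description = "ID of the OU whose accounts must tag genomics stores as PHI"
  type        = string
}

resource "aws_organizations_policy" "require_phi_tag" {
  name = "require-data-classification-phi" # hypothetical name
  type = "SERVICE_CONTROL_POLICY"

  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid      = "DenyUntaggedGenomicsStores"
      Effect   = "Deny"
      Action   = ["omics:CreateSequenceStore", "omics:CreateVariantStore"]
      Resource = "*"
      # A negated operator also matches when the tag is absent, so requests
      # with a missing or different tag value are both denied.
      Condition = {
        StringNotEquals = {
          "aws:RequestTag/data-classification" = "phi"
        }
      }
    }]
  })
}

resource "aws_organizations_policy_attachment" "phi_ou" {
  policy_id = aws_organizations_policy.require_phi_tag.id
  target_id = var.phi_ou_id
}
```

With a policy like this attached, the store resources in this guide would also need a matching tags argument (or provider-level default tags) for their creation calls to succeed.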