Terraform for Domain-Specific LLMs: GPU Endpoints, Private Data, and Fine-Tuning Pipelines
DevOps
Provision domain-specific LLM infrastructure with Terraform: GPU inference endpoints, private data stores, fine-tuning pipelines, and isolated environments.
Domain-specific language models are a 2026 trend reshaping enterprise AI. Instead of one giant general-purpose model, organizations fine-tune smaller models on legal, medical, financial, or industrial corpora — and run them in compliance-isolated environments. Terraform provisions the GPU endpoints, private data stores, and fine-tuning pipelines that make this repeatable.
This guide shows how to build a domain-specific LLM platform on AWS with Terraform.
| Layer | AWS service |
|---|---|
| Private data lake | S3 + Lake Formation |
| Fine-tuning | SageMaker Training Jobs |
| Model registry | SageMaker Model Registry |
| Inference | SageMaker async / serverless / real-time endpoints |
| Isolation | VPC endpoints, KMS, IAM, PrivateLink |
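The snippets below share a handful of inputs. A minimal `variables.tf`, assuming names that match your environment:

```hcl
# Inputs referenced throughout the snippets below.
variable "env" {
  description = "Deployment environment, e.g. dev or prod"
  type        = string
}

variable "region" {
  description = "AWS region for ECR images and VPC endpoints"
  type        = string
}

variable "run_id" {
  description = "Unique identifier for a fine-tuning run"
  type        = string
}

variable "vpc_id" {
  description = "VPC hosting the isolated SageMaker workloads"
  type        = string
}

variable "private_subnet_ids" {
  description = "Private subnets for training, inference, and endpoints"
  type        = list(string)
}

variable "inference_image" {
  description = "ECR URI of the inference container"
  type        = string
}
```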
resource "aws_s3_bucket" "training_corpus" {
bucket = "acme-legal-corpus-${var.env}"
}
resource "aws_s3_bucket_server_side_encryption_configuration" "corpus" {
bucket = aws_s3_bucket.training_corpus.id
rule {
apply_server_side_encryption_by_default {
kms_master_key_id = aws_kms_key.corpus.arn
sse_algorithm = "aws:kms"
}
bucket_key_enabled = true
}
}
resource "aws_s3_bucket_public_access_block" "corpus" {
bucket = aws_s3_bucket.training_corpus.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_kms_key" "corpus" {
description = "Encrypts domain-specific training data"
enable_key_rotation = true
}resource "aws_sagemaker_training_job" "legal_finetune" {
Next, the fine-tuning job itself. One caveat: at the time of writing, the hashicorp/aws provider does not expose SageMaker training jobs as a managed resource, so read this block as a sketch of the CreateTrainingJob parameters rather than copy-paste Terraform; in practice the job is wrapped in an `aws_sagemaker_pipeline` definition or launched from CI with the AWS CLI or SDK. The isolation flags at the bottom are the compliance-critical part.

```hcl
resource "aws_sagemaker_training_job" "legal_finetune" {
  training_job_name = "legal-llm-${var.run_id}"
role_arn = aws_iam_role.sagemaker.arn
algorithm_specification {
training_image = "763104351884.dkr.ecr.${var.region}.amazonaws.com/huggingface-pytorch-training:2.3-transformers4.46-gpu-py311-cu124-ubuntu22.04"
training_input_mode = "File"
}
resource_config {
instance_type = "ml.p4d.24xlarge"
instance_count = 2
volume_size_in_gb = 500
}
input_data_config {
channel_name = "train"
data_source {
s3_data_source {
s3_data_type = "S3Prefix"
s3_uri = "s3://${aws_s3_bucket.training_corpus.bucket}/processed/"
}
}
}
output_data_config {
s3_output_path = "s3://${aws_s3_bucket.models.bucket}/output/"
kms_key_id = aws_kms_key.corpus.arn
}
vpc_config {
subnets = var.private_subnet_ids
security_group_ids = [aws_security_group.sagemaker.id]
}
enable_inter_container_traffic_encryption = true
enable_network_isolation = true
}resource "aws_sagemaker_model" "legal_llm" {
name = "legal-llm-v1"
execution_role_arn = aws_iam_role.sagemaker.arn
primary_container {
image = var.inference_image
model_data_url = "s3://${aws_s3_bucket.models.bucket}/output/model.tar.gz"
}
vpc_config {
subnets = var.private_subnet_ids
security_group_ids = [aws_security_group.sagemaker.id]
}
}
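# The architecture table lists SageMaker Model Registry: a model package
# group is the minimal registry anchor for versioned fine-tunes. The group
# name here is an assumption.
resource "aws_sagemaker_model_package_group" "legal_llm" {
  model_package_group_name = "legal-llm"
}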
resource "aws_sagemaker_endpoint_configuration" "legal_llm" {
name = "legal-llm-config"
production_variants {
variant_name = "AllTraffic"
model_name = aws_sagemaker_model.legal_llm.name
instance_type = "ml.g5.12xlarge"
initial_instance_count = 2
}
kms_key_arn = aws_kms_key.corpus.arn
}
resource "aws_sagemaker_endpoint" "legal_llm" {
name = "legal-llm"
endpoint_config_name = aws_sagemaker_endpoint_configuration.legal_llm.name
}resource "aws_vpc_endpoint" "sagemaker_runtime" {
Finally, keep inference traffic off the public internet: an interface endpoint for the SageMaker runtime API lets clients in the private subnets invoke the model over PrivateLink.

```hcl
resource "aws_vpc_endpoint" "sagemaker_runtime" {
  vpc_id              = var.vpc_id
service_name = "com.amazonaws.${var.region}.sagemaker.runtime"
vpc_endpoint_type = "Interface"
subnet_ids = var.private_subnet_ids
security_group_ids = [aws_security_group.endpoints.id]
private_dns_enabled = true
}
```
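The endpoint admits traffic through `aws_security_group.endpoints`; a matching sketch (name prefix again an assumption) that allows HTTPS only from the SageMaker workloads:

```hcl
# Allow HTTPS to the interface endpoints from SageMaker ENIs only.
resource "aws_security_group" "endpoints" {
  name_prefix = "vpc-endpoints-" # hypothetical prefix
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.sagemaker.id]
  }
}
```

With the corpus, the fine-tuning job, the endpoint, and the network path all declared in Terraform, the platform is repeatable: point the pipeline at a different corpus and the next domain model is a plan and apply away.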