Terraform for AI Infrastructure Optimization: Cost-Efficient Model Deployment on AWS
Optimize AI infrastructure costs with Terraform. Deploy right-sized inference endpoints, auto-scale based on token throughput, use Spot instances
Estimate infrastructure costs before terraform apply with Infracost. See cost diffs in pull requests and set budget policies.
You run terraform apply and get a $500/month surprise on next month's AWS bill. Nobody reviewed the cost before deploying.
Infracost shows you the cost of your Terraform changes before you apply them:
$ infracost breakdown --path .

Name                                        Monthly Cost

aws_instance.web
├─ Instance usage (t3.large, on-demand)           $60.74
├─ root_block_device (gp3, 50 GB)                  $4.00
└─ CPU credits                                     $0.00

aws_db_instance.main
├─ Database instance (db.r6g.large)              $131.40
└─ Storage (gp3, 100 GB)                          $11.50

aws_nat_gateway.main
├─ NAT gateway                                    $32.85
└─ Data processed (100 GB)                         $4.50

OVERALL TOTAL                                    $244.99

# macOS
brew install infracost
# Linux
curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh
# Docker
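# Mount the current directory at /code and pass INFRACOST_API_KEY from your shell
docker run -it --rm -v "$PWD:/code" -e INFRACOST_API_KEY infracost/infracost breakdown --path /code

Get a free API key: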
infracost auth login
# Opens browser → sign up → key saved to ~/.config/infracost/credentials.yml

# Full cost breakdown
infracost breakdown --path .
# JSON output
infracost breakdown --path . --format json
# Specific Terraform plan
terraform plan -out=plan.tfplan
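infracost breakdown --path plan.tfplan

See the cost impact of your changes:

# Compare current branch to main.
# --compare-to expects an Infracost JSON file, not a branch name,
# so generate a baseline from main first.
git checkout main
infracost breakdown --path . --format json --out-file infracost-base.json
git checkout -
infracost diff --path . --compare-to infracost-base.json

Monthly cost will increase by $156.00
+ aws_instance.api
  +$60.74 (new resource)
~ aws_db_instance.main
  +$95.26 (db.r6g.large → db.r6g.xlarge)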
Monthly cost: $244.99 → $400.99

infracost:
  stage: validate
  image:
    name: infracost/infracost:ci-0.10
    entrypoint: [""]
  script:
    - infracost breakdown --path . --format json --out-file infracost.json
    - infracost output --path infracost.json --format gitlab-comment --out-file comment.md
    - |
      # Post comment to MR. jq -Rs already emits a quoted JSON string,
      # so it must not be wrapped in another pair of quotes.
      curl -X POST \
        -H "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
        -H "Content-Type: application/json" \
        -d "{\"body\": $(jq -Rs . < comment.md)}" \
        "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

- name: Infracost
  uses: infracost/actions/setup@v3
  with:
    api-key: ${{ secrets.INFRACOST_API_KEY }}
- name: Generate cost estimate
  run: |
    infracost breakdown --path . --format json --out-file /tmp/infracost.json
    infracost comment github --path /tmp/infracost.json \
      --repo $GITHUB_REPOSITORY \
      --pull-request ${{ github.event.pull_request.number }} \
      --github-token ${{ github.token }}

This posts a cost table directly in your pull request.
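Beyond posting a comment, a pipeline step can fail the job when the estimate exceeds a budget. The following is a minimal sketch of that idea, not an Infracost feature. It assumes the JSON report written by infracost breakdown --format json has a top-level totalMonthlyCost field holding a decimal string; verify that against the output of your Infracost version. The names load_report and check_budget and the 1000 USD limit are illustrative.

```python
import json
from decimal import Decimal


def load_report(path: str) -> dict:
    """Read a JSON report such as /tmp/infracost.json from disk."""
    with open(path) as f:
        return json.load(f)


def check_budget(report: dict, limit: Decimal) -> bool:
    """Return True when the report's total monthly cost is within the limit.

    Assumption: `totalMonthlyCost` is a top-level decimal string.
    A missing or null value is treated as zero.
    """
    total = Decimal(report.get("totalMonthlyCost") or "0")
    return total <= limit


# Inline sample using the totals from the diff shown earlier.
before = {"totalMonthlyCost": "244.99"}
after = {"totalMonthlyCost": "400.99"}
print(check_budget(before, Decimal("1000")))  # within a 1000 USD budget
print(check_budget(after, Decimal("300")))    # over a 300 USD budget
```

In CI, a small wrapper could call load_report on the generated file and exit with a non-zero status when check_budget returns False, which blocks the merge until someone reviews the cost.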
# infracost.yml
version: 0.1
projects:
  - path: .
policies:
  - name: "Monthly cost limit"
    description: "Total monthly cost must not exceed $1000"
    resource_type: "*"
    condition:
      monthly_cost: "<= 1000"

infracost breakdown --path . --format json | infracost policy check --policy-file infracost.yml

Tag resources with cost centers:
resource "aws_instance" "web" {
  instance_type = "t3.large"
  tags = {
    CostCenter = "engineering"
    Project    = "api"
  }
}

Then track costs by tag in your FinOps dashboard.
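To illustrate what tracking costs by tag amounts to, here is a small sketch that groups per-resource monthly costs by the value of one tag. The record shape (name, monthlyCost, tags) is an assumption made for this example and does not correspond to a specific dashboard API; the function name monthly_cost_by_tag is hypothetical, and the sample figures are the per-resource sums from the breakdown earlier in this article.

```python
from collections import defaultdict
from decimal import Decimal


def monthly_cost_by_tag(resources: list[dict], tag_key: str) -> dict[str, Decimal]:
    """Sum monthly cost per value of `tag_key`.

    Resources without that tag (or without any tags) are grouped under
    the key "untagged" so that unattributed spend stays visible.
    """
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for res in resources:
        tag_value = (res.get("tags") or {}).get(tag_key, "untagged")
        totals[tag_value] += Decimal(res.get("monthlyCost") or "0")
    return dict(totals)


# Hypothetical sample records, not real tool output.
sample = [
    {"name": "aws_instance.web", "monthlyCost": "64.74",
     "tags": {"CostCenter": "engineering", "Project": "api"}},
    {"name": "aws_db_instance.main", "monthlyCost": "142.90",
     "tags": {"CostCenter": "engineering"}},
    {"name": "aws_nat_gateway.main", "monthlyCost": "37.35", "tags": None},
]

for cost_center, total in monthly_cost_by_tag(sample, "CostCenter").items():
    print(f"{cost_center}: ${total}")
```

The "untagged" bucket is a deliberate choice: a large untagged total is usually the first sign that a tagging policy is not being followed.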
# Check: are you over-provisioned?
variable "instance_type" {
  type    = string
  default = "t3.micro" # Start small, scale up based on metrics
  validation {
    condition     = contains(["t3.micro", "t3.small", "t3.medium", "t3.large"], var.instance_type)
    error_message = "Use approved instance types only."
  }
}

resource "aws_spot_instance_request" "worker" {
  ami                            = var.ami_id
  instance_type                  = "c5.xlarge"
  spot_price                     = "0.08" # Maximum hourly price; Spot can save 60-90% vs. on-demand
  wait_for_fulfillment           = true
  instance_interruption_behavior = "stop"
}

# Tag dev/staging for auto-shutdown
resource "aws_instance" "dev" {
  # ami, instance_type, and other arguments omitted
  tags = {
    AutoStop = "true"
    Schedule = "office-hours" # Used by AWS Instance Scheduler
  }
}

| Pricing | t3.large monthly | Savings |
|---|---|---|
| On-Demand | $60.74 | — |
| 1-Year Reserved (No Upfront) | $38.69 | 36% |
| 1-Year Reserved (All Upfront) | $35.04 | 42% |
| 3-Year Reserved (All Upfront) | $22.34 | 63% |
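The savings column follows directly from the monthly prices. As a quick check, this sketch recomputes each percentage from the figures in the table; the helper name savings_percent is illustrative.

```python
def savings_percent(on_demand: float, discounted: float) -> int:
    """Percentage saved relative to the on-demand price, rounded to a whole number."""
    return round((1 - discounted / on_demand) * 100)


ON_DEMAND = 60.74  # t3.large monthly on-demand price from the table

reserved_options = [
    ("1-Year Reserved (No Upfront)", 38.69),
    ("1-Year Reserved (All Upfront)", 35.04),
    ("3-Year Reserved (All Upfront)", 22.34),
]

for label, price in reserved_options:
    yearly_saving = (ON_DEMAND - price) * 12
    print(f"{label}: {savings_percent(ON_DEMAND, price)}% "
          f"(about ${yearly_saving:.2f} per instance per year)")
```

The same function works for any instance type once you substitute its on-demand and reserved prices.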
# gp3 is 20% cheaper with better baseline performance
root_block_device {
  volume_type = "gp3" # Not "gp2"
  volume_size = 50
}

| Resource | Typical Monthly Cost |
|---|---|
| t3.micro | $7.59 |
| t3.large | $60.74 |
| NAT Gateway | $32.85 + data |
| ALB | $16.43 + LCU |
| RDS db.t3.medium (Postgres) | $49.06 |
| RDS db.r6g.large (Postgres) | $131.40 |
| S3 (100 GB) | $2.30 |
| EBS gp3 (100 GB) | $8.00 |
| ElastiCache cache.t3.medium | $46.72 |
| Lambda (1M requests) | $0.20 |
The biggest surprises are usually NAT Gateways (per-GB data charges on top of the hourly fee), ALBs, and cross-AZ traffic.
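To see why NAT Gateways stand out, it helps to separate the fixed charge from the per-GB charge. This sketch uses only the two figures from the Infracost breakdown earlier in this article ($32.85 fixed, $4.50 per 100 GB processed); actual rates vary by region, and the function name is illustrative.

```python
NAT_FIXED_MONTHLY = 32.85  # hourly gateway charge over one month, from the breakdown
NAT_PER_GB = 4.50 / 100    # rate implied by "Data processed (100 GB) $4.50"


def nat_gateway_monthly_cost(processed_gb: float) -> float:
    """Fixed monthly charge plus the data processing charge, in USD."""
    return round(NAT_FIXED_MONTHLY + NAT_PER_GB * processed_gb, 2)


for gb in (100, 1_000, 10_000):
    print(f"{gb:>6} GB processed: ${nat_gateway_monthly_cost(gb):.2f}/month")
```

At 100 GB the data charge is a small add-on, but at 10 TB it is more than ten times the fixed fee, which is why traffic volume matters more than the gateway itself.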
Install Infracost, add it to your CI pipeline, and see cost diffs on every pull request. Right-size instances, use gp3 over gp2, schedule non-production shutdowns, and watch out for NAT Gateway data transfer costs. The goal: no surprise AWS bills after terraform apply.