Terraform Data Sources Explained: Read External Information

Key Takeaway

Learn how to use Terraform data sources to read existing AWS resources, look up AMIs, query remote state, and reference external information in your configurations.

Data sources let Terraform read information from your cloud provider or external systems without creating or managing resources. Use them to look up AMIs, reference existing VPCs, read secrets, and query anything you didn’t create with Terraform.

Basic Syntax

# Data source: reads, never creates
data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

# Use the result
resource "aws_instance" "web" {
  ami           = data.aws_ami.al2023.id    # ← Read from data source
  instance_type = "t3.micro"
}

Data Source vs Resource

Feature   | resource                 | data
Action    | Create, update, destroy  | Read only
State     | Tracked in state         | Stored in state, re-read on every plan
Lifecycle | Full (create → destroy)  | None (read each time)
Purpose   | Manage infrastructure    | Reference existing infrastructure
Prefix    | resource "aws_vpc"       | data "aws_vpc"
Reference | aws_vpc.main.id          | data.aws_vpc.main.id
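
To make the comparison concrete, here is a minimal sketch that declares the same kind of object both ways. The block names managed and shared and the tag value shared-vpc are hypothetical and not part of this article's other examples.

# Managed: Terraform creates this VPC and removes it on terraform destroy
resource "aws_vpc" "managed" {
  cidr_block = "10.0.0.0/16"
}

# Read-only: Terraform only looks this VPC up; it must already exist
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc" # hypothetical tag value
  }
}

output "vpc_ids" {
  value = {
    managed = aws_vpc.managed.id
    shared  = data.aws_vpc.shared.id
  }
}

Note the only difference at the point of use: the data. prefix on the reference.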

Common Data Sources

Look Up the Latest AMI

# Amazon Linux 2023
data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Ubuntu 24.04
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*"]
  }
}
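
One consequence of most_recent = true is that the resolved AMI ID changes whenever a newer image is published, and a changed ami normally plans a replacement of any instance that uses it. A possible way to keep existing instances in place, sketched here on the assumption that new AMIs should apply only to newly created instances, is to ignore later changes to ami. The resource name pinned is hypothetical.

resource "aws_instance" "pinned" {
  ami           = data.aws_ami.al2023.id
  instance_type = "t3.micro"

  lifecycle {
    # Keep the AMI chosen at creation time; a newer match will not force replacement
    ignore_changes = [ami]
  }
}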

Reference an Existing VPC

# Look up VPC by tag
data "aws_vpc" "existing" {
  filter {
    name   = "tag:Name"
    values = ["production-vpc"]
  }
}

# Look up subnets in that VPC
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.existing.id]
  }

  filter {
    name   = "tag:Tier"
    values = ["private"]
  }
}

# Use them
resource "aws_ecs_service" "app" {
  # ...
  network_configuration {
    subnets = data.aws_subnets.private.ids
  }
}
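
The aws_subnets data source returns only IDs. When attributes such as CIDR blocks are needed, one option is to feed those IDs into the singular aws_subnet data source with for_each. This is a sketch; the output name is an assumption.

# One lookup per subnet ID found above
data "aws_subnet" "private" {
  for_each = toset(data.aws_subnets.private.ids)
  id       = each.value
}

# Map of subnet ID => CIDR block
output "private_subnet_cidrs" {
  value = { for id, s in data.aws_subnet.private : id => s.cidr_block }
}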

Get Current AWS Account Info

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

locals {
  account_id = data.aws_caller_identity.current.account_id  # "123456789012"
  # Note: newer AWS provider releases (v6+) deprecate .name in favor of .region
  region     = data.aws_region.current.name                  # "us-east-1"
}

# Use in ARN construction
resource "aws_iam_policy" "app" {
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "s3:GetObject"
      Resource = "arn:aws:s3:::${local.account_id}-data/*"
    }]
  })
}
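
Not every data source calls a remote API. As an alternative to jsonencode, the AWS provider's aws_iam_policy_document data source builds the same kind of policy JSON locally. The sketch below mirrors the policy above and reuses local.account_id; the names app_read and app_from_document are hypothetical.

data "aws_iam_policy_document" "app_read" {
  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::${local.account_id}-data/*"]
  }
}

resource "aws_iam_policy" "app_from_document" {
  # .json holds the rendered policy document
  policy = data.aws_iam_policy_document.app_read.json
}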

Get Available AZs

data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet("10.0.0.0/16", 8, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]
}
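
The example above assumes the region returns at least three zones; otherwise names[count.index] fails with an invalid index. A hedged variant caps the count at the number of zones actually returned. The names az_count and private_capped are hypothetical, and aws_vpc.main is assumed to be defined elsewhere, as in the example above.

locals {
  # Never ask for more subnets than there are available zones
  az_count = min(3, length(data.aws_availability_zones.available.names))
}

resource "aws_subnet" "private_capped" {
  count             = local.az_count
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet("10.0.0.0/16", 8, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]
}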

Read from Secrets Manager

data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/database/password"
}

resource "aws_db_instance" "main" {
  # ...
  password = data.aws_secretsmanager_secret_version.db.secret_string
  # ⚠️ Note: this stores the secret in Terraform state!
  # Consider ephemeral resources for secrets (Terraform 1.10+)
}
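
The ephemeral alternative mentioned in the comment could look roughly like this. It is a sketch under assumptions: a Terraform release that supports write-only arguments (1.11 or later), and an AWS provider release that offers both the ephemeral aws_secretsmanager_secret_version resource and the password_wo argument on aws_db_instance. Check the documentation for your provider version before relying on it. The resource name main_ephemeral is hypothetical.

# Read at plan/apply time only; the value is not written to state
ephemeral "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/database/password"
}

resource "aws_db_instance" "main_ephemeral" {
  # ...
  # Write-only argument: sent to the provider but not persisted in plan or state
  password_wo = ephemeral.aws_secretsmanager_secret_version.db.secret_string
  # Increment this number when the password should be sent again
  password_wo_version = 1
}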

Look Up an IAM Role

data "aws_iam_role" "existing" {
  name = "AWSServiceRoleForECS"
}

resource "aws_ecs_service" "app" {
  # ...
  iam_role = data.aws_iam_role.existing.arn
}

Read SSM Parameters

data "aws_ssm_parameter" "db_endpoint" {
  name = "/prod/database/endpoint"
}

resource "aws_ecs_task_definition" "app" {
  container_definitions = jsonencode([{
    name = "app"
    environment = [{
      name  = "DB_HOST"
      value = data.aws_ssm_parameter.db_endpoint.value
    }]
  }])
}

Get Route53 Zone

data "aws_route53_zone" "main" {
  name = "example.com"
}

resource "aws_route53_record" "www" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "www.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.main.dns_name
    zone_id                = aws_lb.main.zone_id
    evaluate_target_health = true
  }
}
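
When a public and a private hosted zone share the same name, the lookup needs to say which one is meant. As far as the provider documentation goes, private_zone defaults to false, so a private zone has to be requested explicitly. A minimal sketch with a hypothetical block name:

data "aws_route53_zone" "internal" {
  name         = "example.com"
  private_zone = true # select the private hosted zone instead of the public one
}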

Data Sources with for_each

variable "secret_names" {
  default = ["db-password", "api-key", "jwt-secret"]
}

data "aws_secretsmanager_secret_version" "secrets" {
  for_each  = toset(var.secret_names)
  secret_id = each.key
}

# Access: data.aws_secretsmanager_secret_version.secrets["db-password"].secret_string
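
Because for_each turns the data source into a map keyed by secret name, a for expression can reshape it into a plain name-to-value map. A small sketch; the local name secret_values is an assumption.

locals {
  # Map of secret name => secret string (still marked sensitive by Terraform)
  secret_values = {
    for name, secret in data.aws_secretsmanager_secret_version.secrets :
    name => secret.secret_string
  }
}

# Access: local.secret_values["api-key"]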

Remote State as Data Source

Read outputs from another Terraform project:

data "terraform_remote_state" "networking" {
  backend = "s3"

  config = {
    bucket = "tf-state-bucket"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "web" {
  subnet_id = data.terraform_remote_state.networking.outputs.public_subnet_ids[0]
  # ...
}
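
terraform_remote_state exposes only the root-module outputs of the other configuration. For the lookup above to work, the networking project has to declare a matching output. A sketch of that side, assuming its public subnets are a counted resource named aws_subnet.public (hypothetical):

# In the networking project
output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}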

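Note: terraform_remote_state needs read access to the other project's entire state, which can include sensitive values. Prefer Terraform Stacks or module outputs over remote state when possible.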

External Data Source

Run a script and use its output:

data "external" "git_hash" {
  program = ["bash", "-c", "echo '{\"hash\": \"'$(git rev-parse --short HEAD)'\"}'"]
}

resource "aws_instance" "web" {
  # ...
  tags = {
    GitCommit = data.external.git_hash.result["hash"]
  }
}
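
The external data source follows a small protocol: the optional query map is passed to the program as a JSON object on stdin, and the program must print a JSON object whose values are all strings on stdout. A sketch of that round trip, assuming jq is installed on the machine running Terraform; the names upper and upper_name are hypothetical.

data "external" "upper" {
  # jq reads the query JSON from stdin and prints a new JSON object
  program = ["jq", "{upper: (.name | ascii_upcase)}"]

  query = {
    name = "web"
  }
}

output "upper_name" {
  # Upper-cased copy of query.name, as returned by the program
  value = data.external.upper.result["upper"]
}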

Tips

  1. Data sources are re-read on every plan: results reflect the current state of the remote object
  2. Use filters, not IDs: filter blocks are more maintainable than hardcoded IDs
  3. Data sources can fail: if no matching resource exists, terraform plan errors
  4. Secrets end up in state: data sources store their results in state, so use ephemeral resources for secrets
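
A data source can also succeed but return something you did not intend. Terraform's custom conditions (available since Terraform 1.2) can be attached to a data block so that the plan fails with a clear message instead. A sketch with a hypothetical block name and an illustrative check:

data "aws_ami" "al2023_checked" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }

  lifecycle {
    # Evaluated after the read; self refers to this data source's result
    postcondition {
      condition     = self.root_device_type == "ebs"
      error_message = "The selected AMI must use an EBS root device."
    }
  }
}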

Conclusion

Data sources bridge Terraform-managed and non-Terraform infrastructure. Use them to look up AMIs, reference existing VPCs, read secrets, and query account metadata. The most common pattern is data "aws_ami" for dynamic AMI lookup and data "aws_vpc" / data "aws_subnets" for referencing networking created by another team. Remember: data sources read, never create.

Written by Luca Berton

DevOps Engineer, AWS Partner, Terraform expert, and author. Creator of Ansible Pilot, Terraform Pilot, and CopyPasteLearn.