Fix Terraform MSK Cluster - TooManyRequestsException

Fix AWS MSK cluster throttling errors in Terraform: handle API rate limits, configure retries, reduce parallelism, and allow for long cluster creation times.

Luca Berton · 1 min read

Quick Answer

AWS is throttling your MSK API requests. Reduce Terraform's parallelism, add retry logic, space out operations, and check whether you've reached the MSK cluster quota for the region.

The Error

Error: creating MSK Cluster (kafka-prod):
  TooManyRequestsException: Rate exceeded
Error: waiting for MSK Cluster (kafka-prod) to create:
  TooManyRequestsException: Too many requests

What Causes This Error

  1. API rate limiting: too many MSK API calls in a short period
  2. Cluster limit reached: the default limit is 3-5 MSK clusters per region (confirm the current value in Service Quotas)
  3. Concurrent operations: Terraform running multiple MSK operations in parallel
  4. Concurrent runs: multiple terraform apply runs executing at the same time against the same account

How to Fix It

Solution 1: Reduce Parallelism

# Reduce concurrent operations
terraform apply -parallelism=2
 
# For MSK-heavy configs, go even lower
terraform apply -parallelism=1
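
When the apply runs from CI or a wrapper script, the flag is easy to forget. A minimal sketch, assuming a POSIX-style shell: Terraform reads extra arguments for a subcommand from the `TF_CLI_ARGS_<name>` environment variables, so the limit can be set once per session. The value 2 is simply the one used above, not a recommendation for every configuration.

```shell
# Apply the parallelism limit to every "terraform apply" in this shell session.
# Terraform adds the contents of TF_CLI_ARGS_apply to the apply subcommand.
export TF_CLI_ARGS_apply="-parallelism=2"

# plan accepts the same flag; its refresh step also issues API calls.
export TF_CLI_ARGS_plan="-parallelism=2"

# From here on, a plain "terraform apply" runs with -parallelism=2.
```

Flags passed explicitly on the command line still take precedence, so a one-off `terraform apply -parallelism=1` continues to work.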

Solution 2: Add Timeouts

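MSK clusters take 20-45 minutes to create, so the resource's timeouts must comfortably exceed that. Timeouts do not prevent throttling; they keep Terraform from giving up while the cluster is still provisioning. Check the AWS provider's default timeouts for your version before overriding them, since a value below the default shortens the wait: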

resource "aws_msk_cluster" "main" {
  cluster_name           = "${var.project}-${var.environment}"
  kafka_version          = "3.5.1"
  number_of_broker_nodes = 3
 
  broker_node_group_info {
    instance_type  = "kafka.m5.large"
    client_subnets = var.private_subnet_ids
    storage_info {
      ebs_storage_info {
        volume_size = 100
      }
    }
    security_groups = [aws_security_group.msk.id]
  }
 
  timeouts {
    create = "60m"
    update = "60m"
    delete = "60m"
  }
}

Solution 3: Request Quota Increase

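# List the MSK quotas, their current values, and their quota codes
aws service-quotas list-service-quotas --service-code kafka

# Look up a single quota; set QUOTA_CODE to a code taken from the list above
aws service-quotas get-service-quota \
  --service-code kafka \
  --quota-code "$QUOTA_CODE"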
 
# Request increase via Console:
# Service Quotas → Amazon MSK → Number of clusters
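
To compare the quota with what already exists, it helps to count the clusters in the region. A rough sketch, assuming the AWS CLI is installed and the credentials may list MSK clusters; `count_msk_clusters` is a hypothetical helper name, and `list-clusters-v2` is used here because it returns both provisioned and serverless clusters, which may not be what a given quota counts.

```shell
# count_msk_clusters REGION
# Prints the number of MSK clusters the caller can see in REGION.
count_msk_clusters() {
  # JSON output lets the CLI merge paginated results before --query runs.
  aws kafka list-clusters-v2 \
    --region "$1" \
    --query 'length(ClusterInfoList)' \
    --output json
}
```

For example, `count_msk_clusters eu-west-1` prints a single number that can be checked against the value reported by Service Quotas.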

Solution 4: Use -target for Sequential Creation

# Create MSK cluster first
terraform apply -target=aws_msk_cluster.main
 
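# Then create the dependent resources. Terraform treats -target as an
# exceptional-case option, so avoid making it part of routine runs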
terraform apply
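
The quick answer also mentions retry logic and waiting between operations. The AWS provider already retries throttled calls internally (its `max_retries` argument controls how often), so an outer retry is only a fallback. The following is a sketch of such a fallback, not a recommended default: `retry_with_backoff` is a hypothetical helper, and it retries on any failure, not only on TooManyRequestsException.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# wait after each failure. Returns the exit status of the last attempt.
retry_with_backoff() {
  rwb_max="$1"
  rwb_delay="$2"
  shift 2
  rwb_attempt=1
  while true; do
    rwb_status=0
    "$@" || rwb_status=$?
    if [ "$rwb_status" -eq 0 ]; then
      return 0
    fi
    if [ "$rwb_attempt" -ge "$rwb_max" ]; then
      echo "giving up after $rwb_attempt attempts" >&2
      return "$rwb_status"
    fi
    echo "attempt $rwb_attempt failed (exit $rwb_status); retrying in ${rwb_delay}s" >&2
    sleep "$rwb_delay"
    rwb_attempt=$((rwb_attempt + 1))
    rwb_delay=$((rwb_delay * 2))
  done
}
```

A possible invocation is `retry_with_backoff 3 60 terraform apply -auto-approve -parallelism=2`, which waits 60 and then 120 seconds between attempts. Because Terraform records partial progress in state, a rerun continues from what was already created, but review the error before retrying blindly in production.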

Troubleshooting Checklist

  1. ✅ Are you hitting API rate limits? (Reduce parallelism)
  2. ✅ How many MSK clusters exist in the region? (Check quota)
  3. ✅ Are multiple terraform applies running concurrently?
  4. ✅ Have you set sufficient timeouts for MSK cluster creation?

Prevention Tips

  • Use -parallelism=2 for configs with MSK resources
  • Keep create, update, and delete timeouts at 60 minutes or more, since MSK clusters are slow to provision
  • Request quota increases proactively before large deployments
  • Use -target to create MSK clusters separately from other resources

Conclusion

MSK TooManyRequestsException is an API throttling error. Reduce parallelism, add generous timeouts, and check your cluster quota. MSK clusters are among the slowest AWS resources to provision — plan accordingly.
