Infrastructure as Code: Automate and Manage Your Infrastructure Efficiently
Infrastructure as Code (IaC) has changed how modern organizations manage and deploy their infrastructure. It replaces error-prone manual administration with an automated, versioned, and repeatable approach that applies software development principles to infrastructure.
This guide covers how to implement IaC effectively, from fundamental concepts to advanced enterprise patterns, using widely adopted tools and practices from the ecosystem.
Infrastructure as Code Fundamentals
Definition and Core Principles
Infrastructure as Code is a methodology that manages and provisions infrastructure through machine-readable definition files, rather than through physical hardware configuration or interactive configuration tools.
Core principles:
Declarative vs. imperative: Define the desired state, not the steps to reach it. The tool determines the actions required.
Immutability: Resources are not modified directly in production. Changes are made by releasing a new version of the definition.
Idempotence: Applying the same configuration multiple times produces the same result, with no additional side effects.
Version control: Every change is recorded, which provides full traceability and makes rollbacks safer.
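The declarative and idempotent principles can be illustrated with a small reconciliation loop. The following is a minimal sketch, not how any particular tool is implemented: the `plan` and `apply` functions and the resource dictionaries are hypothetical names chosen for illustration, and resources are modeled as plain dictionaries.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, dict]  # resource name -> attributes (illustrative model)


def plan(desired: Resources, actual: Resources) -> List[Tuple[str, str]]:
    """Compute the actions needed to move `actual` to `desired`."""
    actions = []
    for name, attrs in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != attrs:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired: Resources, actual: Resources) -> Resources:
    """Return the new state after executing the plan (inputs are not mutated)."""
    new_state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del new_state[name]
        else:  # create or update
            new_state[name] = dict(desired[name])
    return new_state


desired = {"vpc": {"cidr": "10.0.0.0/16"}, "alb": {"internal": False}}
actual = {"vpc": {"cidr": "10.1.0.0/16"}, "old_vm": {"size": "small"}}

first = apply(desired, actual)
second = apply(desired, first)  # applying again changes nothing

print(plan(desired, actual))  # [('update', 'vpc'), ('create', 'alb'), ('delete', 'old_vm')]
print(plan(desired, first))   # []
```

The caller only states the desired end state; the second plan is empty because the state already matches, which is what idempotence means in practice.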
Business Benefits
Lower operational costs: Automation removes repetitive manual tasks and frees time for higher-value work.
Consistency across environments: Development, staging, and production use the same definitions, which eliminates “works on my machine” problems.
Deployment speed: Complete environments are provisioned in minutes instead of days or weeks.
Compliance and auditing: Every change is documented and can be audited automatically.
Disaster recovery: The entire infrastructure can be recreated from source code.
IaC Ecosystem Tools
Terraform: The De Facto Standard
Terraform, from HashiCorp, has become the leading tool for multi-cloud IaC. Its declarative syntax, HCL (HashiCorp Configuration Language), lets you define resources in a readable way.
# Example: basic web infrastructure on AWS
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Project     = var.project_name
      ManagedBy   = "terraform"
    }
  }
}

# VPC and subnets
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "${var.project_name}-vpc-${var.environment}"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}-igw-${var.environment}"
  }
}

resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project_name}-public-${var.availability_zones[count.index]}"
    Type = "public"
  }
}

# Private subnets are offset by the number of AZs so they do not overlap the public ranges
resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + length(var.availability_zones))
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${var.project_name}-private-${var.availability_zones[count.index]}"
    Type = "private"
  }
}

# Route table
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${var.project_name}-public-rt"
  }
}

resource "aws_route_table_association" "public" {
  count          = length(aws_subnet.public)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Application Load Balancer
resource "aws_lb" "main" {
  name               = "${var.project_name}-alb-${var.environment}"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id

  # The comparison already yields a bool, so no conditional expression is needed
  enable_deletion_protection = var.environment == "production"

  tags = {
    Name = "${var.project_name}-alb-${var.environment}"
  }
}

# Security Groups
resource "aws_security_group" "alb" {
  name_prefix = "${var.project_name}-alb-"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-alb-sg"
  }
}

resource "aws_security_group" "app" {
  name_prefix = "${var.project_name}-app-"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "HTTP from ALB"
    from_port       = var.app_port
    to_port         = var.app_port
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-app-sg"
  }
}
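The subnet definitions above rely on `cidrsubnet(var.vpc_cidr, 8, count.index)` to carve address ranges out of the VPC block. To make that arithmetic concrete, here is a sketch that reproduces the calculation with Python's standard `ipaddress` module. The Python function is an illustrative re-implementation written for this guide, not part of Terraform, and it assumes the default `10.0.0.0/16` block and three availability zones from the variables file.

```python
import ipaddress


def cidrsubnet(prefix: str, newbits: int, netnum: int) -> str:
    """Illustrative equivalent of Terraform's cidrsubnet(prefix, newbits, netnum)."""
    network = ipaddress.ip_network(prefix)
    new_prefixlen = network.prefixlen + newbits
    if new_prefixlen > network.max_prefixlen:
        raise ValueError("not enough address bits left for newbits")
    if not 0 <= netnum < 2 ** newbits:
        raise ValueError("netnum does not fit in newbits")
    # Shift the subnet number past the host bits and add it to the base address
    host_bits = network.max_prefixlen - new_prefixlen
    subnet_base = network.network_address + (netnum << host_bits)
    return str(ipaddress.ip_network(f"{subnet_base}/{new_prefixlen}"))


az_count = 3  # assumed: the three default availability zones
public = [cidrsubnet("10.0.0.0/16", 8, i) for i in range(az_count)]
private = [cidrsubnet("10.0.0.0/16", 8, i + az_count) for i in range(az_count)]

print(public)   # ['10.0.0.0/24', '10.0.1.0/24', '10.0.2.0/24']
print(private)  # ['10.0.3.0/24', '10.0.4.0/24', '10.0.5.0/24']
```

Offsetting the private subnets by the number of availability zones is what keeps the two sets of ranges from overlapping.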
Variables and Outputs Configuration
# variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-west-2"
}

variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    condition     = contains(["development", "staging", "production"], var.environment)
    error_message = "Environment must be development, staging, or production."
  }
}

variable "project_name" {
  description = "Project name for resource naming"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "Availability zones"
  type        = list(string)
  default     = ["us-west-2a", "us-west-2b", "us-west-2c"]
}

variable "app_port" {
  description = "Application port"
  type        = number
  default     = 3000
}

# outputs.tf
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "IDs of the public subnets"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets"
  value       = aws_subnet.private[*].id
}

output "load_balancer_dns" {
  description = "DNS name of the load balancer"
  value       = aws_lb.main.dns_name
}

output "security_group_app_id" {
  description = "ID of the application security group"
  value       = aws_security_group.app.id
}
Remote State Management
# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "environments/production/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
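The `dynamodb_table` setting above points the backend at a table used for locking, so that two runs cannot write the same state file at once. The idea behind such a lock is a conditional write that succeeds only when no lock entry exists yet. The sketch below models that idea with an in-memory dictionary; the `LockTable` class and its methods are hypothetical and do not reproduce the backend's actual item layout or API calls.

```python
from typing import Dict, Optional


class LockTable:
    """In-memory stand-in for a lock table keyed by state path (illustrative only)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def acquire(self, lock_id: str, owner: str) -> bool:
        """Conditional write: succeeds only if no entry exists for lock_id."""
        if lock_id in self._items:
            return False
        self._items[lock_id] = owner
        return True

    def release(self, lock_id: str, owner: str) -> bool:
        """Only the current holder may release the lock."""
        if self._items.get(lock_id) != owner:
            return False
        del self._items[lock_id]
        return True

    def holder(self, lock_id: str) -> Optional[str]:
        return self._items.get(lock_id)


locks = LockTable()
state_key = "environments/production/terraform.tfstate"

print(locks.acquire(state_key, "ci-runner"))  # True: first run takes the lock
print(locks.acquire(state_key, "laptop"))     # False: second run must wait
print(locks.release(state_key, "ci-runner"))  # True: lock is free again
```

A run that fails to acquire the lock should stop rather than proceed, since concurrent writes are what corrupt shared state.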
Advanced IaC Patterns
Modularization and Reuse
# modules/web-app/main.tf
resource "aws_launch_template" "app" {
  name_prefix            = "${var.name}-"
  image_id               = var.ami_id
  instance_type          = var.instance_type
  key_name               = var.key_name
  vpc_security_group_ids = var.security_group_ids

  user_data = base64encode(templatefile("${path.module}/user-data.sh", {
    app_port = var.app_port
    app_name = var.name
  }))

  tag_specifications {
    resource_type = "instance"
    tags = merge(var.common_tags, {
      Name = "${var.name}-instance"
    })
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "app" {
  name                = "${var.name}-asg"
  vpc_zone_identifier = var.subnet_ids
  target_group_arns   = [aws_lb_target_group.app.arn]
  health_check_type   = "ELB"
  min_size            = var.min_size
  max_size            = var.max_size
  desired_capacity    = var.desired_capacity

  launch_template {
    id = aws_launch_template.app.id
    # Reference the concrete version: with "$Latest" the group's configuration
    # never changes, so a new template version would not start an instance refresh
    version = aws_launch_template.app.latest_version
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }

  tag {
    key                 = "Name"
    value               = "${var.name}-asg"
    propagate_at_launch = false
  }

  dynamic "tag" {
    for_each = var.common_tags
    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }
}

resource "aws_lb_target_group" "app" {
  name     = "${var.name}-tg"
  port     = var.app_port
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled             = true
    healthy_threshold   = 2
    interval            = 30
    matcher             = "200"
    path                = var.health_check_path
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 2
  }

  tags = var.common_tags
}

# Auto Scaling Policies
resource "aws_autoscaling_policy" "scale_up" {
  name                   = "${var.name}-scale-up"
  scaling_adjustment     = 1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.app.name
}

resource "aws_autoscaling_policy" "scale_down" {
  name                   = "${var.name}-scale-down"
  scaling_adjustment     = -1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.app.name
}

# CloudWatch Alarms
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "${var.name}-cpu-high"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "Scale up when average EC2 CPU utilization is high"
  alarm_actions       = [aws_autoscaling_policy.scale_up.arn]

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.app.name
  }
}

resource "aws_cloudwatch_metric_alarm" "cpu_low" {
  alarm_name          = "${var.name}-cpu-low"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 10
  alarm_description   = "Scale down when average EC2 CPU utilization is low"
  alarm_actions       = [aws_autoscaling_policy.scale_down.arn]

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.app.name
  }
}
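The two alarms and two scaling policies above form a simple feedback loop: two consecutive periods above 80% CPU add one instance, and two consecutive periods below 10% remove one, within the group's size limits. The sketch below simulates that decision in simplified form. The function names are hypothetical, and the model ignores the cooldown and CloudWatch details such as missing-data handling.

```python
from typing import List


def alarm_fires(datapoints: List[float], threshold: float, periods: int,
                greater: bool = True) -> bool:
    """True if the most recent `periods` datapoints all breach the threshold."""
    if len(datapoints) < periods:
        return False
    recent = datapoints[-periods:]
    if greater:
        return all(value > threshold for value in recent)
    return all(value < threshold for value in recent)


def next_capacity(current: int, cpu_history: List[float],
                  min_size: int, max_size: int) -> int:
    """Apply the scale-up or scale-down adjustment, clamped to the group limits."""
    if alarm_fires(cpu_history, threshold=80, periods=2, greater=True):
        return min(current + 1, max_size)
    if alarm_fires(cpu_history, threshold=10, periods=2, greater=False):
        return max(current - 1, min_size)
    return current


# Production limits from the guide: min 3, max 10, desired 5
print(next_capacity(5, [50.0, 85.0, 90.0], 3, 10))  # 6: two periods above 80%
print(next_capacity(5, [85.0, 60.0], 3, 10))        # 5: only one breach, no change
print(next_capacity(3, [5.0, 8.0], 3, 10))          # 3: already at min_size
```

Requiring two consecutive breaches is what keeps a single CPU spike from triggering a scaling action.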
Multi-Environment Configuration
# environments/production/main.tf
module "vpc" {
  source = "../../modules/vpc"

  project_name       = "myapp"
  environment        = "production"
  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-west-2a", "us-west-2b", "us-west-2c"]
  common_tags        = local.common_tags
}

module "web_app" {
  source = "../../modules/web-app"

  name               = "myapp-web"
  vpc_id             = module.vpc.vpc_id
  subnet_ids         = module.vpc.private_subnet_ids
  security_group_ids = [module.vpc.app_security_group_id]
  instance_type      = "c5.large"
  min_size           = 3
  max_size           = 10
  desired_capacity   = 5
  app_port           = 3000
  health_check_path  = "/health"
  common_tags        = local.common_tags
}

locals {
  common_tags = {
    Environment = "production"
    Project     = "myapp"
    Owner       = "platform-team"
    CostCenter  = "engineering"
    ManagedBy   = "terraform"
  }
}
Complementary Tools
AWS CDK: Infrastructure with Programming Languages
// AWS CDK with TypeScript
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as logs from 'aws-cdk-lib/aws-logs';
import { Construct } from 'constructs';

export class WebApplicationStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // VPC
    const vpc = new ec2.Vpc(this, 'VPC', {
      maxAzs: 3,
      natGateways: 3,
      // `ipAddresses` replaces the deprecated `cidr` property in CDK v2
      ipAddresses: ec2.IpAddresses.cidr('10.0.0.0/16'),
      subnetConfiguration: [
        {
          cidrMask: 24,
          name: 'public',
          subnetType: ec2.SubnetType.PUBLIC,
        },
        {
          cidrMask: 24,
          name: 'private',
          subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
        },
        {
          cidrMask: 28,
          name: 'database',
          subnetType: ec2.SubnetType.PRIVATE_ISOLATED,
        },
      ],
    });

    // ECS Cluster
    const cluster = new ecs.Cluster(this, 'Cluster', {
      vpc,
      containerInsights: true,
      clusterName: 'web-application-cluster',
    });

    // Application Load Balancer
    const alb = new elbv2.ApplicationLoadBalancer(this, 'ALB', {
      vpc,
      internetFacing: true,
    });

    // Task Definition
    const taskDefinition = new ecs.FargateTaskDefinition(this, 'TaskDef', {
      memoryLimitMiB: 2048,
      cpu: 1024,
    });

    const logGroup = new logs.LogGroup(this, 'LogGroup', {
      retention: logs.RetentionDays.ONE_WEEK,
    });

    taskDefinition.addContainer('WebContainer', {
      image: ecs.ContainerImage.fromRegistry('nginx:alpine'),
      portMappings: [{ containerPort: 80 }],
      logging: ecs.LogDrivers.awsLogs({
        streamPrefix: 'web-app',
        logGroup,
      }),
    });

    // ECS Service
    const service = new ecs.FargateService(this, 'Service', {
      cluster,
      taskDefinition,
      desiredCount: 3,
      assignPublicIp: false,
    });

    // Target Group
    const targetGroup = new elbv2.ApplicationTargetGroup(this, 'TargetGroup', {
      port: 80,
      vpc,
      protocol: elbv2.ApplicationProtocol.HTTP,
      targetType: elbv2.TargetType.IP,
      healthCheck: {
        path: '/',
        healthyHttpCodes: '200',
      },
    });

    service.attachToApplicationTargetGroup(targetGroup);

    // Listener
    alb.addListener('Listener', {
      port: 80,
      defaultTargetGroups: [targetGroup],
    });

    // Auto Scaling
    const scalableTarget = service.autoScaleTaskCount({
      minCapacity: 2,
      maxCapacity: 10,
    });

    scalableTarget.scaleOnCpuUtilization('CpuScaling', {
      targetUtilizationPercent: 70,
    });

    // Outputs
    new cdk.CfnOutput(this, 'LoadBalancerDNS', {
      value: alb.loadBalancerDnsName,
    });
  }
}
Pulumi: IaC with Object-Oriented Programming
# Pulumi with Python
import pulumi
import pulumi_aws as aws
# Load balancer resources live in the `lb` module of current pulumi_aws releases
from pulumi_aws import ec2, lb

# Configuration
config = pulumi.Config()
project_name = pulumi.get_project()
stack_name = pulumi.get_stack()

# VPC
vpc = ec2.Vpc(
    "vpc",
    cidr_block="10.0.0.0/16",
    enable_dns_hostnames=True,
    enable_dns_support=True,
    tags={
        "Name": f"{project_name}-vpc-{stack_name}",
        "Project": project_name,
        "Stack": stack_name,
    },
)

# Internet Gateway
igw = ec2.InternetGateway(
    "igw",
    vpc_id=vpc.id,
    tags={"Name": f"{project_name}-igw"},
)

# Subnets
availability_zones = aws.get_availability_zones().names
public_subnets = []
private_subnets = []

for i, az in enumerate(availability_zones[:3]):
    # Public subnet
    public_subnet = ec2.Subnet(
        f"public-subnet-{i}",
        vpc_id=vpc.id,
        cidr_block=f"10.0.{i}.0/24",
        availability_zone=az,
        map_public_ip_on_launch=True,
        tags={
            "Name": f"{project_name}-public-{az}",
            "Type": "public",
        },
    )
    public_subnets.append(public_subnet)

    # Private subnet
    private_subnet = ec2.Subnet(
        f"private-subnet-{i}",
        vpc_id=vpc.id,
        cidr_block=f"10.0.{i + 10}.0/24",
        availability_zone=az,
        tags={
            "Name": f"{project_name}-private-{az}",
            "Type": "private",
        },
    )
    private_subnets.append(private_subnet)

# Route Tables
public_rt = ec2.RouteTable(
    "public-rt",
    vpc_id=vpc.id,
    routes=[{
        "cidr_block": "0.0.0.0/0",
        "gateway_id": igw.id,
    }],
    tags={"Name": f"{project_name}-public-rt"},
)

for i, subnet in enumerate(public_subnets):
    ec2.RouteTableAssociation(
        f"public-rt-assoc-{i}",
        subnet_id=subnet.id,
        route_table_id=public_rt.id,
    )


# Security Groups
class SecurityGroupBuilder:
    @staticmethod
    def create_alb_sg(vpc_id: pulumi.Input[str]) -> ec2.SecurityGroup:
        return ec2.SecurityGroup(
            "alb-sg",
            vpc_id=vpc_id,
            description="Security group for Application Load Balancer",
            ingress=[
                {"protocol": "tcp", "from_port": 80, "to_port": 80, "cidr_blocks": ["0.0.0.0/0"]},
                {"protocol": "tcp", "from_port": 443, "to_port": 443, "cidr_blocks": ["0.0.0.0/0"]},
            ],
            egress=[{"protocol": "-1", "from_port": 0, "to_port": 0, "cidr_blocks": ["0.0.0.0/0"]}],
            tags={"Name": f"{project_name}-alb-sg"},
        )

    @staticmethod
    def create_app_sg(vpc_id: pulumi.Input[str], alb_sg_id: pulumi.Input[str]) -> ec2.SecurityGroup:
        return ec2.SecurityGroup(
            "app-sg",
            vpc_id=vpc_id,
            description="Security group for application",
            ingress=[{
                "protocol": "tcp",
                "from_port": 3000,
                "to_port": 3000,
                "security_groups": [alb_sg_id],
            }],
            egress=[{"protocol": "-1", "from_port": 0, "to_port": 0, "cidr_blocks": ["0.0.0.0/0"]}],
            tags={"Name": f"{project_name}-app-sg"},
        )


alb_sg = SecurityGroupBuilder.create_alb_sg(vpc.id)
app_sg = SecurityGroupBuilder.create_app_sg(vpc.id, alb_sg.id)

# Application Load Balancer
alb = lb.LoadBalancer(
    "alb",
    load_balancer_type="application",
    security_groups=[alb_sg.id],
    subnets=[subnet.id for subnet in public_subnets],
    tags={"Name": f"{project_name}-alb"},
)

# Exports
pulumi.export("vpc_id", vpc.id)
pulumi.export("public_subnet_ids", [subnet.id for subnet in public_subnets])
pulumi.export("private_subnet_ids", [subnet.id for subnet in private_subnets])
pulumi.export("alb_dns_name", alb.dns_name)
GitOps and CI/CD Integration
Terraform Pipeline with GitHub Actions
# .github/workflows/terraform.yml
name: Terraform CI/CD

on:
  push:
    branches: [main, develop]
    paths: ['terraform/**']
  pull_request:
    branches: [main]
    paths: ['terraform/**']

env:
  TF_VERSION: '1.6.0'
  AWS_REGION: 'us-west-2'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Terraform Format Check
        run: terraform fmt -check -recursive terraform/

      - name: Terraform Init
        run: |
          cd terraform/environments/staging
          terraform init -backend=false

      - name: Terraform Validate
        run: |
          cd terraform/environments/staging
          terraform validate

      - name: Run TFSec Security Scan
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          working_directory: terraform/

      - name: Run Checkov Security Scan
        uses: bridgecrewio/checkov-action@master
        with:
          directory: terraform/
          quiet: true
          framework: terraform

  plan:
    runs-on: ubuntu-latest
    needs: validate
    if: github.event_name == 'pull_request'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Terraform Init
        run: |
          cd terraform/environments/staging
          terraform init

      - name: Terraform Plan
        run: |
          cd terraform/environments/staging
          terraform plan -no-color -out=tfplan

      # upload-artifact v3 has been retired on GitHub-hosted runners; use v4
      - name: Save Plan
        uses: actions/upload-artifact@v4
        with:
          name: terraform-plan
          path: terraform/environments/staging/tfplan

      - name: Comment Plan on PR
        uses: actions/github-script@v7
        if: github.event_name == 'pull_request'
        with:
          script: |
            const { execSync } = require('child_process');
            try {
              const planOutput = execSync(
                'cd terraform/environments/staging && terraform show -no-color tfplan',
                { encoding: 'utf-8', maxBuffer: 1024 * 1024 }
              );
              // Template lines stay at the block's base indentation so the
              // Markdown in the comment is not rendered as an indented code block
              const comment = `## Terraform Plan Results

            <details><summary>Show Plan</summary>

            \`\`\`hcl
            ${planOutput}
            \`\`\`

            </details>

            Plan generated for commit: ${context.sha}`;
              await github.rest.issues.createComment({
                issue_number: context.issue.number,
                owner: context.repo.owner,
                repo: context.repo.repo,
                body: comment
              });
            } catch (error) {
              console.error('Error posting plan comment:', error);
            }

  apply:
    runs-on: ubuntu-latest
    needs: validate
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}
          # Disable the wrapper so redirected JSON output stays clean
          terraform_wrapper: false

      - name: Terraform Init
        run: |
          cd terraform/environments/production
          terraform init

      - name: Terraform Apply
        run: |
          cd terraform/environments/production
          terraform apply -auto-approve

      - name: Update Infrastructure Documentation
        run: |
          cd terraform/environments/production
          terraform output -json > ../../../docs/infrastructure-outputs.json

      - name: Notify Slack
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: |
            Infrastructure deployment ${{ job.status }}!
            Environment: Production
            Commit: ${{ github.sha }}
            Actor: ${{ github.actor }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
        if: always()
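The pipeline's documentation step writes the result of `terraform output -json` to `docs/infrastructure-outputs.json`. In that format each output is an object with `sensitive`, `type`, and `value` fields. A downstream script could flatten it as sketched below; the `flatten_outputs` function and all sample values are hypothetical and only illustrate the shape of the data.

```python
import json


def flatten_outputs(raw: str) -> dict:
    """Map each output name to its value, leaving out sensitive outputs."""
    data = json.loads(raw)
    return {
        name: entry["value"]
        for name, entry in data.items()
        if not entry.get("sensitive", False)
    }


# Hypothetical sample in the shape produced by `terraform output -json`
sample = json.dumps({
    "vpc_id": {"sensitive": False, "type": "string", "value": "vpc-0example"},
    "public_subnet_ids": {
        "sensitive": False,
        "type": ["tuple", ["string", "string"]],
        "value": ["subnet-a", "subnet-b"],
    },
    "db_password": {"sensitive": True, "type": "string", "value": "not-for-docs"},
})

outputs = flatten_outputs(sample)
print(sorted(outputs))    # ['public_subnet_ids', 'vpc_id']
print(outputs["vpc_id"])  # vpc-0example
```

Filtering on the `sensitive` flag matters here because the JSON form of the outputs includes sensitive values in plain text.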
Best Practices and Enterprise Patterns
Infrastructure Testing
// Terratest with Go
package test

import (
	"strings"
	"testing"
	"time"

	"github.com/gruntwork-io/terratest/modules/aws"
	http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestTerraformInfrastructure(t *testing.T) {
	t.Parallel()

	// Terraform configuration
	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
		TerraformDir: "../terraform/environments/test",
		Vars: map[string]interface{}{
			"environment":   "test",
			"project_name":  "terratest",
			"instance_type": "t3.micro",
		},
	})

	// Clean up resources after the test
	defer terraform.Destroy(t, terraformOptions)

	// Apply the Terraform configuration
	terraform.InitAndApply(t, terraformOptions)

	// Read outputs
	vpcId := terraform.Output(t, terraformOptions, "vpc_id")
	albDns := terraform.Output(t, terraformOptions, "load_balancer_dns")

	// Assertions
	assert.NotEmpty(t, vpcId)
	assert.NotEmpty(t, albDns)

	// Verify that the VPC exists in AWS
	awsRegion := "us-west-2"
	aws.GetVpcById(t, vpcId, awsRegion)

	// Verify that the ALB responds. HttpGetWithRetry compares the whole body
	// with the expected string, so use a custom check for a substring instead.
	url := "http://" + albDns
	http_helper.HttpGetWithRetryWithCustomValidation(t, url, nil, 30, 5*time.Second,
		func(status int, body string) bool {
			return status == 200 && strings.Contains(body, "nginx")
		})
}

func TestSecurityGroups(t *testing.T) {
	t.Parallel()

	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
		TerraformDir: "../terraform/modules/security-groups",
		Vars: map[string]interface{}{
			"vpc_id": "vpc-12345678",
		},
	})

	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	// Check the security group rules
	sgId := terraform.Output(t, terraformOptions, "app_security_group_id")
	awsRegion := "us-west-2"
	sg := aws.GetSecurityGroupById(t, sgId, awsRegion)

	// Verify that it only allows traffic from the ALB
	assert.Len(t, sg.GroupRules, 2) // ingress + egress
}
Policy as Code with Open Policy Agent
# policies/terraform.rego
package terraform.security

import future.keywords.in
import future.keywords.if

# Deny large instances in non-production environments
deny_large_instances[msg] {
	input.planned_values.root_module.resources[i].type == "aws_instance"
	instance := input.planned_values.root_module.resources[i]
	large_instance_types := {
		"m5.large", "m5.xlarge", "m5.2xlarge",
		"c5.large", "c5.xlarge", "c5.2xlarge"
	}
	instance.values.instance_type in large_instance_types

	# Read the environment from the tags
	environment := instance.values.tags.Environment
	environment != "production"

	msg := sprintf("Large instance type '%s' not allowed in '%s' environment", [
		instance.values.instance_type,
		environment
	])
}

# Require encryption on EBS volumes
deny_unencrypted_ebs[msg] {
	input.planned_values.root_module.resources[i].type == "aws_ebs_volume"
	volume := input.planned_values.root_module.resources[i]
	not volume.values.encrypted

	msg := sprintf("EBS volume '%s' must be encrypted", [volume.address])
}

# Require mandatory tags
required_tags := {"Environment", "Project", "Owner", "CostCenter"}

deny_missing_tags[msg] {
	input.planned_values.root_module.resources[i].type in {
		"aws_instance", "aws_ebs_volume", "aws_s3_bucket"
	}
	resource := input.planned_values.root_module.resources[i]
	resource_tags := object.get(resource.values, "tags", {})
	missing_tag := required_tags[_]
	not missing_tag in object.keys(resource_tags)

	msg := sprintf("Resource '%s' missing required tag: '%s'", [
		resource.address,
		missing_tag
	])
}

# Check S3 bucket configuration
deny_public_s3_buckets[msg] {
	input.planned_values.root_module.resources[i].type == "aws_s3_bucket_acl"
	acl := input.planned_values.root_module.resources[i]
	acl.values.acl in {"public-read", "public-read-write"}

	msg := sprintf("S3 bucket ACL '%s' allows public access", [acl.address])
}
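For readers less familiar with Rego, the required-tags rule above can be read as a loop over the planned resources. The sketch below expresses the same check in Python against a plan document that uses the `planned_values.root_module.resources` structure referenced by the policies. The function name and the sample plan are hypothetical; this is an illustration of the logic, not a replacement for OPA.

```python
REQUIRED_TAGS = {"Environment", "Project", "Owner", "CostCenter"}
TAGGED_TYPES = {"aws_instance", "aws_ebs_volume", "aws_s3_bucket"}


def missing_tag_violations(plan: dict) -> list:
    """Return one message per required tag missing from a checked resource."""
    messages = []
    resources = (
        plan.get("planned_values", {}).get("root_module", {}).get("resources", [])
    )
    for resource in resources:
        if resource.get("type") not in TAGGED_TYPES:
            continue  # the policy only covers the listed resource types
        tags = resource.get("values", {}).get("tags") or {}
        for tag in sorted(REQUIRED_TAGS - set(tags)):
            messages.append(
                f"Resource '{resource['address']}' missing required tag: '{tag}'"
            )
    return messages


# Hypothetical plan fragment for illustration
sample_plan = {
    "planned_values": {
        "root_module": {
            "resources": [
                {
                    "address": "aws_instance.web",
                    "type": "aws_instance",
                    "values": {"tags": {"Environment": "staging", "Project": "myapp"}},
                },
                {
                    "address": "aws_vpc.main",
                    "type": "aws_vpc",
                    "values": {},
                },
            ]
        }
    }
}

for message in missing_tag_violations(sample_plan):
    print(message)
```

With this sample, only `aws_instance.web` is reported, once for `CostCenter` and once for `Owner`, because `aws_vpc` is not among the checked types.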
Compliance and Governance
# .github/workflows/compliance.yml
name: Infrastructure Compliance

on:
  pull_request:
    paths: ['terraform/**']

jobs:
  compliance-scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Run Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: terraform/
          quiet: true
          output_format: sarif
          output_file_path: checkov-results.sarif

      - name: Upload Checkov results to GitHub
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: checkov-results.sarif

      - name: Run TFSec
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          working_directory: terraform/
          format: sarif
          sarif_file: tfsec-results.sarif

      - name: Upload TFSec results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: tfsec-results.sarif

      # Generating a plan needs credentials and an initialized backend
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: '1.6.0'
          # Disable the wrapper so the redirected plan JSON stays clean
          terraform_wrapper: false

      - name: Run OPA Policy Check
        run: |
          # Install OPA
          curl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64_static
          chmod +x opa

          # Generate Terraform plan JSON
          cd terraform/environments/staging
          terraform init
          terraform plan -out=tfplan
          terraform show -json tfplan > plan.json

          # Run policy evaluation
          ../../../opa eval -d ../../../policies/ -i plan.json "data.terraform.security.deny_large_instances"
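The policy step above prints the evaluation result, so a pipeline that should block on violations needs to turn that result into an exit code. One possible approach is sketched below, assuming `opa eval` is run with `--format json` and that the output has the `result[].expressions[].value` shape. The function names and the sample output are hypothetical, and the shape should be confirmed against the OPA version in use.

```python
import json


def collect_denials(opa_output: str) -> list:
    """Gather all denial messages from an `opa eval --format json` result."""
    result = json.loads(opa_output).get("result", [])
    messages = []
    for entry in result:
        for expression in entry.get("expressions", []):
            value = expression.get("value")
            if isinstance(value, list):  # a set of messages is serialized as a list
                messages.extend(str(message) for message in value)
    return sorted(messages)


def exit_code(opa_output: str) -> int:
    """Non-zero when at least one policy denial is present."""
    return 1 if collect_denials(opa_output) else 0


# Hypothetical evaluation output with one violation
sample_output = json.dumps({
    "result": [{
        "expressions": [{
            "value": ["Large instance type 'c5.large' not allowed in 'staging' environment"],
            "text": "data.terraform.security.deny_large_instances",
        }]
    }]
})

print(collect_denials(sample_output))
print(exit_code(sample_output))  # 1
print(exit_code("{}"))           # 0
```

In a workflow, a script like this would read the evaluation output from a file or standard input and pass the returned code to `sys.exit`.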
Infrastructure Monitoring and Observability
Metrics and Alerts
# monitoring.tf
resource "aws_cloudwatch_dashboard" "infrastructure" {
  dashboard_name = "${var.project_name}-infrastructure"

  dashboard_body = jsonencode({
    widgets = [
      {
        type   = "metric"
        x      = 0
        y      = 0
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/ApplicationELB", "RequestCount", "LoadBalancer", aws_lb.main.arn_suffix],
            [".", "TargetResponseTime", ".", "."],
            [".", "HTTPCode_ELB_5XX_Count", ".", "."],
            [".", "HTTPCode_Target_2XX_Count", ".", "."]
          ]
          view    = "timeSeries"
          stacked = false
          region  = var.aws_region
          title   = "Application Load Balancer Metrics"
          period  = 300
        }
      },
      {
        type   = "metric"
        x      = 0
        y      = 6
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/AutoScaling", "GroupDesiredCapacity", "AutoScalingGroupName", aws_autoscaling_group.app.name],
            [".", "GroupInServiceInstances", ".", "."],
            [".", "GroupPendingInstances", ".", "."],
            [".", "GroupTerminatingInstances", ".", "."]
          ]
          view   = "timeSeries"
          region = var.aws_region
          title  = "Auto Scaling Group Metrics"
          period = 300
        }
      }
    ]
  })
}

# SNS Topic for alerts
resource "aws_sns_topic" "infrastructure_alerts" {
  name = "${var.project_name}-infrastructure-alerts"
}

resource "aws_sns_topic_subscription" "email" {
  topic_arn = aws_sns_topic.infrastructure_alerts.arn
  protocol  = "email"
  endpoint  = var.alert_email
}

# CloudWatch Alarms
resource "aws_cloudwatch_metric_alarm" "high_5xx_errors" {
  alarm_name          = "${var.project_name}-high-5xx-errors"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "HTTPCode_ELB_5XX_Count"
  namespace           = "AWS/ApplicationELB"
  period              = 300
  statistic           = "Sum"
  threshold           = 10
  alarm_description   = "This metric monitors 5xx errors from the load balancer"
  alarm_actions       = [aws_sns_topic.infrastructure_alerts.arn]

  dimensions = {
    LoadBalancer = aws_lb.main.arn_suffix
  }

  tags = var.common_tags
}

resource "aws_cloudwatch_metric_alarm" "high_response_time" {
  alarm_name          = "${var.project_name}-high-response-time"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "TargetResponseTime"
  namespace           = "AWS/ApplicationELB"
  period              = 300
  statistic           = "Average"
  threshold           = 1
  alarm_description   = "This metric monitors response time"
  alarm_actions       = [aws_sns_topic.infrastructure_alerts.arn]

  dimensions = {
    LoadBalancer = aws_lb.main.arn_suffix
  }

  tags = var.common_tags
}
Enterprise Use Cases
Legacy Infrastructure Migration
Context: A financial services company with more than 200 physical servers migrating to the cloud.
Strategy:
- Automated inventory of the existing infrastructure
- Creation of reusable Terraform modules
- Phased migration with automated validation
- Governance in place from day one
Results:
- 60% reduction in provisioning time
- 40% reduction in infrastructure costs
- 99.9% uptime during the migration
- Regulatory compliance maintained
Multi-Tenant SaaS Platform
Context: A startup scaling from a single-tenant to a multi-tenant architecture.
Implementation:
- Terraform workspaces for tenant isolation
- Reusable modules with dynamic configuration
- Fully automated CI/CD
- Per-tenant monitoring
Results:
- New customers onboarded in 30 minutes
- 80% reduction in operational effort
- Automatic scaling based on demand
Resources and Recommended Tools
Ecosystem Tools
- Terragrunt - Terraform wrapper for DRY configurations
- Atlantis - Terraform pull request automation
- Infracost - Cloud cost estimation
- TFLint - Terraform linter
Successful adoption of Infrastructure as Code requires cultural change as well as technical change. Organizations that invest in these practices see significant returns in speed, reliability, and operational costs. The path to IaC maturity is incremental, and each step delivers tangible value.