temple illustration

Get started with EC2 Image Builder in Terraform

I can safely assume a lot of engineer’s know of HashCorp’s Packer utility already. Packer is simply an automated virtual machine image template maker, it can create images for all the major cloud providers. It can build Amazon Machine Images (AMI) in AWS or Azure’s Virtual Machine Image. Not too long ago, AWS released their version of automated image builder, called EC2 Image Builder! On this get started with EC2 Image Builder in Terraform I will be showing you how to quickly put together your Terraform code to create an a simple AMI.

Tools

You will need Terraform and if you are deploying this exact code then you’ll need Terragrunt too. Here’s a Setup infrastructure as code environment instructions and if you’re new to Terragrunt then checkout Intro to Terragrunt and Terraform post. I also suggest installing pre-commit too.

EC2 Image Builder Cost

This service doesn’t cost anything but the various resources created by this service could cost you. For example, you have to select your EC2 instance type to run for the during of the AMI creation. It will terminate the EC2 instance once the job is completed. Also, as you know AMI’s have EC2 snapshots, hint cost of storage. You get the point, let’s continue!

Permissions

You will need full permissions on the EC2 Image Builder service.

"imagebuilder:*"

Now the EC2 Image Builder IAM role will need at least the block below. Here I’m creating the policy and role with Terraform. You may need more less, adjust accordingly!

data "aws_iam_policy_document" "image_builder" {
  statement {
    effect = "Allow"
    actions = [
      "ssm:DescribeAssociation",
      "ssm:GetDeployablePatchSnapshotForInstance",
      "ssm:GetDocument",
      "ssm:DescribeDocument",
      "ssm:GetManifest",
      "ssm:GetParameter",
      "ssm:GetParameters",
      "ssm:ListAssociations",
      "ssm:ListInstanceAssociations",
      "ssm:PutInventory",
      "ssm:PutComplianceItems",
      "ssm:PutConfigurePackageResult",
      "ssm:UpdateAssociationStatus",
      "ssm:UpdateInstanceAssociationStatus",
      "ssm:UpdateInstanceInformation",
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel",
      "ec2messages:AcknowledgeMessage",
      "ec2messages:DeleteMessage",
      "ec2messages:FailMessage",
      "ec2messages:GetEndpoint",
      "ec2messages:GetMessages",
      "ec2messages:SendReply",
      "imagebuilder:GetComponent",

    ]
    resources = ["*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "s3:List",
      "s3:GetObject"
    ]
    resources = ["*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "s3:PutObject"
    ]
    resources = ["arn:aws:s3:::${var.aws_s3_log_bucket}/image-builder/*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "logs:CreateLogStream",
      "logs:CreateLogGroup",
      "logs:PutLogEvents"
    ]
    resources = ["arn:aws:logs:*:*:log-group:/aws/imagebuilder/*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "kms:Decrypt"
    ]
    resources = ["*"]
    condition {
      test     = "ForAnyValue:StringEquals"
      variable = "kms:EncryptionContextKeys"

      values = [
        "aws:imagebuilder:arn"
      ]
    }

    condition {
      test     = "ForAnyValue:StringEquals"
      variable = "aws:CalledVia"

      values = [
        "imagebuilder.amazonaws.com"
      ]
    }
  }
}

EC2 Image Builder features

It has automated pipelines! You can set it to build your AMI on a schedule or on-demand. It has a package installer and security components. You can share the AMI across multiple AWS accounts. You can read more about its features at EC2 Image Builder Features.

EC2 Image Builder Pipeline

I’m setting this pipeline to run every Tuesday morning at 8 am. This pipeline will trigger on that schedule and if there are any updates available. I have enabled testing of the image and setting a timeout of 60 minutes.

resource "aws_imagebuilder_image_pipeline" "this" {
  image_recipe_arn                 = aws_imagebuilder_image_recipe.this.arn
  infrastructure_configuration_arn = aws_imagebuilder_infrastructure_configuration.this.arn
  name                             = "amazon-linux-baseline"
  status                           = "ENABLED"
  description                      = "Creates an Amazon Linux 2 image."

  schedule {
    schedule_expression = "cron(0 8 ? * tue)"
    # This cron expressions states every Tuesday at 8 AM.
    pipeline_execution_start_condition = "EXPRESSION_MATCH_AND_DEPENDENCY_UPDATES_AVAILABLE"
  }

  # Test the image after build
  image_tests_configuration {
    image_tests_enabled = true
    timeout_minutes     = 60
  }

  tags = {
    "Name" = "${var.ami_name_tag}-pipeline"
  }
}

EC2 Image Builder Recipe

In the Image Recipe, I’m defining the AMI’s volume size and type, and the components. For this simple example, I’m only installing the CloudWatch agent to the AMI.

resource "aws_imagebuilder_image" "this" {
  distribution_configuration_arn   = aws_imagebuilder_distribution_configuration.this.arn
  image_recipe_arn                 = aws_imagebuilder_image_recipe.this.arn
  infrastructure_configuration_arn = aws_imagebuilder_infrastructure_configuration.this.arn

  depends_on = [
    data.aws_iam_policy_document.image_builder
  ]
}

resource "aws_imagebuilder_image_recipe" "this" {
  block_device_mapping {
    device_name = "/dev/xvdb"

    ebs {
      delete_on_termination = true
      volume_size           = var.ebs_root_vol_size
      volume_type           = "gp3"
    }
  }

  component {
    component_arn = aws_imagebuilder_component.cw_agent.arn
  }

  name         = "amazon-linux-recipe"
  parent_image = "arn:${data.aws_partition.current.partition}:imagebuilder:${data.aws_region.current.name}:aws:image/amazon-linux-2-x86/x.x.x"
  version      = var.image_receipe_version
}

resource "aws_s3_bucket_object" "cw_agent_upload" {
  bucket = var.aws_s3_bucket_object
  key    = "/files/amazon-cloudwatch-agent-linux.yml"
  source = "${path.module}/files/amazon-cloudwatch-agent-linux.yml"
  # If the md5 hash is different it will re-upload
  etag = filemd5("${path.module}/files/amazon-cloudwatch-agent-linux.yml")
}

data "aws_kms_key" "image_builder" {
  key_id = "alias/image-builder"
}

# Amazon Cloudwatch agent component
resource "aws_imagebuilder_component" "cw_agent" {
  name       = "amazon-cloudwatch-agent-linux"
  platform   = "Linux"
  uri        = "s3://${var.aws_s3_bucket_object}/files/amazon-cloudwatch-agent-linux.yml"
  version    = "1.0.0"
  kms_key_id = data.aws_kms_key.image_builder.arn

  depends_on = [
    aws_s3_bucket_object.cw_agent_upload
  ]
}

EC2 Image Builder Infrastructure Configuration

Select the EC2 instance type, the IAM role, security group, subnet, logging bucket, and much more in the infrastructure configuration resource.

resource "aws_imagebuilder_infrastructure_configuration" "this" {
  description           = "Simple infrastructure configuration"
  instance_profile_name = var.ec2_iam_role_name
  instance_types        = ["t2.micro"]
  key_pair              = var.aws_key_pair_name
  name                  = "amazon-linux-infr"
  security_group_ids    = [data.aws_security_group.this.id]

  subnet_id                     = data.aws_subnet.this.id
  terminate_instance_on_failure = true

  logging {
    s3_logs {
      s3_bucket_name = var.aws_s3_log_bucket
      s3_key_prefix  = "image-builder"
    }
  }

  tags = {
    Name = "amazon-linux-infr"
  }
}

EC2 Image Builder Distribution Configuration

Here you can choose to share this AMI with other accounts or it’s just for this account. You can tag the AMI in this resource too.

resource "aws_imagebuilder_distribution_configuration" "this" {
  name = "local-distribution"

  distribution {
    ami_distribution_configuration {
     ami_tags = {
        Project = "IT"
      }

      name = "amzn-linux-{{ imagebuilder:buildDate }}"

      launch_permission {
        user_ids = ["123456789012"]
      }
    }
    region = var.aws_region
  }
}

Tests the AMI

Since I have enabled testing of the AMI EC2 Image Builder, it will create an instance from the AMI automatically.

build and test instances
The build and test instances
The AMI

Complete Code

A lot of other files and code aren’t shown here. You can find a completed working example at https://github.com/masterwali/ec2-image-builder

Get notified when new posts are published, sign up below!

code coding computer cyberspace

AWS Three-Tier VPC with ALB in Terraform

This AWS Three-Tier VPC with ALB in Terraform is the second part of AWS Three-Tier VPC network with Terraform. In the first post I had created many of the VPC components; such as the VPC, app subnets, web subnets, data subnets, route tables for each subnet, internet and NAT gateways, NACLs for each subnet, and a generic security group. In this post I’ll reveal the Terraform code for creating an Elastic Load Balancer, specifically the Application Load Balancer (ALB). The ALB will require a listener, so I’ll add that too. For simplicity purposes the listener will monitor a non-secure port 80. Remember to use certificates and port 443 in production!

Cost

Charges may occur on various resources created using this module. The load balancers have a low hourly price set.

The design

Here’s the original diagram from the previous post.

three-tier vpc diagram

ALB (Application Load Balancer)

The ALB diagram. The ALB does a health check on port 80 on every instance in the target group. Whichever instance is healthy it will use that. In this drawing there’s only one web server; your production should have at least two that are in different availability zones.

alb diagram
The ALB

ALB Terraform Code

# ALB for the web servers
resource "aws_lb" "web_servers" {
  name               = format("%s-alb", var.vpc_name)
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.web.id]
  subnets            = aws_subnet.public.*.id
  enable_http2       = false
  enable_deletion_protection = true

  tags = {
    Name = format("%s-alb", var.vpc_name)
  }
}

This is an example of an external ALB. Note the “internal = false” statement. The load balancer type is what makes this an application load balancer. The other options are network or gateway. Since this is an ALB, it does require a security group. This external load balancer will be required to be in public subnets so outside users can reach the web servers which are hosted in private subnets. That’s one of the design features that makes this a three-tier VPC! See the Terraform documentation on more options.

Target Groups

Let’s make the ALB useful by creating a target group and a load balancer listener.

# Target group for the web servers
resource "aws_lb_target_group" "web_servers" {
  name     = "sharepoint-web-servers-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.this.id
}

resource "aws_lb_listener" "front_end" {
  load_balancer_arn = aws_lb.web_servers.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web_servers.arn
  }
}

An empty target group is not useful either. From now on when you launch a new EC2 web server be sure to add it to the target group like this.

# Find the target group
data "aws_lb_target_group" "web_servers" {
  name = "sharepoint-web-servers-tg"
}

# Attach an EC2 instance to the target group on port 80
resource "aws_lb_target_group_attachment" "web" {
  target_group_arn = aws_lb_target_group.arn
  target_id        = aws_instance.web.id
  port             = 80
}
target group registered targets
EC2 added to a target group

Notice the URL is using the ALB’s DNS record to reach the Nginx web server.

website
Nginx website using the ALB URL

Here’s the complete terraform module code: https://github.com/masterwali/tf-module-aws-three-tier-network-vpc

Here’s an example on how to call or reference the module.

module "sharepoint_network" {
  source = "git@github.com:masterwali/tf-module-aws-three-tier-network-vpc.git"
  
  aws_cli_profile = var.aws_cli_profile
  additional_tags = var.additional_tags
}

Subscribe to the newsletter for more tutorials!

Full Disclosure: I am an AWS employee, this post is opinion of my own.

web text

AWS Three-Tier VPC network with Terraform

A three-tier network is an enterprise architecture to deliver the best performance and security to the end-users. Each component of the design is separated into tiers. Reminder, a typical three-tier network consists of a website then the application then the database from an end-user perspective. Not every website automatically works like that. The developers and engineers have to build the web application by separating the user interface from the logic and data. An AWS three-tier VPC network is not too difficult to build in the cloud, either. However, in this post, I’ll be using Terraform and Terragrunt to build and deploy an AWS three-tier VPC network using, of course, VPC, subnets, route tables, network access control lists (NACLs), and few other VPC parts. Next, I’ll share with you how to create the AWS application load balancer (ALB) and the target groups with health checks in another post.

The design

This will be a simple start and in future posts, I’ll add more details. This AWS three-tier VPC network module will create a VPC, subnets, Network Access Control Lists (NACLs), Internet Gateway, NAT Gateways, route tables, Elastic IPs, and few other resources using Terraform and I’ll deploy it with Terragrunt.

Notice there are two NAT gateways, this provides high availability and fault tolerance. This means if the NAT Gateway in the availability zone (AZ) A fails or gets corrupted then the EC2 instances in AZ B will still be able to function as expected. It’s a little bit more costly… it all depends on your requirements.

The Terraform Module

This module will be generic so I can reuse the three-tier VPC network over and over again. By creating a module it makes my main code a lot less, therefore it will be a lot cleaner to view and understand.

What’s Terraform and Terragrunt? Well visit my Intro to Terragrunt and Terraform post first then come back here! You can name your module anything you like… I named my three-tier-vpc. Reminder in Terraform we can use one or more .tf files to build a module. I’ll be separating this module into few different Terraform files just for organizational purposes. Here’s my structure.

aws/tf-modules/three-tier-vpc
               ├── README.md
               ├── gateways.tf
               ├── main.tf
               ├── nacls.tf
               ├── outputs.tf
               ├── routes.tf
               ├── sec-grps.tf
               └── vars.tf

AWS VPC

Let’s start with the main.tf file which contains the VPC resource. The VPC CIDR will be a variable so we can plugin any CIDR during deployment.

# Create the VPC
resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  instance_tenancy     = "default"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = merge(
    var.additional_tags,
    {
      Name = "${var.vpc_name}-vpc"
    }
  )
}

In my module I’m giving the vpc_cidr a default value but you’re not required to do so. This default won’t hurt because if this network range is already taken then terraform apply will fail cleanly.

# vars.tf
variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

The public subnet CIDR blocks will be variable too. Never hardcode values in modules that could or should be changed. The public_subnet_cidrs is a list of one or many CIDR blocks. If we provide one string value in the block then it will create one public subnet. If we provide four CIDR blocks then it will create four, how sweet is that!

# Create the public subnets
resource "aws_subnet" "public" {
  count = length(var.public_subnet_cidrs)

  vpc_id                  = aws_vpc.this.id
  cidr_block              = var.public_subnet_cidrs[count.index]
  availability_zone       = "${var.aws_region}${var.zones[count.index]}"
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.vpc_name}-public-subnet-${var.zones[count.index]}"
  }
}

The map_public_ip_on_launch attribute set to true is one of the configurations that makes this a public subnet. Just because we named it public doesn’t make it public. Next are the private subnets. Note the map_public_ip_on_launch is now set to false.

# Create the web subnets
resource "aws_subnet" "web" {
  count = length(var.web_subnet_cidrs)

  vpc_id                  = aws_vpc.this.id
  cidr_block              = var.web_subnet_cidrs[count.index]
  availability_zone       = "${var.aws_region}${var.zones[count.index]}"
  map_public_ip_on_launch = false

  tags = {
    Name = "${var.vpc_name}-web-subnet-${var.zones[count.index]}"
  }
}

The rest of the main.tf file contains the resources for the data and the app subnets, these subnets are non-public too.

VPC Gateways

In this three-tier VPC network architecture, we have only a few public subnets (meaning the instances launched in the public subnets will get Public IPs; which makes them routable without a NAT gateway). The public subnets will need an internet gateway to complete the design. The private or non-public subnets will need a NAT gateway to allow instances with just private IPs to communicate to the internet (AKA outbound internet access). NAT gateways need 1) a public IP, 2) added to your public subnet for each availability zone you are using!

# Add internet gateway
resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id

  tags = {
    Name = "${var.vpc_name}-internet-gateway"
  }
}

# Charges may occur

# Reserve EIPs
resource "aws_eip" "nat_a" {
  vpc = true

  tags = {
    Name = "${var.vpc_name}-eip-nat-a"
  }

}

# NAT Gateway in AZ A
resource "aws_nat_gateway" "zone_a" {
  allocation_id = aws_eip.nat_a.id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.vpc_name}-nat-gateway-aza"
  }

  depends_on = [
    aws_subnet.public
  ]
}

# Reserve EIPs
resource "aws_eip" "nat_b" {
  vpc = true

  tags = {
    Name = "${var.vpc_name}-eip-nat-b"
  }

}

# NAT Gateway in AZ B
resource "aws_nat_gateway" "zone_b" {
  allocation_id = aws_eip.nat_b.id
  subnet_id     = aws_subnet.public[1].id

  tags = {
    Name = "${var.vpc_name}-nat-gateway-azb"
  }

  depends_on = [
    aws_subnet.public
  ]
}

Note one of the comments in the code, EIP’s and the gateways may cost you for the duration of their existence.

VPC Routes

Every time a VPC is created the main route table is automatically provisioned, too. I’m going to tag it and create my own route tables for this module. Do note I do have some hardcoded values in this module. I have decided this will be for two AZ’s and no more so I hardcoded the index value of zero and one in the resources below.

# Tag the main route table
resource "aws_ec2_tag" "main_route_table" {
  resource_id = aws_vpc.this.main_route_table_id
  key         = "Name"
  value       = "${var.vpc_name}-main-route-table"
}

# Create route table for the public subnets
# Uses IG
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.this.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.this.id
  }

  tags = {
    Name = "${var.vpc_name}-public-route-table"
  }

  depends_on = [
    aws_internet_gateway.this
  ]
}

# Associate the public subnets with the public route table
resource "aws_route_table_association" "public" {
  count = length(var.public_subnet_cidrs)

  subnet_id      = element(aws_subnet.public.*.id, count.index)
  route_table_id = aws_route_table.public.id
}

# Create a route table for the web and app subnets in AZ A
# Uses NAT gateway in AZ A
resource "aws_route_table" "private_aza" {
  vpc_id = aws_vpc.this.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.zone_a.id
  }

  tags = {
    Name = "${var.vpc_name}-private-route-table-aza"
  }

  depends_on = [
    aws_nat_gateway.zone_a
  ]
}

# Create a route table for the web and app subnets in AZ B
# Uses NAT gateway in AZ B
resource "aws_route_table" "private_azb" {
  vpc_id = aws_vpc.this.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.zone_b.id
  }

  tags = {
    Name = "${var.vpc_name}-private-route-table-azb"
  }

  depends_on = [
    aws_nat_gateway.zone_b
  ]
}

resource "aws_route_table" "data" {
  vpc_id = aws_vpc.this.id

  tags = {
    Name = "${var.vpc_name}-data-route-table"
  }
}

# Associate these subnets with the private route tables accordingly 
resource "aws_route_table_association" "web_aza" {
  subnet_id      = aws_subnet.web[0].id
  route_table_id = aws_route_table.private_aza.id
}

resource "aws_route_table_association" "app_aza" {
  subnet_id      = aws_subnet.app[0].id
  route_table_id = aws_route_table.private_aza.id
}

resource "aws_route_table_association" "web_azb" {
  subnet_id      = aws_subnet.web[1].id
  route_table_id = aws_route_table.private_azb.id
}

resource "aws_route_table_association" "app_azb" {
  subnet_id      = aws_subnet.app[1].id
  route_table_id = aws_route_table.private_azb.id
}

resource "aws_route_table_association" "data" {
  count = length(var.data_subnet_cidrs)

  subnet_id      = element(aws_subnet.data.*.id, count.index)
  route_table_id = aws_route_table.data.id
}

The routes are another set of configurations that differentiates public and internal sub-network. In the public subnet route table, all non-local traffic will be sent to the internet gateway. A private route table sends its non-local traffic to the NAT gateway which then routes it to the internet gateway and back.

VPC NACLs

Now it’s time to control which traffic is allowed or denied to this network. You may need to modify these rules to meet your requirements. For most web application projects, the public NACLs will most likely look like this.

# Public NACLS
resource "aws_network_acl" "public" {
  vpc_id     = aws_vpc.this.id
  subnet_ids = [aws_subnet.public[0].id, aws_subnet.public[1].id]

  # Ingress rules
  # Allow all local traffic
  ingress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = aws_vpc.this.cidr_block
    from_port  = 0
    to_port    = 0
  }

  # Allow HTTPS traffic from the internet
  ingress {
    protocol   = "6"
    rule_no    = 105
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 443
    to_port    = 443
  }

  # Allow HTTP traffic from the internet
  ingress {
    protocol   = "6"
    rule_no    = 110
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 80
    to_port    = 80
  }

  # Allow the ephemeral ports from the internet
  ingress {
    protocol   = "6"
    rule_no    = 120
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  ingress {
    protocol   = "17"
    rule_no    = 125
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  # Egress rules
  # Allow all ports, protocols, and IPs outbound
  egress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  tags = {
    Name = "${var.vpc_name}-public-nacl"
  }

  depends_on = [aws_subnet.public]
}

The web subnet NACLs for this module.

resource "aws_network_acl" "web" {
  vpc_id     = aws_vpc.this.id
  subnet_ids = [aws_subnet.web[0].id, aws_subnet.web[1].id]

  # Ingress rules
  # Allow all local traffic
  ingress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = aws_vpc.this.cidr_block
    from_port  = 0
    to_port    = 0
  }

  # Allow HTTP web traffic from anywhere
  ingress {
    protocol   = 6
    rule_no    = 105
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 80
    to_port    = 80
  }

  # Allow HTTPS web traffic from anywhere
  ingress {
    protocol   = 6
    rule_no    = 110
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 443
    to_port    = 443
  }

  # Allow the ephemeral ports from the internet
  ingress {
    protocol   = "6"
    rule_no    = 120
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  ingress {
    protocol   = "17"
    rule_no    = 125
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  # Egress rules
  # Allow all ports, protocols, and IPs outbound
  egress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  tags = {
    Name = "${var.vpc_name}-web-nacl"
  }
}

The App and Data subnet NACLs have been set to allow only local traffic. You may adjust these rules.

VPC Security Group

A default security group is created every time a new VPC is provisioned. Here I’ll just give it some tags and few generic rules.

# Modify the default security group
resource "aws_default_security_group" "this" {
  vpc_id = aws_vpc.this.id

  dynamic "ingress" {
    for_each = var.default_security_group_ingress
    content {
      self        = lookup(ingress.value, "self", null)
      cidr_blocks = compact(split(",", lookup(ingress.value, "cidr_blocks", "")))
      description = lookup(ingress.value, "description", null)
      from_port   = lookup(ingress.value, "from_port", 0)
      to_port     = lookup(ingress.value, "to_port", 0)
      protocol    = lookup(ingress.value, "protocol", "-1")
    }
  }

  dynamic "egress" {
    for_each = var.default_security_group_egress
    content {
      self        = lookup(egress.value, "self", null)
      cidr_blocks = compact(split(",", lookup(egress.value, "cidr_blocks", "")))
      description = lookup(egress.value, "description", null)
      from_port   = lookup(egress.value, "from_port", 0)
      to_port     = lookup(egress.value, "to_port", 0)
      protocol    = lookup(egress.value, "protocol", "-1")
    }
  }

  tags = merge(
    {
      Name = format("%s-default-security-group", var.vpc_name)
    },
    var.additional_tags
  )
}

Now the values for this security group are passed as a variable like so. Be sure to change the ports and protocols to meet your needs.

variable "default_security_group_ingress" {
  description = "List of maps of ingress rules to set on the default security group"
  type        = list(map(string))
  default = [
    {
      cidr_blocks = "10.0.0.0/16"
      description = "Allow all from the local network."
      from_port   = 0
      protocol    = "-1"
      self        = false
      to_port     = 0
    },
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all HTTPS from the internet."
      from_port   = 443
      protocol    = "6"
      self        = false
      to_port     = 443
    },
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all HTTP from the internet."
      from_port   = 80
      protocol    = "6"
      self        = false
      to_port     = 80
    },
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all ephemeral ports from the internet."
      from_port   = 32768
      protocol    = "6"
      self        = false
      to_port     = 60999
    }
  ]
}

variable "default_security_group_egress" {
  description = "List of maps of egress rules to set on the default security group"
  type        = list(map(string))
  default = [
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all"
      from_port   = 0
      protocol    = "-1"
      self        = false
      to_port     = 0
    }
  ]
}

Here’s a link to the complete code so far in the dev branch. That’s it all for now! In the next post, we’ll add application load balancers, target groups, listeners, etc. For a more in-depth explanation of VPC resources checkout AWS technical documentation. Be sure to subscribe for more content like this!

AWS Service Control Policies with Terraform

AWS Organizations

A cloud service designed to centralize & manage AWS accounts and to roll up billing from multiple AWS accounts into a single account. May be referred to as the “master” account because it can manage permissions of all its accounts that are “attached” to it. “Billing” is another name for this account because it’s the account that gets the invoice or the monthly charge. You can select any commercial AWS account to play this role. The account becomes this master account as soon as you join the master’s accounts organization. Next comes AWS’s Service Control Policies, this feature allows permission management for all your AWS accounts in your organization. In this post, I’ll share with you how to implement AWS’s service control policies with Terraform! Let’s break it down.

How to join an AWS Organization

Check out my previous post on the details.

AWS Organization Units

This has nothing to do with Microsoft’s Active Directory (AD) organization units. There’s no integration between MS AD and this, either. AWS’s organization units are a way of grouping AWS accounts so we can manage accounts permissions in groups or in a hierarchical format. There are dozens of ways to create this hierarchy and it all depends on your objective. You can group them by projects, departments, missions, environments, classifications, etc.

Simple OU structure by environments

our simple structure
AWS OU by environments

Service Control Policies with Terraform

Service Control Policies (SCP) is a critical feature to learn and understand. Questions related to this feature is a topic on many, many AWS certifications. This feature is highly useful and it’s used a lot in the real world. It’s these policies that can allow or deny actions or services at a high level. What I mean by “high level” is outside of the AWS’s account. For example; let’s say you want to experiment with the most expensive EC2 instance type and let’s say you also have IAM permissions to allow those actions. Then you try to launch an EC2 with an instance type of p4d.24xlarge and bam you get the encoded authorized failure message! How?! You have given yourself full permissions! Bingo… it’s the service control policies. Even with a full administration policy on the account, you can still be denied by SCP.

It’s best practice to enable SCP and create the OUs, and policies and attach the policies even before building your systems! For information on SCP checkout AWS’s documents.

Show me the Terraform code!

I personally like to break down the main.tf terraform file into separate manageable terraform files. Here’s my file structure. Notice there’s only one environment aka account “master” here. Yes, I also use terragrunt.

org/
├── README.md
├── dev-ou.tf
├── main.tf
├── master
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── prod-ou.tf
├── root-ou.tf
├── staging-ou.tf
└── terragrunt.hcl

main.tf this file usually contains common code. Enable SCP by adding the “SERVICE_CONTROL_POLICY” to the enabled_policy_types array.

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

 terraform {
   backend "s3" {}
 }

# Provides a resource to create an AWS organization.
resource "aws_organizations_organization" "this" {

  # List of AWS service principal names for which 
  # you want to enable integration with your organization
  aws_service_access_principals = [
    "cloudtrail.amazonaws.com",
    "config.amazonaws.com",
  ]

  feature_set = "ALL"

  enabled_policy_types = [
    "TAG_POLICY",
    "SERVICE_CONTROL_POLICY"
  ]
}

my root-out.tf contains the master account code and all the service control policies that I want to be applied to all accounts. Notice it’s attached to the root OU which then it’s inherited by all the accounts below the root OU!

resource "aws_organizations_account" "master" {
  # A friendly name for the member account
  name  = "my-master"
  email = "mymaster@email.com"

  # Enables IAM users to access account billing information 
  # if they have the required permissions
  # iam_user_access_to_billing = "ALLOW"

  tags = {
    Name  = "my-master"
    Owner = "Waleed"
    Role  = "billing"
  }

  parent_id = aws_organizations_organization.this.roots[0].id
}

# ---------------------------------------- # 
# Service Control Policies for all accounts
# ---------------------------------------- #

# ---------------------------- #
# REGION RESTRICTION 
# ---------------------------- #

data "aws_iam_policy_document" "restrict_regions" {
  statement {
    sid       = "RegionRestriction"
    effect    = "Deny"
    actions   = ["*"]
    resources = ["*"]

    condition {
      test     = "StringNotEquals"
      variable = "aws:RequestedRegion"

      values = [
        "us-east-1"
      ]
    }
  }
}

resource "aws_organizations_policy" "restrict_regions" {
  name        = "restrict_regions"
  description = "Deny all regions except US East 1."
  content     = data.aws_iam_policy_document.restrict_regions.json
}

resource "aws_organizations_policy_attachment" "restrict_regions_on_root" {
  policy_id = aws_organizations_policy.restrict_regions.id
  target_id = aws_organizations_organization.this.roots[0].id
}

# ---------------------------- #
# EC2 INSTANCE TYPE RESTRICTION 
# ---------------------------- #

data "aws_iam_policy_document" "restrict_ec2_types" {
  statement {
    sid       = "RestrictEc2Types"
    effect    = "Deny"
    actions   = ["ec2:RunInstances"]
    resources = ["arn:aws:ec2:*:*:instance/*"]

    condition {
      test     = "StringNotEquals"
      variable = "ec2:InstanceType"

      values = [
        "t3*",
        "t4g*",
        "a1.medium",
        "a1.large"
      ]
    }
  }
}

resource "aws_organizations_policy" "restrict_ec2_types" {
  name        = "restrict_ec2_types"
  description = "Allow certain EC2 instance types only."
  content     = data.aws_iam_policy_document.restrict_ec2_types.json
}

resource "aws_organizations_policy_attachment" "restrict_ec2_types_on_root" {
  policy_id = aws_organizations_policy.restrict_ec2_types.id
  target_id = aws_organizations_organization.this.roots[0].id
}

# ---------------------------- #
# REQUIRE EC2 TAGS 
# ---------------------------- #

data "aws_iam_policy_document" "require_ec2_tags" {
  statement {
    sid    = "RequireTag"
    effect = "Deny"
    actions = [
      "ec2:RunInstances",
      "ec2:CreateVolume"
    ]
    resources = [
      "arn:aws:ec2:*:*:instance/*",
      "arn:aws:ec2:*:*:volume/*"
    ]

    condition {
      test     = "Null"
      variable = "aws:RequestTag/Name"

      values = ["true"]
    }
  }
}

resource "aws_organizations_policy" "require_ec2_tags" {
  name        = "require_ec2_tags"
  description = "Name tag is required for EC2 instances and volumes."
  content     = data.aws_iam_policy_document.require_ec2_tags.json
}

resource "aws_organizations_policy_attachment" "require_ec2_tags_on_root" {
  policy_id = aws_organizations_policy.require_ec2_tags.id
  target_id = aws_organizations_organization.this.roots[0].id
}

Here’s an authorization failure message when I attempted to create an EC2 with an instance type that was not approved in the SCP defined above!

The encoded authorization failure message from the console.
Decoded message shows the deny statement from the SCP
Failed to create the volume because it now requires a Name tag in order to create a volume.
Name tag denied statement from SCP is shown once the failure message is decoded.

Account or OU specific SCP

What if you want to restrict certain actions or services on a single account. The code below shows how to block the internet in this environment. This says to create the prod account, create the SCP and only attach it directly to this OU; and this OU only has the prod account. To attach an SCP to an account only check the documentation.

prod-ou.tf

resource "aws_organizations_account" "prod" {
  # A friendly name for the member account
  name  = "my-prod"
  email = "my-prod@email.com"

  # Enables IAM users to access account billing information 
  # if they have the required permissions
  iam_user_access_to_billing = "ALLOW"

  tags = {
    Name  = "my-prod"
    Owner = "Waleed"
    Role  = "prod"
  }

  parent_id = aws_organizations_organizational_unit.prod.id
}

resource "aws_organizations_organizational_unit" "prod" {
  name      = "prod"
  parent_id = aws_organizations_organization.this.roots[0].id
}

# ------------------------------- #
# PREVENT INTERNET ACCESS TO A VPC 
# ------------------------------- #

data "aws_iam_policy_document" "block_internet" {
  statement {
    sid    = "BlockInternet"
    effect = "Deny"
    actions = [
      "ec2:AttachInternetGateway",
      "ec2:CreateInternetGateway",
      "ec2:CreateEgressOnlyInternetGateway",
      "ec2:CreateVpcPeeringConnection",
      "ec2:AcceptVpcPeeringConnection",
      "globalaccelerator:Create*",
      "globalaccelerator:Update*"
    ]
    resources = ["*"]

  }
}

resource "aws_organizations_policy" "block_internet" {
  name        = "block_internet"
  description = "Block internet access to the production network."
  content     = data.aws_iam_policy_document.block_internet.json
}

resource "aws_organizations_policy_attachment" "block_internet_on_prod" {
  policy_id = aws_organizations_policy.block_internet.id
  target_id = aws_organizations_organizational_unit.prod.id
}

I think you get the idea, go forth and implement governance with AWS’s organizations and Service Control Policies! Subscribe for more cloud code!

AWS CloudShell

AWS CloudShell was just announced this month (December 2020). Let’s go over what is AWS CloudShell, what are its use cases, when you shouldn’t use it, and much more.

What is AWS CloudShell?

We know this feature is not new to the cloud. We know Azure and Google Cloud engineers are familiar with this. As a AWS fanatic, this is great new feature!

  • AWS CloudShell’s permissions are managed by IAM
  • Inactive and long-running sessions are automatically stopped and recycled
  • Create start-up scripts
  • Available on the latest versions of Google Chrome, Mozilla Firefox, Microsoft Edge, and Apple Safari
  • Sudo privileges to install and modify your session
  • Since it’s Amazon Linux 2, you can use yum to manage its packages

AWS CloudShell IAM permissions

By default you may not have the appropriate IAM permissions to use the CloudShell. You may see various unauthorized error messages if you attempt to launch it. Example error message:”Unable to start the environment…is not authorized to perform: cloudshell:{action} on resource…” Going with the principle of least privilege, here’s what the minimum IAM permissions required to get started!

{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowUsersCloudShell",
        "Effect": "Allow",
        "Action": [
            "cloudshell:CreateSession",
            "cloudshell:CreateEnvironment",
            "cloudshell:GetEnvironmentStatus",
            "cloudshell:PutCredentials"
        ],
        "Resource": "*"
    }]
}

Additionally you can allow your engineers to upload and download files from/to the user’s local machine to the AWS CloudShell! Just add these IAM actions.

cloudshell:GetFileDownloadUrls
cloudshell:GetFileUploadUrls

AWS CloudShell Cost

Zero! Nada! The shell itself and its storage does not incur any charges but of course any resources you create from it may.

Launch AWS CloudShell

Currently, at the time of this writing (December 2020) it’s available in these regions.

  • US East (Ohio)
  • US East (N. Virginia)
  • US West (Oregon)
  • Asia Pacific (Tokyo)
  • Europe (Ireland)

Once logged into the AWS console, there’s a new icon on the left of the Notifications icon (bell). Click the icon with the greater than and underscore icon .

AWS CloudShell
AWS CloudShell

It’s loads your IAM permissions for you automatically! Next let’s run some AWS CLI commands… and notice we don’t have to provide the region or profile! My user account is allowed to list IAM users so this works right away.

$ aws iam list-users

# result
{
    "Users": [
        {
            "Path": "/",
            "UserName": "waleed",
            "UserId": "ABCAEFEFAEAFEAFE1",
            "Arn": "arn:aws:iam::1234567890:user/waleed",
            "CreateDate": "2020-12-16T02:05:25+00:00",
            "PasswordLastUsed": "2020-12-16T02:08:50+00:00"
        }
    ]
}

Switch shell

By default it’s the bash shell represented by the dollar ($) sign. Switch to PowerShell by typing pwsh. PowerShell is represented by the letters PS. Finally if you want to use Z shell enter zsh at any time. Z shell is represented by the percentage (%) symbol. If you want to switch back to Bash, enter bash.

Actions and options

Pretty simple and self-explanatory.

What’s it good for?

  • Ad-hoc actions to query information
  • Check permissions and such incase if your AWS CLI profile is broken on your workstation or elsewhere
  • Learning environment

What’s it not good for?

  • Long term development
  • CI/CD
  • Production workloads
  • It’s your responsibility to manage all user installed software/packages!
  • You will not be able to access your private EC2s, it’s a not VPN solution.
  • Users can access the internet, this may not be allowed in your organization

That’s all for now, read more about AWS CloudShell. Subscribe for more tutorial like this!

TOP 13 CLOUD ENGINEER POSITION INTERVIEW QUESTIONS AND ANSWERSBe prepared for you interview!
yellow letter tiles

Intro to Terragrunt and Terraform

Terragrunt is a command line interface tool to make Terraform better or build a better infrastructure as code pipeline. Terragrunt is built on the concept DRY. As their website states, DRY stands for “Don’t Repeat Yourself”. Terragrunt can help with structuring your code directories where you can write the Terraform code once and apply the same code with different variables and different remote state locations for each environment. Another useful feature of Terragrunt is before and after hooks. If you are developer you know you’ll need this feature at some point. Let’s get started with intro to Terragrunt and Terraform.

Installing Terraform and Terragrunt

https://cloudly.engineer/2020/setup-infrastructure-as-code-environment/aws/

Terraform code: resource

Let’s define the anatomy of a Terraform resource.

resource "aws_instance" "this" {
  ami           = data.aws_ami.centos.id
  instance_type = "t3a.medium"
}

Let’s break down this small piece of code.

  1. resource (no quotes) is a reserved keyword; means create or to ensure this type of resource exists
  2. aws_instance” (double quotes with underscores) is the type of a resource you want to create. Here’s a list of them available today.Terraform attempts to always be up to date but it could be missing resource types or some features of a resource. Most of the time, it has all the core resource types and options available.
  3. this (double quotes with underscores) The last part of the first line is a name you want to give this resource for Terraform’s state file. My best practice is to always use “this” unless you have multiple of the same resource then be specific but don’t put the resource type in this name. That’s redundant nonsense.
  4. Within the braces it’s always one or more required or optional variables. They diff from resource to resource.

    If you want to learn more Terraform click here.

Terragrunt Code: Deploy Resources

Terraform code just defines our infrastructure as code, so then we need Terragrunt to help with the multiple environment deployment. Now the combination of these two will prevent us from repeating our code for however many AWS accounts we have. Below is my Terragrunt project for my AWS account “settings” repository.

└── settings
    ├── README.md
    ├── dev
    │   └── terragrunt.hcl
        └── inputs.yml
        └── vars.tf
    ├── qa
    │   └── terragrunt.hcl
        └── inputs.yml
        └── vars.tf
    ├── sec
    │   └── terragrunt.hcl
        └── inputs.yml
        └── vars.tf
    ├── prod
    │   └── terragrunt.hcl
        └── inputs.yml
        └── vars.tf
    └── terragrunt.hcl
    └── inputs.yml
    └── vars.tf

terragrunt.hcl The root terragrunt.hcl and environment terragrunt.hcl files are a must
inputs.yml This yml file will contain variables specific to that environment, such as AWS CLI profile name
vars.tf Terraform files for each environment and one common vars.tf for all deployments

This separation of projects allows each environments to have different Terraform versions of your code at the same time. Let’s continue to see what I mean.

Main ‘terragrunt.hcl’

remote_state {
  backend = "s3"
  config = {
    bucket  = "bucket-name-for-terraform-state"
    key     = "${path_relative_to_include()}/terraform.tfstate"
    region  = local.local_inputs.aws_region,
    profile = local.local_inputs.aws_cli_profile
    encrypt = true
  }
}

locals {
  local_inputs  = yamldecode(file("${get_terragrunt_dir()}/inputs.yml"))
  global_inputs = yamldecode(file("${get_terragrunt_dir()}/inputs.yml"))
}

inputs = merge(local.global_inputs, local.local_inputs)

remote state
I’ll be storing the Terraform state file in Amazon S3.

  1. key All of my environments/accounts Terraform state files will be stored in one AWS S3 bucket separated by environments using the directory names. The Terragrunt function path_relative_to_include() is going to help with that. But you can pass a different bucket for each environment through the local inputs.
  2. profile Since we’ll have several AWS accounts and profiles, this value will be dynamic and passed in from the environments input file.
  3. I think the rest are obvious.

locals

  1. local_inputs During Terraform plan or apply it grabs the variables for the environment to create Terraform files for that specific environment in the .terragrunt local cache directory.
  2. global_inputs contains variables that are common for all environments (if needed)

dev ‘terragrunt.hcl’

include {
  path = find_in_parent_folders()
}

terraform {
  source = "git@giturl.com:path/to/tf-modules/settings.git?ref=dev"
}

This says ‘hey go fetch the TF code from this URL but only the dev branch’. Also says ‘The Terraform backend configuration is in the main terragrunt.hcl file’. Next let’s init Terragrunt. If you haven’t already created the S3 bucket for your state file it will request to create.

I’ll be using aws cli profiles, this assumes you have already set this up.
AWS Permissions Required

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowS3ForTerraform",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketVersioning",
                "s3:CreateBucket"
            ],
            "Resource": "arn:aws:s3:::YOUR-TF-BUCKET-NAME-HERE"
        },
        {
            "Sid": "AllowDownloadNUploadtoPrefix",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::YOUR-TF-BUCKET-NAME-HERE/*"
        }
    ]
}

dev ‘inputs.yml’

aws_cli_profile: "your-env-aws-cli-profile-name"
aws_region: "us-east-1"
other_var: "other-value"

dev ‘vars.tf’

variable aws_account_alias {
  default = "acct-nickname-here"
}

variable aws_region {}

variable aws_cli_profile {}

variable other_var {}

As said before, this is environment specific values. Let’s initiate already!

cd settings/dev/
terragrunt init
Output

----------------------------------------------------------------------
.....
.....
Plan: 1 to add, 0 to change, 0 to destroy.

---------------------------------------------------------------------

Then ‘terragrunt apply’. Apply the configuration and verify.

Terragrunt cache

Don’t put terragrunt cache in git. Add the following to your .gitignore file for your Terragrunt repositories.

*.terraform*
*.terragrunt*

If you update your Terraform repository you’ll have to update your Terragrunt too to fetch the latest code. You can do that with this additional argument.

terragrunt init --terragrunt-source-update

An alternative design with Terragrunt and Terraform

In this alternative design structure you can have a main.tf file at the root of your project. This main.tf can contain all your main code in one place instead of having to create and manage several different git repositories and branches. See the structure example below. Here’s full working code example of this design.

├── README.md
├── dev
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── prod
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── qa
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── sec
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── terragrunt.hcl
├── main.tf
└── web-server-policy.tf

Create an EC2 IAM role with Terraform

There are hundreds of posts online about what is EC2 IAM role so I’m not going to discuss that here. Here we will develop the code to create an EC2 IAM role with Terraform and deploy with Terragrunt. I’m assuming you know what both of these tools are… if not checkout my introduction posts on how to setup your development environment. Or if you need a introduction to Terraform modules and how to use them click here. Do note that an EC2 IAM role can only be used for EC2 instances. The IAM policies can be shared with other resources or services though.

This Terraform module creates AWS IAM policy then creates IAM role specifically designed to be used by EC2 instances. After that it attaches the IAM role to the EC2 instance profile. Lastly attaches the IAM policy to the EC2 IAM role. Remember every IAM role needs a set of policies (permissions).

Terraform EC2 IAM role module

ece iam role Terraform module
Module structure

Here’s the main.tf file of the module.

# Create the AWS IAM role. 
resource "aws_iam_role" "this" {
  name = var.ec2_iam_role_name
  path = "/"

  assume_role_policy = var.assume_role_policy
}

# Create AWS IAM instance profile
# Attach the role to the instance profile
resource "aws_iam_instance_profile" "this" {
  name = var.ec2_iam_role_name
  role = aws_iam_role.this.name
}

# Create a policy for the role
resource "aws_iam_policy" "this" {
  name        = var.ec2_iam_role_name
  path        = "/"
  description = var.policy_description
  policy      = var.policy
}

# Attaches the policy to the IAM role
resource "aws_iam_policy_attachment" "this" {
  name       = var.ec2_iam_role_name
  roles      = [aws_iam_role.this.name]
  policy_arn = aws_iam_policy.this.arn
}

As always there’s a vars.tf

variable ec2_iam_role_name {
  type = string

  validation {
    condition     = length(var.ec2_iam_role_name) > 4 && substr(var.ec2_iam_role_name, 0, 4) == "svc-"
    error_message = "The ec2_iam_role_name value must be a valid IAM role name, starting with \"svc-\"."
  }
}

variable policy_description {
  type = string

  validation {
    condition     = length(var.policy_description) > 4
    error_message = "The policy_description value must contain more than 4 characters."
  }
}

variable assume_role_policy {}

variable policy {}

The module full code: https://github.com/masterwali/tf-module-iam-ec2-role

Terraform variable type constraints

If you noticed the “type” keyword in the variable declaration, that’s to ensure the kind of answer matches exactly what the module expects. This way someone doesn’t give an integer to a string type variable, it may break the Terraform apply but it may not error during Terraform plan. So we want to catch problems like that as soon as possible with type constraints. Now all the validations are done during the Terraform plan.

Terraform variable validation

Terraform starting with version 0.13.x released this new capability to apply validation on the answers provided to the variables. If you noticed the validation blocks above; the ‘ec2_iam_role_name’ var checks for both the length and what should be the starting characters. This is excellent way to ensure everyone’s following the naming conventions your organizations have created. The second validation on the ‘policy_description’ is just ensuring the provided value is more than 4 characters. Learn more about variable conditions.

How to use the module

In another git project we’ll have this setup.

Terragrunt ece iam role setup
Terragrunt setup
Each environment has a inputs.yml, terragrunt.hcl and vars.tf

The main.tf that will call the Terraform module. Be sure to update your git code. Now we create many, many EC2 IAM roles with the same naming convention! By default the source variable will use the master branch!

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

terraform {
  backend "s3" {}
}

module "web_server" {
  source = "git@gitlab.com:cloudly-engineer/aws/tf-modules/iam-ec2-role.git"

  ec2_iam_role_name  = "svc-web-server-role"
  policy_description = "IAM ec2 instance profile for the web servers."
  assume_role_policy = file("assumption-policy.json")
  policy             = data.aws_iam_policy_document.web_server.json
}

Here’s the sample “web server” IAM policy that’s attached to this role.

data "aws_iam_policy_document" "web_server" {
  statement {
    sid    = "GetS3Stuff"
    effect = "Allow"
    actions = [
      "s3:List*",
      "s3:Get*"
    ]
    resources = ["*"]
  }
}

Then do a terragrunt init, plan and apply!

Here’s what the error message looks like when the validation fails during the plan.

Terraform ec2 iam role module
Terraform validation check

The calling terraform/terragrunt code: https://github.com/masterwali/ec2-iam-role

Subscribe to get notified when more AWS and Terraform code is published!

TOP 13 CLOUD ENGINEER POSITION INTERVIEW QUESTIONS AND ANSWERSBe prepared for you interview!

AWS IAM groups and policies – Terraform

Let’s create a module to create and manage AWS IAM groups and policies with Terraform! In summary a Terraform module is one or more Terraform resources bundled together to be used a single Terraform resource. You can learn more about Terraform module’s here. If you have no idea what Terraform or Terragrunt then start here. Let’s code!

The Terraform module structure

aws-iam-group/
main.tf
vars.tf
README.md

The main file

The main.tf contains all the resources required to create AWS IAM groups and their policies. Notice this one uses three resources!

resource "aws_iam_group" "this" {
name = var.iam_group_name
}

resource "aws_iam_policy" "this" {
name = var.policy_name
description = var.policy_description
policy = var.policy
}

resource "aws_iam_group_policy_attachment" "this" {
group = aws_iam_group.this.name
policy_arn = aws_iam_policy.this.arn
}

The vars file

variable iam_group_name {}

variable policy_name {}

variable policy_description {}

variable policy {}

Here’s the AWS IAM groups and policies Terraform module on GitHub: https://github.com/masterwali/tf-module-aws-iam-group

Create AWS IAM groups and policies with Terraform

Let’s use the Terraform module to create one or many IAM groups and their policies! You can deploy with Terraform but I like to use Terragrunt.

The Terragrunt/Terraform structure

This is one way to create your deployment structure. You can name this directory anything you like.

aws-iam
├── LICENSE
├── README.md
├── cloud-engineers-policy.tf
├── database-admins-policy.tf
├── developers-policy.tf
├── network-admins-policies.tf
├── groups.tf
├── main.tf
├── dev
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── prod
│   └── terragrunt.hcl
│   ├── inputs.yml
│   └── vars.tf
├── qa
│   └── terragrunt.hcl
│   ├── inputs.yml
│   └── vars.tf
├── sec
│   └── terragrunt.hcl
│   ├── inputs.yml
│   └── vars.tf
└── terragrunt.hcl

In this example we will be creating groups and policies for developers, cloud engineers, database admins, and network admins all with one and same Terraform module! Using a module will ensure consistency and governance for our AWS resources.

The Terraform files

I like to keep the main file clean. I mainly use it for data calls and local variables.

main.tf

locals {
  account_id = data.aws_caller_identity.current.account_id
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

terraform {
  backend "s3" {}
}

data "aws_caller_identity" "current" {}

groups.tf

# CLOUD ENGINEERS
module "cloud_engineers" {
  source = "git@github.com:masterwali/tf-module-aws-iam-group.git"

  iam_group_name     = "cloud-engineers"
  policy_name        = "cloud-engineers"
  policy_description = "Cloud Engineers policy"
  policy             = data.aws_iam_policy_document.cloud_engineers.json
}
# ......... etc. 

# NETWORK ADMINS
# Modules creates group and policy and attaches policy to group. 
module "network_admins" {
  source = "git@github.com:masterwali/tf-module-aws-iam-group.git"

  iam_group_name     = "network-admins"
  policy_name        = "network-admins"
  policy_description = "Network Admins policy"
  policy             = data.aws_iam_policy_document.network_admins_main.json
}
# Create the network admins misc policy 
resource "aws_iam_policy" "network_admins_misc" {
  name        = "network-admins-misc"
  description = "Network Admins"
  policy      = data.aws_iam_policy_document.network_admins_misc.json
}
# Attach the above policy to the network admins group. 
resource "aws_iam_group_policy_attachment" "network_admins_misc" {
  group      = "network-admins"
  policy_arn = aws_iam_policy.network_admins_misc.arn

  depends_on = [aws_iam_policy.network_admins_misc]
}

The “cloud engineers” IAM policy file

data "aws_iam_policy_document" "cloud_engineers" {
  statement {
    sid    = "FullAccess"
    effect = "Allow"
    actions = [
      "iam:*",
      "kms:*",
      "s3:*"
    ]
    resources = ["*"]
  }
}

To see the developers policy and more take a look at the complete code at https://github.com/masterwali/aws-iam

Now let’s apply the code by navigating to the desired environment directory. Then initiate with ‘terragrunt init’ and apply with ‘terragrunt apply’. More on Terragrunt initiation, plans and apply.

Subscribe to get notified when more AWS and Terraform code is published!

TOP 13 CLOUD ENGINEER POSITION INTERVIEW QUESTIONS AND ANSWERSBe prepared for you interview!

IAM

AWS IAM console groups
AWS IAM console: Groups
AWS IAM console group policies
AWS IAM Group policies

AWS KMS Customer Managed CMK with Terraform

AWS Key Management Service (KMS) is a AWS managed service that allows us to create, manage, and delete customer master keys (CMK) or simply use AWS customer managed keys for encrypting our data in the AWS cloud. From my experience with passing both the AWS Certified Security – Speciality and the AWS Certified Solutions Architect – Professional, AWS Key Management Service is a must to learn inside and out. If you can understand all of the KMS features you’ll have a better chance of passing those two exams. Let’s learn how to create and manage AWS KMS customer managed CMK with Terraform! I will also be using Terragrunt so we can follow the DRY (Don’t repeat yourself) model.

This is part two of AWS Key management service (KMS) – Part 1.

KMS key types
KMS Key types

Key properties

  • Key Id
  • Creation date
  • Description
  • Key state

Customer managed keys (CMK)

Given the name you can guess the differences right away, right? For starter, you as the customer will have to explicitly create the key with AWS CLI, AWS API, or Terraform or any other available methods. You can set the CMK policies to allow services or users to use the key. The key policy can pass the permission responsibilities to be managed by IAM policies instead of the KMS CMK key policies. A CMK can be set to enable or disable at any time to allow usage or stop the usage of the key.

Key Alias are a great way to tag and identify Customer managed CMKs. This will help the user quickly find the desired key in the AWS console or AWS CLI query results. Since rotation is possible and it can be enabled to automatically rotate on yearly basis. Key aliases helps with that by re-assigning the alias to the new key.

You can definitely delete the key, but you must be damn sure that no one or any data or services is using that key! Once the key is gone you cannot unencrypt the data that was encrypted with the deleted key! So in order to semi control this chaos AWS has enforced a scheduling delete functionality rather than immediate delete. The customer can delete any Customer managed key by scheduling a delete; the minimum number of days to schedule a delete is 7 days. Best practice is to set this for a month or more.

NOTE: Before proceeding

  • If you haven’t installed Terraform or Terragrunt then you must follow this guide
  • For Terraform deep dive explanation follow this guide

Pricing

Each customer master key (CMK) that you create in AWS Key Management Service (KMS) costs $1/month until you delete it. For the N. VA region:

  • $0.03 per 10,000 requests
  • $0.03 per 10,000 requests involving RSA 2048 keys
  • $0.10 per 10,000 ECC GenerateDataKeyPair requests
  • $0.15 per 10,000 asymmetric requests except RSA 2048
  • $12.00 per 10,000 RSA GenerateDataKeyPair requests

Learn more at https://aws.amazon.com/kms/pricing/

Create and edit KMS Keys

Terraform module: main.tf

# Creates/manages KMS CMK
resource "aws_kms_key" "this" {
  description              = var.description
  customer_master_key_spec = var.key_spec
  is_enabled               = var.enabled
  enable_key_rotation      = var.rotation_enabled
  tags                     = var.tags
  policy                   = var.policy
  deletion_window_in_days  = 30
}

# Add an alias to the key
resource "aws_kms_alias" "this" {
  name          = "alias/${var.alias}"
  target_key_id = aws_kms_key.this.key_id
}

Terraform module: vars.tf

variable description {}

# Options available
# SYMMETRIC_DEFAULT, RSA_2048, RSA_3072,
# RSA_4096, ECC_NIST_P256, ECC_NIST_P384,
# ECC_NIST_P521, or ECC_SECG_P256K1
variable key_spec {
  default = "SYMMETRIC_DEFAULT"
}

variable enabled {
  default = true
}

variable rotation_enabled {
  default = true
}

variable tags {}

variable alias {}

variable policy {}

Terragrunt KMS directory structure

terragrunt directory structure

Terraform plan

main.tf 

locals {
  admin_username = "waleed"
  account_id     = data.aws_caller_identity.current.account_id
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

terraform {
  backend "s3" {}
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "ssm_key" {
  statement {
    sid       = "Enable IAM User Permissions"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.account_id}:root"]
    }
  }

  statement {
    sid       = "Allow access for Key Administrators"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

  statement {
    sid    = "Allow use of the key"
    effect = "Allow"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

  statement {
    sid    = "Allow attachment of persistent resources"
    effect = "Allow"
    actions = [
      "kms:CreateGrant",
      "kms:ListGrants",
      "kms:RevokeGrant"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }

    condition {
      test     = "Bool"
      variable = "kms:GrantIsForAWSResource"
      values   = ["true"]
    }
  }
}

ssm-key.tf : It’s best practice to have each type of key in its own terraform file. In this example this key is for SSM. You can create keys for any services possible in your region!

Notice the source is only pulling the ‘dev’ branch. Once tested and verified the source branch would change in each environment.

module "ssm" {
  source = "git@github.com:masterwali/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  policy      = data.aws_iam_policy_document.ssm_key.json

  tags = {
    Name  = "ssm"
    Owner = "wsarwari"
  }
}
# module.ssm.aws_kms_alias.this will be created
  + resource "aws_kms_alias" "this" {
      + arn            = (known after apply)
      + id             = (known after apply)
      + name           = "alias/ssm"
      + target_key_arn = (known after apply)
      + target_key_id  = (known after apply)
    }

# module.ssm.aws_kms_key.this will be created
  + resource "aws_kms_key" "this" {
      + arn                      = (known after apply)
      + customer_master_key_spec = "SYMMETRIC_DEFAULT"
      + deletion_window_in_days  = 30
      + description              = "KMS key for System Manager"
      + enable_key_rotation      = true
      + id                       = (known after apply)
      + is_enabled               = true
      + key_id                   = (known after apply)
      + key_usage                = "ENCRYPT_DECRYPT"
      + policy                   = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "kms:*"
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = "arn:aws:iam::111122223334:root"
                        }
                      + Resource  = "*"
                      + Sid       = "Enable IAM User Permissions"
                    },
                  + {
                      + Action    = "kms:*"
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = [
                              + "arn:aws:iam::111122223334:user/waleed",
                              + "arn:aws:iam::111122223334:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor",
                              + "arn:aws:iam::111122223334:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
                            ]
                        }
                      + Resource  = "*"
                      + Sid       = "Allow access for Key Administrators"
                    },
                  + {
                      + Action    = [
                          + "kms:ReEncrypt*",
                          + "kms:GenerateDataKey*",
                          + "kms:Encrypt",
                          + "kms:DescribeKey",
                          + "kms:Decrypt",
                        ]
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = [
                              + "arn:aws:iam::111122223334:user/waleed",
                              + "arn:aws:iam::111122223334:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor",
                              + "arn:aws:iam::111122223334:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
                            ]
                        }
                      + Resource  = "*"
                      + Sid       = "Allow use of the key"
                    },
                  + {
                      + Action    = [
                          + "kms:RevokeGrant",
                          + "kms:ListGrants",
                          + "kms:CreateGrant",
                        ]
                      + Condition = {
                          + Bool = {
                              + kms:GrantIsForAWSResource = "true"
                            }
                        }
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = [
                              + "arn:aws:iam::111122223334:user/waleed",
                              + "arn:aws:iam::111122223334:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor",
                              + "arn:aws:iam::111122223334:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
                            ]
                        }
                      + Resource  = "*"
                      + Sid       = "Allow attachment of persistent resources"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + tags                     = {
          + "Name"  = "ssm"
          + "Owner" = "wsarwari"
        }
    }

Plan: 2 to add, 0 to change, 0 to destroy.

After applying the code you should see the key in Key Management Service.

kms key created

Enable and disable CMKs

Simple, just update the “enabled” variable to false! Plan and apply.

module "ssm" {
  source = "git@github.com:masterwali/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  ...
  enabled     = false

  ....
}

Automatic rotation

Another easy change with Terraform! By default I have set it to rotate the key in the custom module I created above. Now if you want to disable automatic rotation then all you have to do is set the variable rotation_enabled to false.

module "ssm" {
  source = "git@github.com:masterwali/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  ...
  rotation_enabled     = false

  ....
}

Key Alias

Keys are identified by randomly generated GUIDs. It’s best to create in alias for each key so it can be easily identified by everyone. Aliases are also easy to create and update in Terraform. In the custom module above I have added the ability to create the key alias right after the key is provisioned. The key alias is also passed as a variable as shown below.

module "ssm" {
  source = "git@github.com:masterwali/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  ...
}
Type of CMKCan view CMK metadataCan manage CMKUsed only for my AWS accountAutomatic rotation
Customer managed CMKYesYesYesOptional. Every 365 days (1 year).
AWS managed CMKYesNoYesRequired. Every 1095 days (3 years).
AWS owned CMKNoNoNoVaries
This chart is from the AWS documentation

Key Policy

Each key must have a policy that allows or denies the key to be used by users or services. In the design that I have created allows you to give each key a policy. Let’s breakdown each statement of the key. 

Enable IAM User Permissions

statement {
    sid       = "Enable IAM User Permissions"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.account_id}:root"]
    }
  }

This statement allows all users and services in this account to execute all KMS actions on this key. It’s best practice to follow up on this open statement by creating an IAM policy to restrict the key usage. The identifiers in the principles section can be other AWS accounts.

Allow access for Key Administrators

statement {
    sid       = "Allow access for Key Administrators"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

This statement allows selected individual users and IAM roles to fully manage this key. This statement must exist for every single key. Without this statement you will absolutely lose access and management of this key.

Allow use of the key

statement {
    sid    = "Allow use of the key"
    effect = "Allow"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

This statement is specifically for the usage of the key. If you do not provide the statement Enable IAM User Permissions then you must include this statement; Otherwise this key will not be usable by anyone besides the key administrators.

Allow attachment of persistent resources

statement {
    sid    = "Allow attachment of persistent resources"
    effect = "Allow"
    actions = [
      "kms:CreateGrant",
      "kms:ListGrants",
      "kms:RevokeGrant"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }

    condition {
      test     = "Bool"
      variable = "kms:GrantIsForAWSResource"
      values   = ["true"]
    }
  }

This statement allows listing, creating, and revoking grants for the key by the principals identified in the statement.

Deleting the key

Do not delete production keys! You must be 100% sure no service or data has ever used the key you are able to delete. Once a key is deleted it’s not possible to restore! You cannot decrypt data that was encrypted with the key that you delete. In Terraform you can delete the key by deleting the code or commenting out the code, then Terraform plan and apply. Or if you want to keep the code and just want to remove the resource from AWS then execute Terragrunt destroy. This destroy will also release the alias so it can be reused. 

kms key pending deletion
Key deletion


Status is now pending deletion. It will delete this key 30 days from the day of the destroy. 

Stop KMS key deletion

If you decide to not delete it then on the AWS console you can select the key then click on Key actions. Finally select Cancel key deletion. This option is only available before the deletion date. 

Cancel key deletion
Key actions

Read more about deleting KMS customer managed keys.

KMS Multi-Region Keys

It’s not supported in Terraform, yet. Look at the GitHub open issue for more updates.

As always if you see any errors, mistakes, have suggestions or questions please comment below. Don’t forget to like, share, and subscribe below for more! 

TOP 13 CLOUD ENGINEER POSITION INTERVIEW QUESTIONS AND ANSWERSBe prepared for you interview!

AWS managed CMKs

I talked about AWS managed customer master keys in my previous post here.

Complete Code

Get the complete module from https://github.com/masterwali/terraform-aws-kms-module.