aerial photography of container van lot

Amazon EKS IAM roles and policies with Terraform

Before you use or approve Amazon EKS in production you must have a security checklist. Everyone’s list is different but everyone’s listing must-have items to ensure authentication and authorization are at a minimum; in other words least privilege. Let’s explore Amazon EKS IAM roles and policies written in Terraform!

What are some suggestions to improve your Amazon EKS IAM design?

  • Start with the managed roles and policies, then review AWS CloudTrail logs to see what events or API calls actually occurs
  • Start creating your own managed IAM policies and IAM roles; one at a time
  • If possible require MFA for ensuring the user is who he/she says they are
  • Once you have validated your custom roles and policies then add conditions to your IAM policies; again one condition at a time

Before I show any code it’s important to know basic AWS IAM terminology. Let’s add Identity-based policies Resource-based policies to your vocabulary. Resource-based policies are about “what”. The Identity-based policies are about “who”. Then there’s the “action” that identity or resource can use based on the “effect”. There are dozens of EKS IAM actions available, see the Actions defined by Amazon Elastic Kubernetes Service page.

EKS Cluster Authentication

Just a reminder on how EKS cluster authentication works. Bottomline is that all permissions are essentially managed by Kubernetes Role Based Access Control (RBAC). However AWS IAM is involved.

Prerequisites

Prior to creating your EKS cluster be sure to identify which IAM role or user will be the “primary” identity to create the EKS cluster. The identity that first creates the EKS cluster will be automatically added to K8s system:masters group. Which is great, however you will not be able to visually see that identity in the ConfigMap!

Amazon EKS IAM Resource

The Amazon EKS cluster resource has the following ARN:

arn:${Partition}:eks:${Region}:${Account}:cluster/${ClusterName}

Note: Use a wildcard (“*”) if you really need to specify all clusters. Also you cannot use this resource filter pattern for certain events such as creating a new cluster. How would anyone know that? Well, take a look again at the Actions defined by Amazon Elastic Kubernetes Service page. You’ll notice in the CreateCluster action the Resource box is empty.

eks actions table
I drew a oval shape in the Resource type cell; notice cluster* isn’t listed

IAM policies based on the cluster name

Read/View all clusters: Terraform IAM example

Initially the user or role may not have any EKS permissions, so if you attempt to list all the clusters it would return an error like this.

aws eks list-clusters

An error occurred (AccessDeniedException) when calling the ListClusters operation: User: arn:aws:iam::1234567890:user/read-only-all is not authorized to perform: eks:ListClusters on resource: arn:aws:eks:us-east-2:1234567890:cluster/*

The “eks:ListClusters” Resource must not be restricted so therefore it’s in a different statement than the other actions that do allow the EKS resource ARN. See example below.

{
    "Statement": [
        {
            "Action": [
                "eks:ListUpdates",
                "eks:ListTagsForResource",
                "eks:ListNodegroups",
                "eks:ListIdentityProviderConfigs",
                "eks:ListFargateProfiles",
                "eks:ListAddons",
                "eks:DescribeCluster"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:eks:us-east-2:1234567890:cluster/*",
            "Sid": "ReadAllEKSclusters"
        },
        {
            "Action": "eks:ListClusters",
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "ListAllEKSclusters"
        }
    ],
    "Version": "2012-10-17"
}

Rerun list cluster action with this newly attached policy.

aws eks list-clusters

# My results
{
    "clusters": [
        "aws001-preprod-dev-eks"
    ]
}

The “Describe cluster” action and few others will require the exact name of the cluster you want to describe and because the ARN is set to all (asterisk in the ARN) that means this policy allows describing all clusters.

aws eks describe-cluster --name aws001-preprod-dev-eks

# result

{
    "cluster": {
        "name": "aws001-preprod-dev-eks",
        "arn": "arn:aws:eks:us-east-2:1234567890:cluster/aws001-preprod-dev-eks",
        "createdAt": "2022-03-22T06:37:57.278000-04:00",
        "version": "1.21",
        "endpoint": "https://abc123456789.gr7.us-east-2.eks.amazonaws.com",
        "roleArn": "arn:aws:iam::1234567890:role/aws001-preprod-dev-eks-cluster-role",
        "resourcesVpcConfig": {
            "subnetIds": [
...
}

Read only specific cluster name

This time simply replace the last asterisk in the resource name with the cluster name, in this case I’m using Terraform so I’m passing the name via a local variable.

actions = [
      "eks:AccessKubernetesApi",
      "eks:DescribeCluster",
      "eks:ListAddons",
      "eks:ListFargateProfiles",
      "eks:ListIdentityProviderConfigs",
      "eks:ListNodegroups",
      "eks:ListTagsForResource",
      "eks:ListUpdates"
    ]

    resources = [
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:cluster/${local.cluster_name}",
    ]
  }
aws eks list-clusters --profile iam

An error occurred (AccessDeniedException) when calling the ListClusters operation: User: arn:aws:iam::1234567890:user/Waleed is not authorized to perform: eks:ListClusters on resource: arn:aws:eks:us-east-2:1234567890:cluster/*

Now listing all EKS clusters is not allowed.

Modify all clusters

This role should only be assigned to a few EKS/Kubernetes administrators whose highly experienced or certified to manage Kubernetes clusters in the AWS Cloud. Larger organizations may have a single administrator per cluster while others have one administrator to manage multiple.

statement {
    sid = "ModifyAllEKSclusters"

    actions = [
      "eks:AccessKubernetesApi",
      "eks:Associate*",
      "eks:Create*",
      "eks:Delete*",
      "eks:DeregisterCluster",
      "eks:Describe*",
      "eks:List*",
      "eks:RegisterCluster",
      "eks:TagResource",
      "eks:UntagResource",
      "eks:Update*"
    ]

    resources = [
      "*"
    ]
  }

  statement {
    sid = "Deny"

    # No major updates allowed in this example
    actions = [
      "eks:CreateCluster",
      "eks:DeleteCluster"
    ]

    resources = [
      "*"
    ]
  }

Modify a specific cluster

statement {
    sid = "ModifyaEKScluster"

    actions = [
      "eks:AccessKubernetesApi",
      "eks:Associate*",
      "eks:Create*",
      "eks:Delete*",
      "eks:DeregisterCluster",
      "eks:DescribeCluster",
      "eks:DescribeUpdate",
      "eks:List*",
      "eks:TagResource",
      "eks:UntagResource",
      "eks:Update*"
    ]

    resources = [
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:cluster/${local.cluster_name}",
    ]
  }

  statement {
    sid = "ModifyaEKSclusterResource"

    actions = [
      "eks:DescribeNodegroup",
      "eks:DescribeFargateProfile",
      "eks:DescribeIdentityProviderConfig",
      "eks:DescribeAddon"
    ]

    resources = [
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:cluster/${local.cluster_name}",
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:nodegroup/${local.cluster_name}/*/*",
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:addon/${local.cluster_name}/*/*",
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:identityproviderconfig/${local.cluster_name}/*/*/*",
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:fargateprofile/${local.cluster_name}/*/*"
    ]
  }

  # These actions don't use the 'cluster' resource type
  statement {
    sid = "Modify"

    actions = [
      "eks:RegisterCluster",
      "eks:DisassociateIdentityProviderConfig"
    ]

    resources = [
      "*",
    ]
  }

  statement {
    sid = "Deny"

    # No major updates allowed in this example
    actions = [
      "eks:CreateCluster",
      "eks:DeleteCluster"
    ]

    resources = [
      "*"
    ]
  }

For a complete list of Amazon EKS actions see the original documentation.

Amazon EKS resources ARN

Use these various ARN patterns to constrain permissions for certain teams.

TypeARN pattern
clusterarn:aws:eks:${Region}:${Account}:cluster/${ClusterName}
nodegrouparn:aws:eks:${Region}:${Account}:nodegroup/${ClusterName}/${NodegroupName}/${UUID}
addonarn:aws:eks:${Region}:${Account}:addon/${ClusterName}/${AddonName}/${UUID}
fargateprofilearn:aws:eks:${Region}:${Account}:fargateprofile/${ClusterName}/${FargateProfileName}/${UUID}
identityproviderconfigarn:aws:eks:${Region}:${Account}:identityproviderconfig/${ClusterName}/${IdentityProviderType}/${IdentityProviderConfigName}/${UUID}
EKS resource ARN patterns

EKS IAM Condition Keys

In addition to filtering EKS permissions with the resource name(s), you can further filter down by using keys.

KeyTypeDescription
aws:RequestTag/${TagKey}stringUse this to ensure tags are present in the “request”/create call. Basically before creating new resources.
aws:ResourceTag/${TagKey}stringUse this to find resources that have the tags already. It’s for existing resources with tags.
aws:TagKeysArrayOfStringSimilar to aws:RequestTag/${TagKey} but it’s a list of tag keys, instead of just one.
eks:clientIdstringThe “clientId” value in the associateIdentityProviderConfig call
eks:issuerUrlstringThe “issuerUrl” value in the associateIdentityProviderConfig call
{
            "Sid": "TagEKSWithTheseTags",
            "Effect": "Allow",
            "Action": [
                "eks:CreateCluster",
                "eks:TagResource"
            ],
            "Resource": "*",
            "Condition": {
                "StringEqualsIfExists": {
                    "aws:RequestTag/environment": [
                        "development",
                        "sandbox"
                    ],
                    "aws:RequestTag/jobfunction": "DevOps"
                },
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "environment",
                        "jobfunction"
                    ]
                }
            }
        }

Other IAM policies

EKS Console Admin policy: This permission will allow full read and write to the Configuration tab on the EKS console. The Resources and the Overview tabs requires Kubernetes RBAC permissions.

data "aws_iam_policy_document" "console_admin" {

  statement {
    sid = "admin"

    actions = [
      "eks:*"
    ]

    resources = [
      "*"
    ]
  }

  statement {
    sid = "console"

    effect = "Allow"
    actions = [
      "iam:PassRole"
    ]

    resources = [
      "*"
    ]

    condition {
      test     = "StringEquals"
      variable = "iam:PassedToService"
      values   = ["eks.amazonaws.com"]
    }
  }
}

Update a Kubernetes cluster version: This policy will only allow to update just the Kubernetes Cluster version. In the Terraform example below, updating the cluster version is only allowed when the EKS cluster has a tag of “environment” and a value of “sandbox”; the EKS cluster that’s in the current account and region of course.

data "aws_iam_policy_document" "cluster_version" {

  statement {
    sid = "admin"

    actions = [
      "eks:UpdateClusterVersion"
    ]

    resources = [
      "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:cluster/*"
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:ResourceTag/environment"
      values   = ["sandbox"]
    }
  }
}

EKS Service-linked roles

The following table shows all the service-linked roles that are automatically created when you create the cluster and its components.

ComponentRole nameService URLIAM PolicyIAM Policy ARN
EKS Cluster roleAWSServiceRoleForAmazonEKSeks.amazonaws.comAmazonEKSServiceRolePolicyarn:aws:iam::aws:policy/aws-service-role/AmazonEKSServiceRolePolicy
EKS node groupsAWSServiceRoleForAmazonEKSNodegroupeks-nodegroup.amazonaws.comAWSServiceRoleForAmazonEKSNodegrouparn:aws:iam::aws:policy/aws-service-role/AWSServiceRoleForAmazonEKSNodegroup
EKS Fargate profilesAWSServiceRoleForAmazonEKSForFargateeks-fargate.amazonaws.comAmazonEKSForFargateServiceRolePolicyarn:aws:iam::aws:policy/aws-service-role/AmazonEKSForFargateServiceRolePolicy
EKS ConnectorAWSServiceRoleForAmazonEKSConnectoreks-connector.amazonaws.comAmazonEKSConnectorServiceRolePolicyarn:aws:iam::aws:policy/aws-service-role/AmazonEKSConnectorServiceRolePolicy
EKS service-linked roles

EKS IAM Roles

Amazon EKS Cluster Role

The AmazonEKSClusterPolicy is required to be attached to your EKS Cluster role before you create your cluster.

resource "aws_iam_role" "eks_cluster_role" {
  name = "eks-cluster-role"
  tags = local.required_tags

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "eks_cluster_role" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.eks_cluster_role.name
}

Amazon EKS node IAM role

Each node (EC2 instance for example) uses IAM roles to make AWS API calls. Before you can create and register nodes to the EKS cluster they must have an IAM role with the following policies attached AmazonEKSWorkerNodePolicy and AmazonEC2ContainerRegistryReadOnly. We’ll add the AmazonEKS_CNI_Policy later.

locals {
  eks_node_policies = ["AmazonEC2ContainerRegistryReadOnly", "AmazonEKSWorkerNodePolicy"]
}

resource "aws_iam_role" "eks_node_role" {
  name = "eks-node-role"
  tags = local.required_tags

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "eks_node_role" {
  for_each = toset(local.eks_node_policies)

  policy_arn = "arn:aws:iam::aws:policy/${each.value}"
  role       = aws_iam_role.eks_node_role.name
}

Amazon EKS CNI Policy

You can attach the AmazonEKS_CNI_Policy to the node role above. However, you must follow least privilege model to protect your nodes as much as possible. We’ll need to create the IAM roles for Kubernetes service account or IRSA. There are multiple steps to create it, but we’re in luck because there’s an IRSA Terraform module built by the AWS Open Source community! if you’re using IPV4. There’s an another IRSA Terraform module maintained by the community.

locals {
  addon_context = {
    aws_caller_identity_account_id = data.aws_caller_identity.current.account_id
    aws_caller_identity_arn        = data.aws_caller_identity.current.arn
    aws_eks_cluster_endpoint       = data.aws_eks_cluster.eks_cluster.endpoint
    aws_partition_id               = data.aws_partition.current.partition
    aws_region_name                = data.aws_region.current.name
    eks_oidc_issuer_url            = local.eks_oidc_issuer_url
    eks_cluster_id                 = aws_eks_cluster.this.id
    eks_oidc_provider_arn          = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${local.eks_oidc_issuer_url}"
    tags                           = local.required_tags
  }
}

module "vpc_cni_irsa" {
  source = "git@github.com:aws-ia/terraform-aws-eks-blueprints.git//modules/irsa?ref=v4.2.1"

  kubernetes_namespace              = "kube-system"
  kubernetes_service_account        = "aws-node"
  create_kubernetes_namespace       = false
  create_kubernetes_service_account = false
  irsa_iam_policies                 = ["arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"]
  addon_context                     = local.addon_context
}
ComponentService URLIAM PolicyIAM Policy ARN
EKS Cluster roleeks.amazonaws.comAmazonEKSClusterPolicyarn:aws:iam::aws:policy/AmazonEKSClusterPolicy
EKS node roleec2.amazonaws.comAmazonEKSWorkerNodePolicy

AmazonEC2ContainerRegistryReadOnly
arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
EKS Fargate profileseks-fargate.amazonaws.comAmazonEKSForFargateServiceRolePolicyarn:aws:iam::aws:policy/aws-service-role/AmazonEKSForFargateServiceRolePolicy
EKS Connectoreks-connector.amazonaws.comAmazonEKSConnectorServiceRolePolicyarn:aws:iam::aws:policy/aws-service-role/AmazonEKSConnectorServiceRolePolicy
EKS service-linked roles

EKS Fargate profiles

We cannot use the node IAM role for the EKS Farget profiles, we have to create a pod execution IAM role. Kubernetes Role based access control (RBAC) will use this pod execution IAM role for authorization to AWS services, for example to pull an image from Amazon Elastic Container Registry (ECR). The code below creates the Amazon EKS pod execution IAM role with the required policy and trust settings.

resource "aws_iam_role" "eks_pod_exe_role" {
  name = "eks-fargate-pod-execution-role"
  tags = local.required_tags

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks-fargate-pods.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
         "ArnLike": {
            "aws:SourceArn": "arn:aws:eks:${var.region}:${data.aws_caller_identity.current.account_id}:fargateprofile/${local.cluster_name}/*"
         }
      }
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "eks_pod_exe_role" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy"
  role       = aws_iam_role.eks_pod_exe_role.name
}

EKS Connector

This is a read or view only feature to see your Kubernetes clusters from your other cloud providers or on-premises or running on your EC2s. This also needs a different IAM role.

# #########################################
# EKS Connector
# #########################################

data "aws_iam_policy_document" "connector" {

  statement {
    sid = "SsmControlChannel"

    actions = [
      "ssmmessages:CreateControlChannel"
    ]

    resources = [
      "arn:aws:eks:*:*:cluster/*"
    ]
  }

  statement {
    sid = "ssmDataplaneOperations"

    actions = [
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenDataChannel",
      "ssmmessages:OpenControlChannel"
    ]

    resources = ["*"]
  }
}

resource "aws_iam_policy" "connector" {
  name   = "eks-connector"
  path   = "/"
  policy = data.aws_iam_policy_document.connector.json

  tags = {
    "Name" = "eks-connector"
  }
}
resource "aws_iam_role" "eks_connector_role" {
  name = "eks-connector-role"
  tags = local.required_tags

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ssm.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "eks_connector_role" {
  policy_arn = aws_iam_policy.connector.arn
  role       = aws_iam_role.eks_connector_role.name
}

To learn more about AWS managed policies, see https://docs.aws.amazon.com/eks/latest/userguide/security-iam-awsmanpol.html

To see all the code in Terraform, visit the GitHub repo.

If you don’t know how Terraform works, then jump to the Intro to Terraform guide first.

temple illustration

Get started with EC2 Image Builder in Terraform

I can safely assume a lot of engineer’s know of HashCorp’s Packer utility already. Packer is simply an automated virtual machine image template maker, it can create images for all the major cloud providers. It can build Amazon Machine Images (AMI) in AWS or Azure’s Virtual Machine Image. Not too long ago, AWS released their version of automated image builder, called EC2 Image Builder! On this get started with EC2 Image Builder in Terraform I will be showing you how to quickly put together your Terraform code to create an a simple AMI.

Tools

You will need Terraform and if you are deploying this exact code then you’ll need Terragrunt too. Here’s a Setup infrastructure as code environment instructions and if you’re new to Terragrunt then checkout Intro to Terragrunt and Terraform post. I also suggest installing pre-commit too.

EC2 Image Builder Cost

This service doesn’t cost anything but the various resources created by this service could cost you. For example, you have to select your EC2 instance type to run for the during of the AMI creation. It will terminate the EC2 instance once the job is completed. Also, as you know AMI’s have EC2 snapshots, hint cost of storage. You get the point, let’s continue!

Permissions

You will need full permissions on the EC2 Image Builder service.

"imagebuilder:*"

Now the EC2 Image Builder IAM role will need at least the block below. Here I’m creating the policy and role with Terraform. You may need more less, adjust accordingly!

data "aws_iam_policy_document" "image_builder" {
  statement {
    effect = "Allow"
    actions = [
      "ssm:DescribeAssociation",
      "ssm:GetDeployablePatchSnapshotForInstance",
      "ssm:GetDocument",
      "ssm:DescribeDocument",
      "ssm:GetManifest",
      "ssm:GetParameter",
      "ssm:GetParameters",
      "ssm:ListAssociations",
      "ssm:ListInstanceAssociations",
      "ssm:PutInventory",
      "ssm:PutComplianceItems",
      "ssm:PutConfigurePackageResult",
      "ssm:UpdateAssociationStatus",
      "ssm:UpdateInstanceAssociationStatus",
      "ssm:UpdateInstanceInformation",
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel",
      "ec2messages:AcknowledgeMessage",
      "ec2messages:DeleteMessage",
      "ec2messages:FailMessage",
      "ec2messages:GetEndpoint",
      "ec2messages:GetMessages",
      "ec2messages:SendReply",
      "imagebuilder:GetComponent",

    ]
    resources = ["*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "s3:List",
      "s3:GetObject"
    ]
    resources = ["*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "s3:PutObject"
    ]
    resources = ["arn:aws:s3:::${var.aws_s3_log_bucket}/image-builder/*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "logs:CreateLogStream",
      "logs:CreateLogGroup",
      "logs:PutLogEvents"
    ]
    resources = ["arn:aws:logs:*:*:log-group:/aws/imagebuilder/*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "kms:Decrypt"
    ]
    resources = ["*"]
    condition {
      test     = "ForAnyValue:StringEquals"
      variable = "kms:EncryptionContextKeys"

      values = [
        "aws:imagebuilder:arn"
      ]
    }

    condition {
      test     = "ForAnyValue:StringEquals"
      variable = "aws:CalledVia"

      values = [
        "imagebuilder.amazonaws.com"
      ]
    }
  }
}

EC2 Image Builder features

It has automated pipelines! You can set it to build your AMI on a schedule or on-demand. It has a package installer and security components. You can share the AMI across multiple AWS accounts. You can read more about its features at EC2 Image Builder Features.

EC2 Image Builder Pipeline

I’m setting this pipeline to run every Tuesday morning at 8 am. This pipeline will trigger on that schedule and if there are any updates available. I have enabled testing of the image and setting a timeout of 60 minutes.

resource "aws_imagebuilder_image_pipeline" "this" {
  image_recipe_arn                 = aws_imagebuilder_image_recipe.this.arn
  infrastructure_configuration_arn = aws_imagebuilder_infrastructure_configuration.this.arn
  name                             = "amazon-linux-baseline"
  status                           = "ENABLED"
  description                      = "Creates an Amazon Linux 2 image."

  schedule {
    schedule_expression = "cron(0 8 ? * tue)"
    # This cron expressions states every Tuesday at 8 AM.
    pipeline_execution_start_condition = "EXPRESSION_MATCH_AND_DEPENDENCY_UPDATES_AVAILABLE"
  }

  # Test the image after build
  image_tests_configuration {
    image_tests_enabled = true
    timeout_minutes     = 60
  }

  tags = {
    "Name" = "${var.ami_name_tag}-pipeline"
  }
}

EC2 Image Builder Recipe

In the Image Recipe, I’m defining the AMI’s volume size and type, and the components. For this simple example, I’m only installing the CloudWatch agent to the AMI.

resource "aws_imagebuilder_image" "this" {
  distribution_configuration_arn   = aws_imagebuilder_distribution_configuration.this.arn
  image_recipe_arn                 = aws_imagebuilder_image_recipe.this.arn
  infrastructure_configuration_arn = aws_imagebuilder_infrastructure_configuration.this.arn

  depends_on = [
    data.aws_iam_policy_document.image_builder
  ]
}

resource "aws_imagebuilder_image_recipe" "this" {
  block_device_mapping {
    device_name = "/dev/xvdb"

    ebs {
      delete_on_termination = true
      volume_size           = var.ebs_root_vol_size
      volume_type           = "gp3"
    }
  }

  component {
    component_arn = aws_imagebuilder_component.cw_agent.arn
  }

  name         = "amazon-linux-recipe"
  parent_image = "arn:${data.aws_partition.current.partition}:imagebuilder:${data.aws_region.current.name}:aws:image/amazon-linux-2-x86/x.x.x"
  version      = var.image_receipe_version
}

resource "aws_s3_bucket_object" "cw_agent_upload" {
  bucket = var.aws_s3_bucket_object
  key    = "/files/amazon-cloudwatch-agent-linux.yml"
  source = "${path.module}/files/amazon-cloudwatch-agent-linux.yml"
  # If the md5 hash is different it will re-upload
  etag = filemd5("${path.module}/files/amazon-cloudwatch-agent-linux.yml")
}

data "aws_kms_key" "image_builder" {
  key_id = "alias/image-builder"
}

# Amazon Cloudwatch agent component
resource "aws_imagebuilder_component" "cw_agent" {
  name       = "amazon-cloudwatch-agent-linux"
  platform   = "Linux"
  uri        = "s3://${var.aws_s3_bucket_object}/files/amazon-cloudwatch-agent-linux.yml"
  version    = "1.0.0"
  kms_key_id = data.aws_kms_key.image_builder.arn

  depends_on = [
    aws_s3_bucket_object.cw_agent_upload
  ]
}

EC2 Image Builder Infrastructure Configuration

Select the EC2 instance type, the IAM role, security group, subnet, logging bucket, and much more in the infrastructure configuration resource.

resource "aws_imagebuilder_infrastructure_configuration" "this" {
  description           = "Simple infrastructure configuration"
  instance_profile_name = var.ec2_iam_role_name
  instance_types        = ["t2.micro"]
  key_pair              = var.aws_key_pair_name
  name                  = "amazon-linux-infr"
  security_group_ids    = [data.aws_security_group.this.id]

  subnet_id                     = data.aws_subnet.this.id
  terminate_instance_on_failure = true

  logging {
    s3_logs {
      s3_bucket_name = var.aws_s3_log_bucket
      s3_key_prefix  = "image-builder"
    }
  }

  tags = {
    Name = "amazon-linux-infr"
  }
}

EC2 Image Builder Distribution Configuration

Here you can choose to share this AMI with other accounts or it’s just for this account. You can tag the AMI in this resource too.

resource "aws_imagebuilder_distribution_configuration" "this" {
  name = "local-distribution"

  distribution {
    ami_distribution_configuration {
     ami_tags = {
        Project = "IT"
      }

      name = "amzn-linux-{{ imagebuilder:buildDate }}"

      launch_permission {
        user_ids = ["123456789012"]
      }
    }
    region = var.aws_region
  }
}

Tests the AMI

Since I have enabled testing of the AMI EC2 Image Builder, it will create an instance from the AMI automatically.

build and test instances
The build and test instances
The AMI

Complete Code

A lot of other files and code aren’t shown here. You can find a completed working example at https://github.com/masterwali/ec2-image-builder

Get notified when new posts are published, sign up below!

code coding computer cyberspace

AWS Three-Tier VPC with ALB in Terraform

This AWS Three-Tier VPC with ALB in Terraform is the second part of AWS Three-Tier VPC network with Terraform. In the first post I had created many of the VPC components; such as the VPC, app subnets, web subnets, data subnets, route tables for each subnet, internet and NAT gateways, NACLs for each subnet, and a generic security group. In this post I’ll reveal the Terraform code for creating an Elastic Load Balancer, specifically the Application Load Balancer (ALB). The ALB will require a listener, so I’ll add that too. For simplicity purposes the listener will monitor a non-secure port 80. Remember to use certificates and port 443 in production!

Cost

Charges may occur on various resources created using this module. The load balancers have a low hourly price set.

The design

Here’s the original diagram from the previous post.

three-tier vpc diagram

ALB (Application Load Balancer)

The ALB diagram. The ALB does a health check on port 80 on every instance in the target group. Whichever instance is healthy it will use that. In this drawing there’s only one web server; your production should have at least two that are in different availability zones.

alb diagram
The ALB

ALB Terraform Code

# ALB for the web servers
resource "aws_lb" "web_servers" {
  name               = format("%s-alb", var.vpc_name)
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.web.id]
  subnets            = aws_subnet.public.*.id
  enable_http2       = false
  enable_deletion_protection = true

  tags = {
    Name = format("%s-alb", var.vpc_name)
  }
}

This is an example of an external ALB. Note the “internal = false” statement. The load balancer type is what makes this an application load balancer. The other options are network or gateway. Since this is an ALB, it does require a security group. This external load balancer will be required to be in public subnets so outside users can reach the web servers which are hosted in private subnets. That’s one of the design features that makes this a three-tier VPC! See the Terraform documentation on more options.

Target Groups

Let’s make the ALB useful by creating a target group and a load balancer listener.

# Target group for the web servers
resource "aws_lb_target_group" "web_servers" {
  name     = "sharepoint-web-servers-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.this.id
}

resource "aws_lb_listener" "front_end" {
  load_balancer_arn = aws_lb.web_servers.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web_servers.arn
  }
}

An empty target group is not useful either. From now on when you launch a new EC2 web server be sure to add it to the target group like this.

# Find the target group
data "aws_lb_target_group" "web_servers" {
  name = "sharepoint-web-servers-tg"
}

# Attach an EC2 instance to the target group on port 80
resource "aws_lb_target_group_attachment" "web" {
  target_group_arn = aws_lb_target_group.arn
  target_id        = aws_instance.web.id
  port             = 80
}
target group registered targets
EC2 added to a target group

Notice the URL is using the ALB’s DNS record to reach the Nginx web server.

website
Nginx website using the ALB URL

Here’s the complete terraform module code: https://github.com/masterwali/tf-module-aws-three-tier-network-vpc

Here’s an example on how to call or reference the module.

module "sharepoint_network" {
  source = "git@github.com:masterwali/tf-module-aws-three-tier-network-vpc.git"
  
  aws_cli_profile = var.aws_cli_profile
  additional_tags = var.additional_tags
}

Subscribe to the newsletter for more tutorials!

Full Disclosure: I am an AWS employee, this post is opinion of my own.

web text

AWS Three-Tier VPC network with Terraform

A three-tier network is an enterprise architecture to deliver the best performance and security to the end-users. Each component of the design is separated into tiers. Reminder, a typical three-tier network consists of a website then the application then the database from an end-user perspective. Not every website automatically works like that. The developers and engineers have to build the web application by separating the user interface from the logic and data. An AWS three-tier VPC network is not too difficult to build in the cloud, either. However, in this post, I’ll be using Terraform and Terragrunt to build and deploy an AWS three-tier VPC network using, of course, VPC, subnets, route tables, network access control lists (NACLs), and few other VPC parts. Next, I’ll share with you how to create the AWS application load balancer (ALB) and the target groups with health checks in another post.

The design

This will be a simple start and in future posts, I’ll add more details. This AWS three-tier VPC network module will create a VPC, subnets, Network Access Control Lists (NACLs), Internet Gateway, NAT Gateways, route tables, Elastic IPs, and few other resources using Terraform and I’ll deploy it with Terragrunt.

Notice there are two NAT gateways, this provides high availability and fault tolerance. This means if the NAT Gateway in the availability zone (AZ) A fails or gets corrupted then the EC2 instances in AZ B will still be able to function as expected. It’s a little bit more costly… it all depends on your requirements.

The Terraform Module

This module will be generic so I can reuse the three-tier VPC network over and over again. By creating a module it makes my main code a lot less, therefore it will be a lot cleaner to view and understand.

What’s Terraform and Terragrunt? Well visit my Intro to Terragrunt and Terraform post first then come back here! You can name your module anything you like… I named my three-tier-vpc. Reminder in Terraform we can use one or more .tf files to build a module. I’ll be separating this module into few different Terraform files just for organizational purposes. Here’s my structure.

aws/tf-modules/three-tier-vpc
               ├── README.md
               ├── gateways.tf
               ├── main.tf
               ├── nacls.tf
               ├── outputs.tf
               ├── routes.tf
               ├── sec-grps.tf
               └── vars.tf

AWS VPC

Let’s start with the main.tf file which contains the VPC resource. The VPC CIDR will be a variable so we can plugin any CIDR during deployment.

# Create the VPC
resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  instance_tenancy     = "default"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = merge(
    var.additional_tags,
    {
      Name = "${var.vpc_name}-vpc"
    }
  )
}

In my module I’m giving the vpc_cidr a default value but you’re not required to do so. This default won’t hurt because if this network range is already taken then terraform apply will fail cleanly.

# vars.tf
variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

The public subnet CIDR blocks will be variable too. Never hardcode values in modules that could or should be changed. The public_subnet_cidrs is a list of one or many CIDR blocks. If we provide one string value in the block then it will create one public subnet. If we provide four CIDR blocks then it will create four, how sweet is that!

# Create the public subnets
resource "aws_subnet" "public" {
  count = length(var.public_subnet_cidrs)

  vpc_id                  = aws_vpc.this.id
  cidr_block              = var.public_subnet_cidrs[count.index]
  availability_zone       = "${var.aws_region}${var.zones[count.index]}"
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.vpc_name}-public-subnet-${var.zones[count.index]}"
  }
}

The map_public_ip_on_launch attribute set to true is one of the configurations that makes this a public subnet. Just because we named it public doesn’t make it public. Next are the private subnets. Note the map_public_ip_on_launch is now set to false.

# Create the web subnets
resource "aws_subnet" "web" {
  count = length(var.web_subnet_cidrs)

  vpc_id                  = aws_vpc.this.id
  cidr_block              = var.web_subnet_cidrs[count.index]
  availability_zone       = "${var.aws_region}${var.zones[count.index]}"
  map_public_ip_on_launch = false

  tags = {
    Name = "${var.vpc_name}-web-subnet-${var.zones[count.index]}"
  }
}

The rest of the main.tf file contains the resources for the data and the app subnets, these subnets are non-public too.

VPC Gateways

In this three-tier VPC network architecture, we have only a few public subnets (meaning the instances launched in the public subnets will get Public IPs; which makes them routable without a NAT gateway). The public subnets will need an internet gateway to complete the design. The private or non-public subnets will need a NAT gateway to allow instances with just private IPs to communicate to the internet (AKA outbound internet access). NAT gateways need 1) a public IP, 2) added to your public subnet for each availability zone you are using!

# Add internet gateway
resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id

  tags = {
    Name = "${var.vpc_name}-internet-gateway"
  }
}

# Charges may occur

# Reserve EIPs
resource "aws_eip" "nat_a" {
  vpc = true

  tags = {
    Name = "${var.vpc_name}-eip-nat-a"
  }

}

# NAT Gateway in AZ A
resource "aws_nat_gateway" "zone_a" {
  allocation_id = aws_eip.nat_a.id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.vpc_name}-nat-gateway-aza"
  }

  depends_on = [
    aws_subnet.public
  ]
}

# Reserve EIPs
resource "aws_eip" "nat_b" {
  vpc = true

  tags = {
    Name = "${var.vpc_name}-eip-nat-b"
  }

}

# NAT Gateway in AZ B
resource "aws_nat_gateway" "zone_b" {
  allocation_id = aws_eip.nat_b.id
  subnet_id     = aws_subnet.public[1].id

  tags = {
    Name = "${var.vpc_name}-nat-gateway-azb"
  }

  depends_on = [
    aws_subnet.public
  ]
}

Note one of the comments in the code, EIP’s and the gateways may cost you for the duration of their existence.

VPC Routes

Every time a VPC is created the main route table is automatically provisioned, too. I’m going to tag it and create my own route tables for this module. Do note I do have some hardcoded values in this module. I have decided this will be for two AZ’s and no more so I hardcoded the index value of zero and one in the resources below.

# Tag the main route table
resource "aws_ec2_tag" "main_route_table" {
  resource_id = aws_vpc.this.main_route_table_id
  key         = "Name"
  value       = "${var.vpc_name}-main-route-table"
}

# Create route table for the public subnets
# Uses IG
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.this.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.this.id
  }

  tags = {
    Name = "${var.vpc_name}-public-route-table"
  }

  depends_on = [
    aws_internet_gateway.this
  ]
}

# Associate the public subnets with the public route table
resource "aws_route_table_association" "public" {
  count = length(var.public_subnet_cidrs)

  subnet_id      = element(aws_subnet.public.*.id, count.index)
  route_table_id = aws_route_table.public.id
}

# Create a route table for the web and app subnets in AZ A
# Uses NAT gateway in AZ A
resource "aws_route_table" "private_aza" {
  vpc_id = aws_vpc.this.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.zone_a.id
  }

  tags = {
    Name = "${var.vpc_name}-private-route-table-aza"
  }

  depends_on = [
    aws_nat_gateway.zone_a
  ]
}

# Create a route table for the web and app subnets in AZ B
# Uses NAT gateway in AZ B
resource "aws_route_table" "private_azb" {
  vpc_id = aws_vpc.this.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.zone_b.id
  }

  tags = {
    Name = "${var.vpc_name}-private-route-table-azb"
  }

  depends_on = [
    aws_nat_gateway.zone_b
  ]
}

resource "aws_route_table" "data" {
  vpc_id = aws_vpc.this.id

  tags = {
    Name = "${var.vpc_name}-data-route-table"
  }
}

# Associate these subnets with the private route tables accordingly 
resource "aws_route_table_association" "web_aza" {
  subnet_id      = aws_subnet.web[0].id
  route_table_id = aws_route_table.private_aza.id
}

resource "aws_route_table_association" "app_aza" {
  subnet_id      = aws_subnet.app[0].id
  route_table_id = aws_route_table.private_aza.id
}

resource "aws_route_table_association" "web_azb" {
  subnet_id      = aws_subnet.web[1].id
  route_table_id = aws_route_table.private_azb.id
}

resource "aws_route_table_association" "app_azb" {
  subnet_id      = aws_subnet.app[1].id
  route_table_id = aws_route_table.private_azb.id
}

resource "aws_route_table_association" "data" {
  count = length(var.data_subnet_cidrs)

  subnet_id      = element(aws_subnet.data.*.id, count.index)
  route_table_id = aws_route_table.data.id
}

The routes are another set of configurations that differentiates public and internal sub-network. In the public subnet route table, all non-local traffic will be sent to the internet gateway. A private route table sends its non-local traffic to the NAT gateway which then routes it to the internet gateway and back.

VPC NACLs

Now it’s time to control which traffic is allowed or denied to this network. You may need to modify these rules to meet your requirements. For most web application projects, the public NACLs will most likely look like this.

# Public NACLS
resource "aws_network_acl" "public" {
  vpc_id     = aws_vpc.this.id
  subnet_ids = [aws_subnet.public[0].id, aws_subnet.public[1].id]

  # Ingress rules
  # Allow all local traffic
  ingress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = aws_vpc.this.cidr_block
    from_port  = 0
    to_port    = 0
  }

  # Allow HTTPS traffic from the internet
  ingress {
    protocol   = "6"
    rule_no    = 105
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 443
    to_port    = 443
  }

  # Allow HTTP traffic from the internet
  ingress {
    protocol   = "6"
    rule_no    = 110
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 80
    to_port    = 80
  }

  # Allow the ephemeral ports from the internet
  ingress {
    protocol   = "6"
    rule_no    = 120
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  ingress {
    protocol   = "17"
    rule_no    = 125
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  # Egress rules
  # Allow all ports, protocols, and IPs outbound
  egress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  tags = {
    Name = "${var.vpc_name}-public-nacl"
  }

  depends_on = [aws_subnet.public]
}

The web subnet NACLs for this module.

resource "aws_network_acl" "web" {
  vpc_id     = aws_vpc.this.id
  subnet_ids = [aws_subnet.web[0].id, aws_subnet.web[1].id]

  # Ingress rules
  # Allow all local traffic
  ingress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = aws_vpc.this.cidr_block
    from_port  = 0
    to_port    = 0
  }

  # Allow HTTP web traffic from anywhere
  ingress {
    protocol   = 6
    rule_no    = 105
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 80
    to_port    = 80
  }

  # Allow HTTPS web traffic from anywhere
  ingress {
    protocol   = 6
    rule_no    = 110
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 443
    to_port    = 443
  }

  # Allow the ephemeral ports from the internet
  ingress {
    protocol   = "6"
    rule_no    = 120
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  ingress {
    protocol   = "17"
    rule_no    = 125
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1025
    to_port    = 65534
  }

  # Egress rules
  # Allow all ports, protocols, and IPs outbound
  egress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  tags = {
    Name = "${var.vpc_name}-web-nacl"
  }
}

The App and Data subnet NACLs have been set to allow only local traffic. You may adjust these rules.

VPC Security Group

A default security group is created every time a new VPC is provisioned. Here I’ll just give it some tags and few generic rules.

# Modify the default security group
resource "aws_default_security_group" "this" {
  vpc_id = aws_vpc.this.id

  dynamic "ingress" {
    for_each = var.default_security_group_ingress
    content {
      self        = lookup(ingress.value, "self", null)
      cidr_blocks = compact(split(",", lookup(ingress.value, "cidr_blocks", "")))
      description = lookup(ingress.value, "description", null)
      from_port   = lookup(ingress.value, "from_port", 0)
      to_port     = lookup(ingress.value, "to_port", 0)
      protocol    = lookup(ingress.value, "protocol", "-1")
    }
  }

  dynamic "egress" {
    for_each = var.default_security_group_egress
    content {
      self        = lookup(egress.value, "self", null)
      cidr_blocks = compact(split(",", lookup(egress.value, "cidr_blocks", "")))
      description = lookup(egress.value, "description", null)
      from_port   = lookup(egress.value, "from_port", 0)
      to_port     = lookup(egress.value, "to_port", 0)
      protocol    = lookup(egress.value, "protocol", "-1")
    }
  }

  tags = merge(
    {
      Name = format("%s-default-security-group", var.vpc_name)
    },
    var.additional_tags
  )
}

Now the values for this security group are passed as a variable like so. Be sure to change the ports and protocols to meet your needs.

variable "default_security_group_ingress" {
  description = "List of maps of ingress rules to set on the default security group"
  type        = list(map(string))
  default = [
    {
      cidr_blocks = "10.0.0.0/16"
      description = "Allow all from the local network."
      from_port   = 0
      protocol    = "-1"
      self        = false
      to_port     = 0
    },
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all HTTPS from the internet."
      from_port   = 443
      protocol    = "6"
      self        = false
      to_port     = 443
    },
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all HTTP from the internet."
      from_port   = 80
      protocol    = "6"
      self        = false
      to_port     = 80
    },
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all ephemeral ports from the internet."
      from_port   = 32768
      protocol    = "6"
      self        = false
      to_port     = 60999
    }
  ]
}

variable "default_security_group_egress" {
  description = "List of maps of egress rules to set on the default security group"
  type        = list(map(string))
  default = [
    {
      cidr_blocks = "0.0.0.0/0"
      description = "Allow all"
      from_port   = 0
      protocol    = "-1"
      self        = false
      to_port     = 0
    }
  ]
}

Here’s a link to the complete code so far in the dev branch. That’s it all for now! In the next post, we’ll add application load balancers, target groups, listeners, etc. For a more in-depth explanation of VPC resources checkout AWS technical documentation. Be sure to subscribe for more content like this!

AWS Service Control Policies with Terraform

AWS Organizations

A cloud service designed to centralize & manage AWS accounts and to roll up billing from multiple AWS accounts into a single account. May be referred to as the “master” account because it can manage permissions of all its accounts that are “attached” to it. “Billing” is another name for this account because it’s the account that gets the invoice or the monthly charge. You can select any commercial AWS account to play this role. The account becomes this master account as soon as you join the master’s accounts organization. Next comes AWS’s Service Control Policies, this feature allows permission management for all your AWS accounts in your organization. In this post, I’ll share with you how to implement AWS’s service control policies with Terraform! Let’s break it down.

How to join an AWS Organization

Check out my previous post on the details.

AWS Organization Units

This has nothing to do with Microsoft’s Active Directory (AD) organization units. There’s no integration between MS AD and this, either. AWS’s organization units are a way of grouping AWS accounts so we can manage accounts permissions in groups or in a hierarchical format. There are dozens of ways to create this hierarchy and it all depends on your objective. You can group them by projects, departments, missions, environments, classifications, etc.

Simple OU structure by environments

our simple structure
AWS OU by environments

Service Control Policies with Terraform

Service Control Policies (SCP) is a critical feature to learn and understand. Questions related to this feature is a topic on many, many AWS certifications. This feature is highly useful and it’s used a lot in the real world. It’s these policies that can allow or deny actions or services at a high level. What I mean by “high level” is outside of the AWS’s account. For example; let’s say you want to experiment with the most expensive EC2 instance type and let’s say you also have IAM permissions to allow those actions. Then you try to launch an EC2 with an instance type of p4d.24xlarge and bam you get the encoded authorized failure message! How?! You have given yourself full permissions! Bingo… it’s the service control policies. Even with a full administration policy on the account, you can still be denied by SCP.

It’s best practice to enable SCP and create the OUs, and policies and attach the policies even before building your systems! For information on SCP checkout AWS’s documents.

Show me the Terraform code!

I personally like to break down the main.tf terraform file into separate manageable terraform files. Here’s my file structure. Notice there’s only one environment aka account “master” here. Yes, I also use terragrunt.

org/
├── README.md
├── dev-ou.tf
├── main.tf
├── master
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── prod-ou.tf
├── root-ou.tf
├── staging-ou.tf
└── terragrunt.hcl

main.tf this file usually contains common code. Enable SCP by adding the “SERVICE_CONTROL_POLICY” to the enabled_policy_types array.

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

 terraform {
   backend "s3" {}
 }

# Provides a resource to create an AWS organization.
resource "aws_organizations_organization" "this" {

  # List of AWS service principal names for which 
  # you want to enable integration with your organization
  aws_service_access_principals = [
    "cloudtrail.amazonaws.com",
    "config.amazonaws.com",
  ]

  feature_set = "ALL"

  enabled_policy_types = [
    "TAG_POLICY",
    "SERVICE_CONTROL_POLICY"
  ]
}

my root-out.tf contains the master account code and all the service control policies that I want to be applied to all accounts. Notice it’s attached to the root OU which then it’s inherited by all the accounts below the root OU!

resource "aws_organizations_account" "master" {
  # A friendly name for the member account
  name  = "my-master"
  email = "mymaster@email.com"

  # Enables IAM users to access account billing information 
  # if they have the required permissions
  # iam_user_access_to_billing = "ALLOW"

  tags = {
    Name  = "my-master"
    Owner = "Waleed"
    Role  = "billing"
  }

  parent_id = aws_organizations_organization.this.roots[0].id
}

# ---------------------------------------- # 
# Service Control Policies for all accounts
# ---------------------------------------- #

# ---------------------------- #
# REGION RESTRICTION 
# ---------------------------- #

data "aws_iam_policy_document" "restrict_regions" {
  statement {
    sid       = "RegionRestriction"
    effect    = "Deny"
    actions   = ["*"]
    resources = ["*"]

    condition {
      test     = "StringNotEquals"
      variable = "aws:RequestedRegion"

      values = [
        "us-east-1"
      ]
    }
  }
}

resource "aws_organizations_policy" "restrict_regions" {
  name        = "restrict_regions"
  description = "Deny all regions except US East 1."
  content     = data.aws_iam_policy_document.restrict_regions.json
}

resource "aws_organizations_policy_attachment" "restrict_regions_on_root" {
  policy_id = aws_organizations_policy.restrict_regions.id
  target_id = aws_organizations_organization.this.roots[0].id
}

# ---------------------------- #
# EC2 INSTANCE TYPE RESTRICTION 
# ---------------------------- #

data "aws_iam_policy_document" "restrict_ec2_types" {
  statement {
    sid       = "RestrictEc2Types"
    effect    = "Deny"
    actions   = ["ec2:RunInstances"]
    resources = ["arn:aws:ec2:*:*:instance/*"]

    condition {
      test     = "StringNotEquals"
      variable = "ec2:InstanceType"

      values = [
        "t3*",
        "t4g*",
        "a1.medium",
        "a1.large"
      ]
    }
  }
}

resource "aws_organizations_policy" "restrict_ec2_types" {
  name        = "restrict_ec2_types"
  description = "Allow certain EC2 instance types only."
  content     = data.aws_iam_policy_document.restrict_ec2_types.json
}

resource "aws_organizations_policy_attachment" "restrict_ec2_types_on_root" {
  policy_id = aws_organizations_policy.restrict_ec2_types.id
  target_id = aws_organizations_organization.this.roots[0].id
}

# ---------------------------- #
# REQUIRE EC2 TAGS 
# ---------------------------- #

data "aws_iam_policy_document" "require_ec2_tags" {
  statement {
    sid    = "RequireTag"
    effect = "Deny"
    actions = [
      "ec2:RunInstances",
      "ec2:CreateVolume"
    ]
    resources = [
      "arn:aws:ec2:*:*:instance/*",
      "arn:aws:ec2:*:*:volume/*"
    ]

    condition {
      test     = "Null"
      variable = "aws:RequestTag/Name"

      values = ["true"]
    }
  }
}

resource "aws_organizations_policy" "require_ec2_tags" {
  name        = "require_ec2_tags"
  description = "Name tag is required for EC2 instances and volumes."
  content     = data.aws_iam_policy_document.require_ec2_tags.json
}

resource "aws_organizations_policy_attachment" "require_ec2_tags_on_root" {
  policy_id = aws_organizations_policy.require_ec2_tags.id
  target_id = aws_organizations_organization.this.roots[0].id
}

Here’s an authorization failure message when I attempted to create an EC2 with an instance type that was not approved in the SCP defined above!

The encoded authorization failure message from the console.
Decoded message shows the deny statement from the SCP
Failed to create the volume because it now requires a Name tag in order to create a volume.
Name tag denied statement from SCP is shown once the failure message is decoded.

Account or OU specific SCP

What if you want to restrict certain actions or services on a single account. The code below shows how to block the internet in this environment. This says to create the prod account, create the SCP and only attach it directly to this OU; and this OU only has the prod account. To attach an SCP to an account only check the documentation.

prod-ou.tf

resource "aws_organizations_account" "prod" {
  # A friendly name for the member account
  name  = "my-prod"
  email = "my-prod@email.com"

  # Enables IAM users to access account billing information 
  # if they have the required permissions
  iam_user_access_to_billing = "ALLOW"

  tags = {
    Name  = "my-prod"
    Owner = "Waleed"
    Role  = "prod"
  }

  parent_id = aws_organizations_organizational_unit.prod.id
}

resource "aws_organizations_organizational_unit" "prod" {
  name      = "prod"
  parent_id = aws_organizations_organization.this.roots[0].id
}

# ------------------------------- #
# PREVENT INTERNET ACCESS TO A VPC 
# ------------------------------- #

data "aws_iam_policy_document" "block_internet" {
  statement {
    sid    = "BlockInternet"
    effect = "Deny"
    actions = [
      "ec2:AttachInternetGateway",
      "ec2:CreateInternetGateway",
      "ec2:CreateEgressOnlyInternetGateway",
      "ec2:CreateVpcPeeringConnection",
      "ec2:AcceptVpcPeeringConnection",
      "globalaccelerator:Create*",
      "globalaccelerator:Update*"
    ]
    resources = ["*"]

  }
}

resource "aws_organizations_policy" "block_internet" {
  name        = "block_internet"
  description = "Block internet access to the production network."
  content     = data.aws_iam_policy_document.block_internet.json
}

resource "aws_organizations_policy_attachment" "block_internet_on_prod" {
  policy_id = aws_organizations_policy.block_internet.id
  target_id = aws_organizations_organizational_unit.prod.id
}

I think you get the idea, go forth and implement governance with AWS’s organizations and Service Control Policies! Subscribe for more cloud code!

Create an EC2 IAM role with Terraform

There are hundreds of posts online about what is EC2 IAM role so I’m not going to discuss that here. Here we will develop the code to create an EC2 IAM role with Terraform and deploy with Terragrunt. I’m assuming you know what both of these tools are… if not checkout my introduction posts on how to setup your development environment. Or if you need a introduction to Terraform modules and how to use them click here. Do note that an EC2 IAM role can only be used for EC2 instances. The IAM policies can be shared with other resources or services though.

This Terraform module creates AWS IAM policy then creates IAM role specifically designed to be used by EC2 instances. After that it attaches the IAM role to the EC2 instance profile. Lastly attaches the IAM policy to the EC2 IAM role. Remember every IAM role needs a set of policies (permissions).

Terraform EC2 IAM role module

ece iam role Terraform module
Module structure

Here’s the main.tf file of the module.

# Create the AWS IAM role. 
resource "aws_iam_role" "this" {
  name = var.ec2_iam_role_name
  path = "/"

  assume_role_policy = var.assume_role_policy
}

# Create AWS IAM instance profile
# Attach the role to the instance profile
resource "aws_iam_instance_profile" "this" {
  name = var.ec2_iam_role_name
  role = aws_iam_role.this.name
}

# Create a policy for the role
resource "aws_iam_policy" "this" {
  name        = var.ec2_iam_role_name
  path        = "/"
  description = var.policy_description
  policy      = var.policy
}

# Attaches the policy to the IAM role
resource "aws_iam_policy_attachment" "this" {
  name       = var.ec2_iam_role_name
  roles      = [aws_iam_role.this.name]
  policy_arn = aws_iam_policy.this.arn
}

As always there’s a vars.tf

variable ec2_iam_role_name {
  type = string

  validation {
    condition     = length(var.ec2_iam_role_name) > 4 && substr(var.ec2_iam_role_name, 0, 4) == "svc-"
    error_message = "The ec2_iam_role_name value must be a valid IAM role name, starting with \"svc-\"."
  }
}

variable policy_description {
  type = string

  validation {
    condition     = length(var.policy_description) > 4
    error_message = "The policy_description value must contain more than 4 characters."
  }
}

variable assume_role_policy {}

variable policy {}

The module full code: https://github.com/masterwali/tf-module-iam-ec2-role

Terraform variable type constraints

If you noticed the “type” keyword in the variable declaration, that’s to ensure the kind of answer matches exactly what the module expects. This way someone doesn’t give an integer to a string type variable, it may break the Terraform apply but it may not error during Terraform plan. So we want to catch problems like that as soon as possible with type constraints. Now all the validations are done during the Terraform plan.

Terraform variable validation

Terraform starting with version 0.13.x released this new capability to apply validation on the answers provided to the variables. If you noticed the validation blocks above; the ‘ec2_iam_role_name’ var checks for both the length and what should be the starting characters. This is excellent way to ensure everyone’s following the naming conventions your organizations have created. The second validation on the ‘policy_description’ is just ensuring the provided value is more than 4 characters. Learn more about variable conditions.

How to use the module

In another git project we’ll have this setup.

Terragrunt ece iam role setup
Terragrunt setup
Each environment has a inputs.yml, terragrunt.hcl and vars.tf

The main.tf that will call the Terraform module. Be sure to update your git code. Now we create many, many EC2 IAM roles with the same naming convention! By default the source variable will use the master branch!

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

terraform {
  backend "s3" {}
}

module "web_server" {
  source = "git@gitlab.com:cloudly-engineer/aws/tf-modules/iam-ec2-role.git"

  ec2_iam_role_name  = "svc-web-server-role"
  policy_description = "IAM ec2 instance profile for the web servers."
  assume_role_policy = file("assumption-policy.json")
  policy             = data.aws_iam_policy_document.web_server.json
}

Here’s the sample “web server” IAM policy that’s attached to this role.

data "aws_iam_policy_document" "web_server" {
  statement {
    sid    = "GetS3Stuff"
    effect = "Allow"
    actions = [
      "s3:List*",
      "s3:Get*"
    ]
    resources = ["*"]
  }
}

Then do a terragrunt init, plan and apply!

Here’s what the error message looks like when the validation fails during the plan.

Terraform ec2 iam role module
Terraform validation check

The calling terraform/terragrunt code: https://github.com/masterwali/ec2-iam-role

Subscribe to get notified when more AWS and Terraform code is published!

AWS IAM groups and policies – Terraform

Let’s create a module to create and manage AWS IAM groups and policies with Terraform! In summary a Terraform module is one or more Terraform resources bundled together to be used a single Terraform resource. You can learn more about Terraform module’s here. If you have no idea what Terraform or Terragrunt then start here. Let’s code!

The Terraform module structure

aws-iam-group/
main.tf
vars.tf
README.md

The main file

The main.tf contains all the resources required to create AWS IAM groups and their policies. Notice this one uses three resources!

resource "aws_iam_group" "this" {
name = var.iam_group_name
}

resource "aws_iam_policy" "this" {
name = var.policy_name
description = var.policy_description
policy = var.policy
}

resource "aws_iam_group_policy_attachment" "this" {
group = aws_iam_group.this.name
policy_arn = aws_iam_policy.this.arn
}

The vars file

variable iam_group_name {}

variable policy_name {}

variable policy_description {}

variable policy {}

Here’s the AWS IAM groups and policies Terraform module on GitHub: https://github.com/masterwali/tf-module-aws-iam-group

Create AWS IAM groups and policies with Terraform

Let’s use the Terraform module to create one or many IAM groups and their policies! You can deploy with Terraform but I like to use Terragrunt.

The Terragrunt/Terraform structure

This is one way to create your deployment structure. You can name this directory anything you like.

aws-iam
├── LICENSE
├── README.md
├── cloud-engineers-policy.tf
├── database-admins-policy.tf
├── developers-policy.tf
├── network-admins-policies.tf
├── groups.tf
├── main.tf
├── dev
│   ├── inputs.yml
│   ├── terragrunt.hcl
│   └── vars.tf
├── prod
│   └── terragrunt.hcl
│   ├── inputs.yml
│   └── vars.tf
├── qa
│   └── terragrunt.hcl
│   ├── inputs.yml
│   └── vars.tf
├── sec
│   └── terragrunt.hcl
│   ├── inputs.yml
│   └── vars.tf
└── terragrunt.hcl

In this example we will be creating groups and policies for developers, cloud engineers, database admins, and network admins all with one and same Terraform module! Using a module will ensure consistency and governance for our AWS resources.

The Terraform files

I like to keep the main file clean. I mainly use it for data calls and local variables.

main.tf

locals {
  account_id = data.aws_caller_identity.current.account_id
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

terraform {
  backend "s3" {}
}

data "aws_caller_identity" "current" {}

groups.tf

# CLOUD ENGINEERS
module "cloud_engineers" {
  source = "git@github.com:masterwali/tf-module-aws-iam-group.git"

  iam_group_name     = "cloud-engineers"
  policy_name        = "cloud-engineers"
  policy_description = "Cloud Engineers policy"
  policy             = data.aws_iam_policy_document.cloud_engineers.json
}
# ......... etc. 

# NETWORK ADMINS
# Modules creates group and policy and attaches policy to group. 
module "network_admins" {
  source = "git@github.com:masterwali/tf-module-aws-iam-group.git"

  iam_group_name     = "network-admins"
  policy_name        = "network-admins"
  policy_description = "Network Admins policy"
  policy             = data.aws_iam_policy_document.network_admins_main.json
}
# Create the network admins misc policy 
resource "aws_iam_policy" "network_admins_misc" {
  name        = "network-admins-misc"
  description = "Network Admins"
  policy      = data.aws_iam_policy_document.network_admins_misc.json
}
# Attach the above policy to the network admins group. 
resource "aws_iam_group_policy_attachment" "network_admins_misc" {
  group      = "network-admins"
  policy_arn = aws_iam_policy.network_admins_misc.arn

  depends_on = [aws_iam_policy.network_admins_misc]
}

The “cloud engineers” IAM policy file

data "aws_iam_policy_document" "cloud_engineers" {
  statement {
    sid    = "FullAccess"
    effect = "Allow"
    actions = [
      "iam:*",
      "kms:*",
      "s3:*"
    ]
    resources = ["*"]
  }
}

To see the developers policy and more take a look at the complete code at https://github.com/masterwali/aws-iam

Now let’s apply the code by navigating to the desired environment directory. Then initiate with ‘terragrunt init’ and apply with ‘terragrunt apply’. More on Terragrunt initiation, plans and apply.

Subscribe to get notified when more AWS and Terraform code is published!

IAM

AWS IAM console groups
AWS IAM console: Groups
AWS IAM console group policies
AWS IAM Group policies

AWS KMS Customer Managed CMK with Terraform

AWS Key Management Service (KMS) is a AWS managed service that allows us to create, manage, and delete customer master keys (CMK) or simply use AWS customer managed keys for encrypting our data in the AWS cloud. From my experience with passing both the AWS Certified Security – Speciality and the AWS Certified Solutions Architect – Professional, AWS Key Management Service is a must to learn inside and out. If you can understand all of the KMS features you’ll have a better chance of passing those two exams. Let’s learn how to create and manage AWS KMS customer managed CMK with Terraform! I will also be using Terragrunt so we can follow the DRY (Don’t repeat yourself) model.

This is part two of AWS Key management service (KMS) – Part 1.

KMS key types
KMS Key types

Key properties

  • Key Id
  • Creation date
  • Description
  • Key state

Customer managed keys (CMK)

Given the name you can guess the differences right away, right? For starter, you as the customer will have to explicitly create the key with AWS CLI, AWS API, or Terraform or any other available methods. You can set the CMK policies to allow services or users to use the key. The key policy can pass the permission responsibilities to be managed by IAM policies instead of the KMS CMK key policies. A CMK can be set to enable or disable at any time to allow usage or stop the usage of the key.

Key Alias are a great way to tag and identify Customer managed CMKs. This will help the user quickly find the desired key in the AWS console or AWS CLI query results. Since rotation is possible and it can be enabled to automatically rotate on yearly basis. Key aliases helps with that by re-assigning the alias to the new key.

You can definitely delete the key, but you must be damn sure that no one or any data or services is using that key! Once the key is gone you cannot unencrypt the data that was encrypted with the deleted key! So in order to semi control this chaos AWS has enforced a scheduling delete functionality rather than immediate delete. The customer can delete any Customer managed key by scheduling a delete; the minimum number of days to schedule a delete is 7 days. Best practice is to set this for a month or more.

NOTE: Before proceeding

  • If you haven’t installed Terraform or Terragrunt then you must follow this guide
  • For Terraform deep dive explanation follow this guide

Pricing

Each customer master key (CMK) that you create in AWS Key Management Service (KMS) costs $1/month until you delete it. For the N. VA region:

  • $0.03 per 10,000 requests
  • $0.03 per 10,000 requests involving RSA 2048 keys
  • $0.10 per 10,000 ECC GenerateDataKeyPair requests
  • $0.15 per 10,000 asymmetric requests except RSA 2048
  • $12.00 per 10,000 RSA GenerateDataKeyPair requests

Learn more at https://aws.amazon.com/kms/pricing/

Create and edit KMS Keys

Terraform module: main.tf

# Creates/manages KMS CMK
resource "aws_kms_key" "this" {
  description              = var.description
  customer_master_key_spec = var.key_spec
  is_enabled               = var.enabled
  enable_key_rotation      = var.rotation_enabled
  tags                     = var.tags
  policy                   = var.policy
  deletion_window_in_days  = 30
}

# Add an alias to the key
resource "aws_kms_alias" "this" {
  name          = "alias/${var.alias}"
  target_key_id = aws_kms_key.this.key_id
}

Terraform module: vars.tf

variable description {}

# Options available
# SYMMETRIC_DEFAULT, RSA_2048, RSA_3072,
# RSA_4096, ECC_NIST_P256, ECC_NIST_P384,
# ECC_NIST_P521, or ECC_SECG_P256K1
variable key_spec {
  default = "SYMMETRIC_DEFAULT"
}

variable enabled {
  default = true
}

variable rotation_enabled {
  default = true
}

variable tags {}

variable alias {}

variable policy {}

Terragrunt KMS directory structure

terragrunt directory structure

Terraform plan

main.tf 

locals {
  admin_username = "waleed"
  account_id     = data.aws_caller_identity.current.account_id
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_cli_profile
}

terraform {
  backend "s3" {}
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "ssm_key" {
  statement {
    sid       = "Enable IAM User Permissions"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.account_id}:root"]
    }
  }

  statement {
    sid       = "Allow access for Key Administrators"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

  statement {
    sid    = "Allow use of the key"
    effect = "Allow"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

  statement {
    sid    = "Allow attachment of persistent resources"
    effect = "Allow"
    actions = [
      "kms:CreateGrant",
      "kms:ListGrants",
      "kms:RevokeGrant"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }

    condition {
      test     = "Bool"
      variable = "kms:GrantIsForAWSResource"
      values   = ["true"]
    }
  }
}

ssm-key.tf : It’s best practice to have each type of key in its own terraform file. In this example this key is for SSM. You can create keys for any services possible in your region!

Notice the source is only pulling the ‘dev’ branch. Once tested and verified the source branch would change in each environment.

module "ssm" {
  source = "git@github.com:S-Waleed/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  policy      = data.aws_iam_policy_document.ssm_key.json

  tags = {
    Name  = "ssm"
    Owner = "wsarwari"
  }
}
# module.ssm.aws_kms_alias.this will be created
  + resource "aws_kms_alias" "this" {
      + arn            = (known after apply)
      + id             = (known after apply)
      + name           = "alias/ssm"
      + target_key_arn = (known after apply)
      + target_key_id  = (known after apply)
    }

# module.ssm.aws_kms_key.this will be created
  + resource "aws_kms_key" "this" {
      + arn                      = (known after apply)
      + customer_master_key_spec = "SYMMETRIC_DEFAULT"
      + deletion_window_in_days  = 30
      + description              = "KMS key for System Manager"
      + enable_key_rotation      = true
      + id                       = (known after apply)
      + is_enabled               = true
      + key_id                   = (known after apply)
      + key_usage                = "ENCRYPT_DECRYPT"
      + policy                   = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "kms:*"
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = "arn:aws:iam::111122223334:root"
                        }
                      + Resource  = "*"
                      + Sid       = "Enable IAM User Permissions"
                    },
                  + {
                      + Action    = "kms:*"
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = [
                              + "arn:aws:iam::111122223334:user/waleed",
                              + "arn:aws:iam::111122223334:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor",
                              + "arn:aws:iam::111122223334:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
                            ]
                        }
                      + Resource  = "*"
                      + Sid       = "Allow access for Key Administrators"
                    },
                  + {
                      + Action    = [
                          + "kms:ReEncrypt*",
                          + "kms:GenerateDataKey*",
                          + "kms:Encrypt",
                          + "kms:DescribeKey",
                          + "kms:Decrypt",
                        ]
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = [
                              + "arn:aws:iam::111122223334:user/waleed",
                              + "arn:aws:iam::111122223334:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor",
                              + "arn:aws:iam::111122223334:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
                            ]
                        }
                      + Resource  = "*"
                      + Sid       = "Allow use of the key"
                    },
                  + {
                      + Action    = [
                          + "kms:RevokeGrant",
                          + "kms:ListGrants",
                          + "kms:CreateGrant",
                        ]
                      + Condition = {
                          + Bool = {
                              + kms:GrantIsForAWSResource = "true"
                            }
                        }
                      + Effect    = "Allow"
                      + Principal = {
                          + AWS = [
                              + "arn:aws:iam::111122223334:user/waleed",
                              + "arn:aws:iam::111122223334:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor",
                              + "arn:aws:iam::111122223334:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
                            ]
                        }
                      + Resource  = "*"
                      + Sid       = "Allow attachment of persistent resources"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + tags                     = {
          + "Name"  = "ssm"
          + "Owner" = "wsarwari"
        }
    }

Plan: 2 to add, 0 to change, 0 to destroy.

After applying the code you should see the key in Key Management Service.

kms key created

Enable and disable CMKs

Simple, just update the “enabled” variable to false! Plan and apply.

module "ssm" {
  source = "git@github.com:S-Waleed/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  ...
  enabled     = false

  ....
}

Automatic rotation

Another easy change with Terraform! By default I have set it to rotate the key in the custom module I created above. Now if you want to disable automatic rotation then all you have to do is set the variable rotation_enabled to false.

module "ssm" {
  source = "git@github.com:S-Waleed/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  ...
  rotation_enabled     = false

  ....
}

Key Alias

Keys are identified by randomly generated GUIDs. It’s best to create in alias for each key so it can be easily identified by everyone. Aliases are also easy to create and update in Terraform. In the custom module above I have added the ability to create the key alias right after the key is provisioned. The key alias is also passed as a variable as shown below.

module "ssm" {
  source = "git@github.com:S-Waleed/terraform-aws-kms-module.git"

  description = "KMS key for System Manager"
  alias       = "ssm"
  ...
}
Type of CMKCan view CMK metadataCan manage CMKUsed only for my AWS accountAutomatic rotation
Customer managed CMKYesYesYesOptional. Every 365 days (1 year).
AWS managed CMKYesNoYesRequired. Every 1095 days (3 years).
AWS owned CMKNoNoNoVaries
This chart is from the AWS documentation

Key Policy

Each key must have a policy that allows or denies the key to be used by users or services. In the design that I have created allows you to give each key a policy. Let’s breakdown each statement of the key. 

Enable IAM User Permissions

statement {
    sid       = "Enable IAM User Permissions"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.account_id}:root"]
    }
  }

This statement allows all users and services in this account to execute all KMS actions on this key. It’s best practice to follow up on this open statement by creating an IAM policy to restrict the key usage. The identifiers in the principles section can be other AWS accounts.

Allow access for Key Administrators

statement {
    sid       = "Allow access for Key Administrators"
    effect    = "Allow"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

This statement allows selected individual users and IAM roles to fully manage this key. This statement must exist for every single key. Without this statement you will absolutely lose access and management of this key.

Allow use of the key

statement {
    sid    = "Allow use of the key"
    effect = "Allow"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }
  }

This statement is specifically for the usage of the key. If you do not provide the statement Enable IAM User Permissions then you must include this statement; Otherwise this key will not be usable by anyone besides the key administrators.

Allow attachment of persistent resources

statement {
    sid    = "Allow attachment of persistent resources"
    effect = "Allow"
    actions = [
      "kms:CreateGrant",
      "kms:ListGrants",
      "kms:RevokeGrant"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${local.account_id}:user/${local.admin_username}",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/support.amazonaws.com/AWSServiceRoleForSupport",
        "arn:aws:iam::${local.account_id}:role/aws-service-role/trustedadvisor.amazonaws.com/AWSServiceRoleForTrustedAdvisor"
      ]
    }

    condition {
      test     = "Bool"
      variable = "kms:GrantIsForAWSResource"
      values   = ["true"]
    }
  }

This statement allows listing, creating, and revoking grants for the key by the principals identified in the statement.

Deleting the key

Do not delete production keys! You must be 100% sure no service or data has ever used the key you are able to delete. Once a key is deleted it’s not possible to restore! You cannot decrypt data that was encrypted with the key that you delete. In Terraform you can delete the key by deleting the code or commenting out the code, then Terraform plan and apply. Or if you want to keep the code and just want to remove the resource from AWS then execute Terragrunt destroy. This destroy will also release the alias so it can be reused. 

kms key pending deletion
Key deletion


Status is now pending deletion. It will delete this key 30 days from the day of the destroy. 

Stop KMS key deletion

If you decide to not delete it then on the AWS console you can select the key then click on Key actions. Finally select Cancel key deletion. This option is only available before the deletion date. 

Cancel key deletion
Key actions

Read more about deleting KMS customer managed keys.

KMS Multi-Region Keys

Take a look at Terraform AWS KMS Multi-Region Keys.

As always if you see any errors, mistakes, have suggestions or questions please comment below. Don’t forget to like, share, and subscribe below for more! 

AWS managed CMKs

I talked about AWS managed customer master keys in my previous post here.

Complete Code

Get the complete module from https://github.com/masterwali/terraform-aws-kms-module.