Terraform

What is it?

Terraform is a cloud-agnostic IaC (Infrastructure as Code) solution.

Terraform is split into two parts:

  • One part is the Terraform engine, which understands how to...
    • read state from a provider
    • read HCL code
    • get from the state your infrastructure is currently in to the state you want your infrastructure to be in.
  • The other part is the provider, which talks to the infrastructure to find out the current state and make changes using the infrastructure’s API.

Provisioning Workflow

There are 3 main CLI commands that involve creating, modifying and destroying infrastructure: plan, apply and destroy.

Terraform CLI

Terminology

Provider

A provider is a connection that allows Terraform to manage infrastructure using an interface (e.g. AWS API)
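
As a rough sketch, a provider is usually declared and configured like this. The version constraint and region are illustrative assumptions, not values taken from these notes:

```hcl
# Tell Terraform which provider plugin to download.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin to whatever you actually use
    }
  }
}

# Configure the provider; resources of type aws_* will use this configuration.
provider "aws" {
  region = "us-west-1" # assumed region
}
```

Running terraform init downloads the provider plugin declared in required_providers.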

Resource

A resource represents a piece of real world infrastructure

Examples:

  • an S3 bucket
  • an EKS (Elastic Kubernetes) cluster
  • a Postgres role

Module

A Terraform module is a set of Terraform configuration files in a single directory that can be considered its own standalone Terraform project.

  • it can therefore...
    • contain its own resources, data sources, locals, etc.
    • take variables (i.e. inputs on a per-module basis)
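
To make this concrete, here is a sketch of a small child module and the call to it from the root project. The directory modules/bucket and all names are hypothetical; the two files are shown in one block for brevity:

```hcl
# modules/bucket/main.tf (hypothetical child module)
variable "bucket_name" {
  type        = string
  description = "Name of the bucket this module creates"
}

resource "aws_s3_bucket" "this" {
  bucket = var.bucket_name
}

output "bucket_arn" {
  value = aws_s3_bucket.this.arn
}

# main.tf (root project): call the module and pass its input variable
module "assets_bucket" {
  source      = "./modules/bucket"
  bucket_name = "my-assets-bucket"
}
```

The root project can read the module's outputs as module.assets_bucket.bucket_arn.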

Data source

A data source is used to fetch data from a resource that is not managed by the current Terraform project.

  • think of it as a read-only resource that already exists
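
A short sketch, assuming a bucket that was created outside this project (the bucket name is hypothetical):

```hcl
# Look up an existing bucket; Terraform reads it but never creates,
# changes or deletes it.
data "aws_s3_bucket" "bucket" {
  bucket = "an-existing-bucket" # hypothetical name
}

# Attributes of a data source are referenced with the data. prefix.
output "existing_bucket_arn" {
  value = data.aws_s3_bucket.bucket.arn
}
```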

Local

A local is a named value defined inside the configuration itself. It is Terraform's closest equivalent to a variable in general programming.

  • note: not to be confused with Terraform variables.
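
A minimal sketch of defining and using locals (all names are hypothetical):

```hcl
locals {
  project = "my-project"

  # A local can be built from other locals, variables or resource attributes.
  bucket_name = "${local.project}-assets"
}

resource "aws_s3_bucket" "assets" {
  # Defined in a locals block (plural), referenced as local.<name> (singular).
  bucket = local.bucket_name
}
```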

Variable

A variable is set at runtime, allowing us to vary Terraform's behaviour.

  • Therefore, if Terraform were a function, a variable would be an input to the function.
  • note: not to be confused with locals, which are closer to variables as used in general programming

variable "bucket_name" {
    type = string
    # describe what this variable is used for
    description = "the name of the bucket we are creating"
    # S3 bucket names cannot contain underscores
    default = "default-bucket-name"
}

resource "aws_s3_bucket" "bucket" {
    bucket = var.bucket_name
}

Variables can hold more complex values too. For example, these values could be supplied in a terraform.tfvars file:

instance_map = {
    dev = "t3.small"
    test = "t3.medium"
    prod = "t3.large"
}

environment_type = "dev"

And declared and referenced like:

variable "instance_map" {
    type = map(string)
}
variable "environment_type" {}

output "selected_instance" {
    value = var.instance_map[var.environment_type]
}

Types

  • string
  • bool
  • number
  • list(<TYPE>)
  • set(<TYPE>)
    • each value is unique
  • map(<TYPE>)
  • object({<ATTR> = <TYPE>, …})
    • like a map, but each named attribute can have a different type
  • tuple([<TYPE>, …])
    • a fixed number of values in a fixed order, each with its own type
  • any
    • unlike the any type from TypeScript, this is a placeholder: Terraform infers the concrete type from the actual value.
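
A sketch of how the structural types look in practice. The variable names and default values are made up for illustration:

```hcl
variable "service" {
  # object: a fixed set of named attributes, each with its own type
  type = object({
    name     = string
    replicas = number
    public   = bool
    subnets  = list(string)
    tags     = map(string)
  })

  default = {
    name     = "api"
    replicas = 2
    public   = false
    subnets  = ["subnet-a", "subnet-b"]
    tags     = { team = "platform" }
  }
}

variable "port_range" {
  # tuple: exactly two elements, both numbers, in this order
  type    = tuple([number, number])
  default = [8000, 8100]
}

output "service_name" {
  value = var.service.name
}
```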

Providing variables (4 ways)

  1. When we run terraform plan or terraform apply without supplying a value for a variable that has no default, we will be prompted to provide one.

  2. pass the value with the -var flag:

terraform apply -var bucket_name=my-bucket

  3. export environment variables in the terminal prefixed with TF_VAR_:

export TF_VAR_bucket_name=my-bucket

  4. create a terraform.tfvars file (or <ANYNAME>.auto.tfvars):

bucket_name = "my-bucket"

State

State is the place where Terraform stores a record of all of the resources (and their metadata) it has created.

  • run terraform state list to see all resources existing in state.

This state is used by Terraform to work out how changes need to be made.

By default, state is stored locally in terraform.tfstate

If we want to move management of a resource from one project to another, state needs to be manipulated directly

  • this can be handled by (example uses an AWS VPC resource)

    1. running terraform state rm aws_vpc.my_vpc, which will remove the resource from state (so Terraform is no longer managing it), but will not delete the resource in the cloud.
    2. in the new project, moving the resource block over (removing it from the old project's code) and running terraform import aws_vpc.my_vpc <VPC_ID>
    3. running terraform apply
  • some resources do not support import. In this case, use terraform state mv
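
Newer Terraform versions can express the same hand-over declaratively instead of through CLI commands. This is a sketch only, assuming Terraform 1.7 or later (import blocks need 1.5+, removed blocks need 1.7+); the VPC ID and CIDR are hypothetical, and the two files are shown in one block for brevity:

```hcl
# Old project: stop managing the VPC without destroying it
# (the resource block itself is deleted from the old project's code).
removed {
  from = aws_vpc.my_vpc

  lifecycle {
    destroy = false
  }
}

# New project: adopt the existing VPC into this project's state on the next apply.
import {
  to = aws_vpc.my_vpc
  id = "vpc-0123456789abcdef0" # hypothetical VPC ID
}

resource "aws_vpc" "my_vpc" {
  cidr_block = "10.0.0.0/16" # must match the real VPC, otherwise the plan shows a change
}
```

Running terraform plan in the new project before applying is a good way to confirm that the import results in no unintended changes.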

Remote state

Multiple people working on the same Terraform project can introduce a lot of complexity, since a local state file is used to store a record of what has been created. If we run Terraform commands on a second machine, that machine has no record of the existing resources, so Terraform will try to create them all over again.

  • to get around this issue, we can store state in a remote location (e.g. in an S3 bucket)

We specify the remote state location using a backend block nested inside the terraform block. Here we are using an S3 bucket:

# state.tf
terraform {
    backend "s3" {
        bucket = "<bucket-name>"
        key = "my-project.state"
        region = "us-west-1"
    }
}

The remote state backend should support "locking", which prevents concurrent changes to the state while a Terraform command is running.

A good idea is to use S3 bucket versioning so we can time travel through different Terraform states.
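
A sketch of enabling versioning on the state bucket, assuming AWS provider v4 or later (where versioning is a separate resource). The bucket name is hypothetical, and this bucket would normally be created in a separate bootstrap project, since a project cannot store its state in a bucket it has not created yet:

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state-bucket" # hypothetical name
}

# Keep every previous version of the state file so an earlier state can be restored.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```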

Workspaces

Workspaces solve the problem "how do we create multiple environments using the same code?"

terraform.workspace is a special variable that resolves to the current workspace we are running in.

Unless we explicitly select another one, we are running in the workspace named default.

State for non-default local workspaces is stored under terraform.tfstate.d/

  • each workspace has its own state
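
A sketch of how terraform.workspace might be used to vary resources per environment. The workspace names, instance sizes and bucket name are assumptions for illustration:

```hcl
locals {
  # Map each workspace to an instance size.
  instance_types = {
    development = "t3.small"
    production  = "t3.large"
  }

  # Fall back to a small instance for any workspace not listed above
  # (including the default workspace).
  instance_type = lookup(local.instance_types, terraform.workspace, "t3.small")
}

resource "aws_s3_bucket" "assets" {
  # Include the workspace in the name so environments do not collide.
  bucket = "my-project-assets-${terraform.workspace}"
}

output "instance_type" {
  value = local.instance_type
}
```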

CLI

  • List workspaces - terraform workspace list
  • Create new workspace - terraform workspace new development
  • Switch workspaces - terraform workspace select development

Terraform Cloud

Terraform Cloud lets us set input variables per workspace, meaning each set of infra (one per environment) can have its own set of variables.

With it, we:

  1. create a workspace
  2. point it at a source control repo containing your Terraform code
  3. set the variables for that workspace

Lifecycle

Each resource accepts a special nested block called lifecycle that gives us extra control.

It allows us to set:

  • create_before_destroy, to ensure a replacement resource is created prior to deleting the old one
  • prevent_destroy, to make Terraform reject any plan that would delete the resource, for as long as the argument remains in the configuration

lifecycle {
  prevent_destroy = true
}
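
In context, the lifecycle block is nested inside a resource. A sketch with hypothetical resources (the AMI ID and names are placeholders):

```hcl
resource "aws_s3_bucket" "important" {
  bucket = "my-important-bucket" # hypothetical name

  lifecycle {
    # Any plan that would destroy this bucket fails with an error.
    prevent_destroy = true
  }
}

resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.small"

  lifecycle {
    # When a change forces replacement, create the new instance first,
    # then destroy the old one.
    create_before_destroy = true
  }
}
```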

Provisioner

Provisioners allow us to run a script (remotely or locally) after a resource has been created.

  • provisioners allow us to step in and solve problems ourselves when they are not solved out of the box by the provider we are using.
  • because provisioners are imperative, they are seen as a last resort approach to solving our problem.
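
A minimal sketch using the built-in local-exec provisioner, which runs a command on the machine running Terraform. The bucket name and output file are hypothetical:

```hcl
resource "aws_s3_bucket" "bucket" {
  bucket = "my-bucket" # hypothetical name

  # Runs once, after the bucket has been created.
  provisioner "local-exec" {
    # self refers to the resource the provisioner is attached to.
    command = "echo ${self.id} >> created_buckets.txt"
  }
}
```

The remote-exec provisioner works similarly but runs the script on the created resource itself, which requires connection details (e.g. SSH).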

Misc

Multi-line string

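Multi-line strings (heredocs) are declared between <<ANYWORD and a closing ANYWORD on its own line (the <<-ANYWORD form also strips the common leading indentation):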

resource "aws_iam_policy" "my_bucket_policy" {
    name = "my-bucket-policy"

    policy = <<POLICY
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [
                    "s3:ListBucket"
                ],
                "Effect": "Allow",
                "Resource": [
                    "${data.aws_s3_bucket.bucket.arn}"
                ]
            }
        ]
    }
    POLICY
}

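String interpolation (${interpolated_value}) can be used inside a multi-line string.

  • interpolation is only needed inside quoted strings ("") and heredocs; elsewhere, reference the value directly (e.g. bucket = var.bucket_name)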
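
As an alternative sketch, the same policy can be built with the built-in jsonencode function rather than a heredoc. This avoids hand-written JSON and lets values be referenced directly, without interpolation. It assumes the same data.aws_s3_bucket.bucket data source as the example above:

```hcl
resource "aws_iam_policy" "my_bucket_policy" {
  name = "my-bucket-policy"

  # jsonencode turns an HCL value into a JSON string.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action   = ["s3:ListBucket"]
        Effect   = "Allow"
        Resource = [data.aws_s3_bucket.bucket.arn]
      }
    ]
  })
}
```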

Outputting to console (stdout)

output "message" {
    value = aws_s3_bucket.my_bucket.id
}

or we can print all attributes exported by a resource:

output "all" {
    value = aws_s3_bucket.my_bucket
}

Folder structure

All Terraform files should be in a single directory (the Terraform project) at the top level. Files within subdirectories are ignored unless they are referenced as modules. Conceptually, when we run Terraform commands, all .tf files in the directory are treated as if they were appended into a single file anyway.

  • child directories are used to set up Modules

By convention,

  • providers are set up in main.tf
  • resources go in files named after their type (e.g. sqs.tf, api-gateway.tf)
  • variables go in variables.tf

Tools

  • Atlantis - Pull Request automation for Terraform
    • purpose is to have improved code review for infra changes.
  • Terratest - a Go library for writing automated tests for Terraform (and other infrastructure) code

Alternatives

  • Chef/Puppet - these are configuration management tools. They are designed to configure and manage already existing infrastructure, while Terraform is designed to set up the infrastructure itself.
    • In other words, Puppet and Chef would be used to configure servers, while Terraform would be used to create the server itself.
  • Pulumi - This IaC tool uses a programming language (like TypeScript) instead of a configuration language.

