How to manage Terraform state?

How to manage Terraform state? And by the way, what is a TFState file? How does it make Terraform code different from other configuration management tools and what are the best practices around it?

This article is a transcript from a video interview series: Ask Me Anything on Infrastructure as Code with the Author of “Infrastructure as Code – Cook book”

How to manage Terraform state? First of all, what exactly is a TFState?

Knowing how to manage Terraform state is key. The TFState file is what makes Terraform very different from other systems. You can spin up and launch infrastructure with other configuration management tools like Chef, SaltStack and Ansible, but the biggest difference with Terraform lies in this state.

You can see your TFState file as a big JSON structure describing the reality of your infrastructure. It works together with the Terraform code, in which you declare the so-called “desired state” you want to achieve.

This desired state is declarative: you declare in your code that you want a specific resource with a specific configuration. When you apply this code, Terraform “talks” to your cloud provider’s API and spawns all those resources. Once that is done, it writes the reality of the deployment on the cloud provider’s side to this JSON file.

At the end of the day, you have three parties:

  • your code
  • the reality on your cloud provider’s account
  • the state file

The state file is the exact mirror of the last successful apply of your code. It works with any kind of resource: you can do this with your cloud provider if you’re interested in launching infrastructure resources, but you can do it just as well with any other kind of provider.

Let’s say that you deploy Helm charts using Terraform. If you do so, you will have a state file for those deployments as well. It’s the same with GitHub, if you use the GitHub provider. If you define users, and the same person is declared on a GCP project, as an IAM user on AWS and as a GitHub user, everything about this specific user is written, all linked together, to the Terraform state file.
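To make the multi-provider case concrete, here is a minimal sketch of what “the same user on GitHub, AWS and GCP” could look like in one configuration. Everything specific in it is an assumption made for illustration: the username, the email domain, the project ID and the role are hypothetical, and provider credentials and settings are left out.

```hcl
# Hypothetical sketch: one person declared on three providers.
# After a successful apply, all three resources are recorded in the same state file.

terraform {
  required_providers {
    github = { source = "integrations/github" }
    aws    = { source = "hashicorp/aws" }
    google = { source = "hashicorp/google" }
  }
}

variable "username" {
  type    = string
  default = "jdoe" # hypothetical user name
}

# GitHub: membership in the organization configured on the provider
resource "github_membership" "user" {
  username = var.username
  role     = "member"
}

# AWS: an IAM user with the same name
resource "aws_iam_user" "user" {
  name = var.username
}

# GCP: a role for the same person on a project
resource "google_project_iam_member" "user" {
  project = "my-demo-project" # hypothetical project ID
  role    = "roles/viewer"
  member  = "user:${var.username}@example.com" # hypothetical email domain
}
```

Running terraform state list after the apply should show the three resources side by side, which is the “all linked” view of this user described above.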

Consider it the reference for the reality of your existing infrastructure, which you can refer to from inside your code.
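One way to refer to a state from inside your code is the terraform_remote_state data source, which reads the outputs stored in another configuration’s state. The sketch below is only an illustration: it reuses the bucket and key from the backend example in this article, while the region and the vm_id output are assumptions (the demo code does not declare any output).

```hcl
# Read the outputs recorded in another configuration's state file.
data "terraform_remote_state" "demo" {
  backend = "s3"

  config = {
    bucket = "cs-tfstates-demo-sj-frankfurt-1"
    key    = "tfstates/terraform.tfstate"
    region = "eu-central-1" # assumption, guessed from the bucket name
  }
}

# "vm_id" is hypothetical: it only exists if the other configuration
# declares an output with that name.
output "demo_vm_id" {
  value = data.terraform_remote_state.demo.outputs.vm_id
}
```

Only root-level outputs of the other configuration are exposed this way, so anything you want to share has to be declared as an output there first.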

So basically, a state file is the reality of your deployment on your cloud provider, resulting from the intention declared in your Terraform code.

How do you deal with it and where do you store your Terraform State?

By default, there is nothing to configure: the state is stored locally. The first time you run Terraform in your git repo and apply even a small piece of Terraform code, it creates the TFState file (terraform.tfstate) directly in the directory where you ran it, typically the root of your repo. You need to keep that file somewhere. You may be tempted to commit it and push it to GitHub, which is what a lot of people do, but that is not a best practice: the state file can contain secrets, and you probably do not want to push secrets to GitHub.

But still, you need to store your state file, because if you don’t, you lose what Terraform considers to be the “existing infrastructure”. If you drop it and run an apply again, Terraform will try to create the whole infrastructure again as “new”, and you probably don’t want that.

The best practice is to share it somewhere, usually in an S3 bucket or any equivalent storage bucket on AWS, Azure, GCP…

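# Demo VM. The Name tag embeds the name of the current Terraform workspace.
# var.instance_type is declared elsewhere (not shown here).
resource "aws_instance" "vm" {
  ami           = data.aws_ami.amazon-linux.id
  instance_type = var.instance_type

  tags = {
    Name      = "DEMO VM DESTROY ME - ${terraform.workspace}"
    Terraform = "true"
  }
}

# Look up the most recent matching Amazon Linux AMI owned by Amazon.
data "aws_ami" "amazon-linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn-ami-hvm-*"]
  }
}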

So basically, you declare your backend. In this case, the code is very simple: it deploys a single VM, for demonstration purposes. I use it to play with and test my Terraform workspaces, so don’t take it as a reference for serious work.

You configure Terraform itself using the terraform keyword, and you can say: “for Terraform, I want my backend to be S3, and the bucket for S3 needs to be this one.”

And basically, you say where you want your state file to be. It’s as simple as that. Once you have re-run terraform init so that Terraform picks up the new backend, the next terraform apply will use a temporary state file locally and then upload it to your S3 bucket.

And then, each time you want to work on it, it’s going to use this one.

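terraform {
  # Store the state in an S3 bucket instead of a local file.
  # The bucket's region must also be provided, either with a `region` argument
  # or through the AWS_REGION / AWS_DEFAULT_REGION environment variables.
  backend "s3" {
    bucket = "cs-tfstates-demo-sj-frankfurt-1"
    key    = "tfstates/terraform.tfstate" # path of the state file inside the bucket
  }
}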

Last tip: don’t forget to add a lock as well, stored in a database (with the S3 backend, this is typically a DynamoDB table). That way, as long as the lock is held, two people can’t run a destructive action or a Terraform apply at the same time.
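As a rough sketch of what this locking setup can look like with the S3 backend, the example below adds a DynamoDB table to the backend block shown earlier. The table name and the region are assumptions made for the example. The table itself needs a string partition key named LockID, and it has to exist before the backend is initialized, so it is usually created once from a separate configuration.

```hcl
terraform {
  backend "s3" {
    bucket         = "cs-tfstates-demo-sj-frankfurt-1"
    key            = "tfstates/terraform.tfstate"
    region         = "eu-central-1"    # assumption, guessed from the bucket name
    dynamodb_table = "terraform-locks" # hypothetical table name
    encrypt        = true              # encrypt the state at rest, since it can contain secrets
  }
}

# Lock table, ideally managed in a separate configuration.
# The S3 backend expects a partition key named "LockID" of type string.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks" # hypothetical, must match dynamodb_table above
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

With this in place, a second apply started while the first one is still running should fail with a state lock error instead of racing on the same state file. The locking options of the S3 backend have changed over time, so check the backend documentation for the Terraform version you use.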
