Setting Up Terraform Backend with S3: A Step-by-Step Guide
Summary
This article provides a detailed step-by-step guide to setting up a Terraform backend with Amazon S3. The guide is aimed at those who are new to Terraform and want to learn how to use it to manage their infrastructure in the cloud. The article explains what Terraform is, the benefits of using Amazon S3 as a backend for Terraform, and provides a clear walkthrough of the steps required to set up the Terraform backend with S3.
What is Terraform?
Terraform is an open-source infrastructure as code (IaC) tool developed by HashiCorp that enables users to define and provision their infrastructure through code. Terraform uses a declarative language to describe infrastructure resources and their dependencies, allowing for the creation, modification, and destruction of resources in a safe and predictable manner. With Terraform, users can manage infrastructure resources across various cloud providers and on-premises data centers using a unified workflow. Terraform’s ability to manage infrastructure as code can greatly simplify the process of infrastructure provisioning and management, allowing for more efficient and consistent deployments.
What is Terraform State?
Terraform state is a snapshot of the resources and their configurations that Terraform manages in a particular infrastructure. It represents the current state of the infrastructure that Terraform is aware of, and it is stored locally on the Terraform user’s machine or remotely in a backend. Terraform uses the state to plan and apply changes to the infrastructure. The state is critical to Terraform’s functionality, as it helps Terraform determine the actions required to achieve the desired infrastructure configuration and ensure that the infrastructure is in the desired state.
When you run terraform apply or terraform destroy, Terraform creates, modifies, or deletes infrastructure resources in your cloud provider based on the code you've written in your Terraform configuration files. It needs to keep track of what it has created, what changes have been made, and what needs to be updated or destroyed if changes are made to the configuration.
Terraform keeps track of all this information in a file called the “state file”. The state file is a JSON file that records the attributes and metadata of each resource that Terraform has created. It’s very important to keep this file up-to-date, as it’s the source of truth for Terraform’s knowledge of the state of your infrastructure.
By default, Terraform stores the state file locally in a file called terraform.tfstate. This is not an ideal solution because you might work with a team or multiple machines, and having the state file locally can create conflicts between different contributors.
To solve this problem, Terraform supports “remote state” backends, which store the state file remotely on a central storage system that is accessible to all members of the team. This can be achieved by using remote backends like Amazon S3, Azure Blob Storage, Google Cloud Storage, or HashiCorp’s own Terraform Cloud.
By using remote state, you can ensure that all members of the team are working with the same version of the state file, reducing the risk of conflicts and errors. It also enables better collaboration between team members and, when the backend keeps a version history, lets you review or recover earlier versions of the state.
Let’s Begin
The initial step is to create an AWS S3 bucket to store our Terraform state. To do so, open the S3 service in the AWS Console and select the “Buckets” section.
Then, click on “Create Bucket” to initiate the process of creating a new S3 bucket, and follow the wizard’s instructions to complete the creation of the new S3 bucket.
Ensure that the name is globally unique and that versioning is enabled.
Also, enable KMS encryption for the bucket before clicking on “Create Bucket”.
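If you would rather define the bucket in code than click through the console, the same settings can be sketched in Terraform. This is a minimal sketch, assuming version 4 or later of the AWS provider. The bucket name and the resource label terraform_state are placeholders, not values from this guide. A bucket like this usually lives in a small separate configuration, since it must exist before any backend can point at it.

```hcl
# Sketch only: the bucket name is a placeholder and must be globally unique.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state-bucket-example"
}

# Keep every revision of the state file so an earlier version can be recovered.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt objects at rest with SSE-KMS. With no key specified,
# the AWS managed key for S3 is used.
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```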
Once you have created the bucket, click on the newly created bucket, then select the “Permissions” tab, and proceed to edit the bucket policy.
Next, add the following bucket policy and save it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "<your_user_arn>"
      },
      "Action": "s3:ListBucket",
      "Resource": "<your_bucket_arn>"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "<your_user_arn>"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "<your_bucket_arn>/*"
    }
  ]
}
Be sure to replace <your_user_arn> and <your_bucket_arn> with the correct values. You can obtain your user ARN by running the command aws sts get-caller-identity in the command line.
$ aws sts get-caller-identity
{
  "UserId": "1123242342324",
  "Account": "2342342432423",
  "Arn": "arn:aws:iam::1123242342324:root"
}
You can find the ARN of your bucket in the properties tab of the S3 bucket.
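The same bucket policy can also be expressed in Terraform instead of being pasted into the console. The sketch below assumes the standard aws partition for the bucket ARN. The variable names state_bucket_name and terraform_user_arn are hypothetical inputs introduced for illustration. The permissions mirror the JSON policy above and nothing more.

```hcl
# Hypothetical inputs: the bucket name and the ARN of the user running Terraform.
variable "state_bucket_name" {
  type = string
}

variable "terraform_user_arn" {
  type = string
}

locals {
  # Assumes the standard "aws" partition.
  state_bucket_arn = "arn:aws:s3:::${var.state_bucket_name}"
}

data "aws_iam_policy_document" "terraform_state" {
  # Listing applies to the bucket itself.
  statement {
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = [local.state_bucket_arn]

    principals {
      type        = "AWS"
      identifiers = [var.terraform_user_arn]
    }
  }

  # Reading and writing apply to the objects inside the bucket.
  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["${local.state_bucket_arn}/*"]

    principals {
      type        = "AWS"
      identifiers = [var.terraform_user_arn]
    }
  }
}

# Attach the rendered JSON policy to the bucket.
resource "aws_s3_bucket_policy" "terraform_state" {
  bucket = var.state_bucket_name
  policy = data.aws_iam_policy_document.terraform_state.json
}
```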
When using an S3 bucket as a backend, collaboration is improved because team members share a single state file instead of passing local copies around or committing them to git. However, what happens when two or more people attempt to modify the state file at the same time? This is where state locking comes in. State locking prevents other write operations to your state file while one write operation is in progress. I will not cover this in depth, but you can read more about it in the official documentation. In our case, we will be using a DynamoDB table to lock our state.
Search for “Dynamo” in the AWS console.
Head over to the DynamoDB console and create a new table.
You can use any table name of your choice, but the partition key must be named “LockID” (exactly that spelling) with type String. Leave the remaining settings as default, and then click on “Create table”.
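For reference, a lock table with the same shape can be sketched in Terraform. The table name mytable is only an example. The resource label terraform_locks and on-demand billing are assumptions made for this sketch, not settings this guide prescribes. The part that matters is a string partition key named LockID.

```hcl
# Sketch of a lock table. Name and billing mode are illustrative choices.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "mytable"
  billing_mode = "PAY_PER_REQUEST" # assumption: on-demand capacity
  hash_key     = "LockID"          # must be exactly "LockID"

  attribute {
    name = "LockID"
    type = "S" # string
  }
}
```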
Now that you have created an S3 bucket to store your Terraform states and a DynamoDB table for state locking, it’s time to configure Terraform to use both the bucket and the table you created to store its states.
terraform {
  backend "s3" {
    bucket         = "mybucket"
    key            = "path/to/my/key"
    region         = "us-east-1"
    dynamodb_table = "mytable"
  }
}
This code is part of a Terraform configuration file and it specifies the backend configuration for storing the Terraform state in an S3 bucket. The backend in this case is the “s3” backend which tells Terraform to use Amazon S3 to store the state data.
The four parameters specified within the backend block are:
bucket: The name of the S3 bucket where Terraform will store the state.
key: The path to the file within the bucket where Terraform will store the state.
region: The AWS Region where the S3 bucket is located.
dynamodb_table: The name of the DynamoDB table that Terraform will use for state locking.
This code tells Terraform to use an S3 bucket called “mybucket” in the “us-east-1” region to store the state file located at “path/to/my/key”, and to use the DynamoDB table “mytable” for state locking. When Terraform runs, it will use this backend configuration to store and retrieve the state data.
There are many more configuration options available, which you can explore in the S3 backend section of the official Terraform documentation.
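As one example of such an option, the S3 backend accepts an encrypt argument that asks for the state file to be encrypted at rest on the server side. The sketch below reuses the placeholder values from the block above and adds only that one argument. Whether you need it depends on how your bucket's default encryption is already set up.

```hcl
terraform {
  backend "s3" {
    bucket         = "mybucket"
    key            = "path/to/my/key"
    region         = "us-east-1"
    dynamodb_table = "mytable"
    encrypt        = true # request server-side encryption for the state object
  }
}
```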
You can now use the terraform init command to initialize your Terraform configuration to use the S3 bucket for storing its states.
Additionally, if you prefer not to hard-code the S3 bucket settings in the Terraform file, you can leave the backend "s3" block empty and pass any of the settings through the command line interface (CLI) as shown below:
terraform init -backend-config="bucket=mybucket" \
  -backend-config="key=path/to/my/key" \
  -backend-config="region=us-east-1"
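The -backend-config flag also accepts a path to a file, which can be tidier than repeating the flag for every setting. The file name backend.hcl below is a hypothetical choice. The values are the same placeholders used throughout this guide. The Terraform configuration still needs to declare a backend "s3" block, which can be empty, for these settings to be merged into.

```hcl
# backend.hcl (hypothetical file name), passed at init time with:
#   terraform init -backend-config=backend.hcl
bucket         = "mybucket"
key            = "path/to/my/key"
region         = "us-east-1"
dynamodb_table = "mytable"
```

Keeping these values in a separate file makes it easier to reuse one configuration across environments by swapping the file passed to terraform init.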
Conclusion
In conclusion, setting up a Terraform backend with Amazon S3 provides an efficient and reliable solution for storing and managing your Terraform state. By following the step-by-step guide we have provided, you can easily configure Terraform to use an S3 bucket for storing your state data and a DynamoDB table for locking it. With this setup, you can enjoy the benefits of versioning, encryption, and easy collaboration with your team. By leveraging the power of AWS and Terraform, you can streamline your infrastructure management and deployment processes and focus on delivering value to your customers.