Terraform Explained: State, Providers, and the Plan/Apply Workflow
Terraform lets you describe your cloud infrastructure as code and have it created, changed, and destroyed on demand: no clicking through a console, and a reliable way to catch drift between what you think is running and what actually is. This guide explains Terraform from the ground up for developers: what problem it solves, how its three core ideas (state, providers, and the plan/apply workflow) fit together, and how to run it without blowing up production. It is the pillar page for the Devgains cloud cluster, and it pairs naturally with how you deploy containers to production and run them on Kubernetes.
Quick answer: what is Terraform?
Terraform is an Infrastructure as Code (IaC) tool from HashiCorp that provisions and manages infrastructure through declarative configuration files. It was open source until its 2023 relicensing (see the FAQ). You write what you want (a virtual machine, a database, a DNS record) in a language called HCL (HashiCorp Configuration Language), and Terraform works out how to make reality match, calling the underlying cloud APIs for you.
Three concepts do all the work:
- Providers: plugins that teach Terraform how to talk to a platform (AWS, Azure, GCP, Kubernetes, Cloudflare, GitHub, and hundreds more).
- State: a JSON file that records what Terraform has already created, mapping your config to real resource IDs.
- The plan/apply workflow: Terraform compares your desired config against state, shows you the exact diff (plan), then executes only that diff (apply).
The one-line mental model: Terraform is a diff engine for infrastructure. You declare the end state; it calculates and applies the difference.
Why Terraform matters
Before IaC, infrastructure lived in people's heads and in consoles. Someone clicked a database into existence at 2am during an incident, nobody wrote it down, and six months later no one could reproduce the environment. Terraform makes infrastructure versioned, reviewable, and repeatable:
- Version control. Your infrastructure lives in Git next to your app. Every change is a diff, reviewed in a pull request like any other code.
- Reproducibility. The same config spins up identical dev, staging, and prod environments. "It works on my cluster" stops being a mystery.
- Multi-cloud with one workflow. The same plan/apply loop provisions AWS, Azure, a Kubernetes cluster, and a Cloudflare DNS record: you learn one tool, not five consoles.
- Auditability. State plus Git history tells you what exists and when it changed.
This is the same instinct behind putting your CI pipeline in code and your deployments in a repeatable pipeline: if it isn't in version control, it doesn't really exist.
How Terraform works: the architecture
Terraform's core is a loop between three things — your configuration, the state file, and the real world (the cloud provider's API).
- Configuration (desired state). Your .tf files declare the resources you want.
- State (known state). terraform.tfstate records what Terraform believes it has already created, including real resource IDs and attributes.
- Refresh + plan. Terraform reads the current real-world state via provider APIs, compares it to your config, and produces a plan: the set of create/update/delete actions needed to reconcile the two.
- Apply. Terraform executes the plan through providers, then writes the new reality back into state.
This is a reconciliation loop, the same idea that powers the Kubernetes control plane: declare the target, let the tool converge on it. The difference is that Terraform runs the loop on demand (when you run apply) rather than continuously.
Providers are the plugins that make this real. Each provider wraps a platform's API and exposes it
as Terraform resources (aws_instance, azurerm_resource_group, kubernetes_deployment) and
data sources (read-only lookups). Terraform downloads the providers your config needs into
.terraform/ on terraform init.
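To make the distinction between resources and data sources concrete, here is a minimal sketch. It assumes the azurerm provider is already configured and that a resource group called devgains-shared-rg exists outside this configuration; both names below are hypothetical.

```hcl
# Data source: a read-only lookup of a resource group that this
# configuration does not manage. The name is a hypothetical example
# and is assumed to exist already.
data "azurerm_resource_group" "shared" {
  name = "devgains-shared-rg"
}

# Resource: something Terraform creates, updates, and tracks in state.
# It reuses the name and location returned by the lookup above.
resource "azurerm_virtual_network" "app" {
  name                = "devgains-app-vnet" # hypothetical name
  resource_group_name = data.azurerm_resource_group.shared.name
  location            = data.azurerm_resource_group.shared.location
  address_space       = ["10.0.0.0/16"]
}
```

The data block only reads: Terraform never creates or deletes that resource group. The resource block is what gets created and recorded in state.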
Step-by-step: your first Terraform workflow
Here is the full loop on a minimal example. First, declare a provider and a resource. This config creates an Azure resource group — the simplest thing to provision:

    # main.tf
    terraform {
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "~> 4.0"
        }
      }
    }

    provider "azurerm" {
      # azurerm 4.x needs a subscription ID: set subscription_id here
      # or export ARM_SUBSCRIPTION_ID before running Terraform.
      features {}
    }

    resource "azurerm_resource_group" "app" {
      name     = "devgains-prod-rg"
      location = "westeurope"

      tags = {
        environment = "production"
        managed_by  = "terraform"
      }
    }

Now run the workflow. Each command maps to one stage of the loop above:

    # 1. Download the azurerm provider and set up the working dir.
    terraform init

    # 2. Show the diff: what will change, without touching anything.
    terraform plan

    # 3. Apply the diff after you approve it.
    terraform apply

    # ...later, tear it all down.
    terraform destroy

terraform plan prints a color-coded diff: + to create, ~ to change in place, - to destroy, and -/+ to replace (destroy then recreate). Read this diff every time. It is the single most important safety feature Terraform gives you: a -/+ on your production database is a warning you do not want to skip. Once you approve, apply calls the Azure API, creates the resource group, and records its ID in terraform.tfstate.
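To illustrate why the symbols matter, consider two hypothetical edits to the resource group from main.tf. This sketch assumes the azurerm provider's documented behavior: tags can be updated in place, while the location of an existing resource group cannot be changed.

```hcl
resource "azurerm_resource_group" "app" {
  name = "devgains-prod-rg"

  # Hypothetical edit 1: location changed from "westeurope".
  # The provider treats location as immutable, so plan is expected to
  # mark the resource -/+ (destroy, then recreate).
  location = "northeurope"

  tags = {
    environment = "production"
    managed_by  = "terraform"

    # Hypothetical edit 2: a new tag. Made on its own, this edit is
    # expected to show as ~ (update in place).
    owner = "platform-team"
  }
}
```

Both edits are one line each, but only one of them deletes the resource. The plan output is where that difference becomes visible.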
Managing state safely: use a remote backend
The default is a local terraform.tfstate file, which is fine for a solo tutorial and dangerous for a team. Two people running apply against separate local state files each hold a different picture of what exists, so one person's apply can try to recreate resources the other already created, or overwrite the other's changes. The fix is a remote backend with state locking, so state is shared and only one apply can run at a time:

    # backend.tf: store state in Azure Blob Storage with automatic locking.
    terraform {
      backend "azurerm" {
        resource_group_name  = "devgains-tfstate-rg"
        storage_account_name = "devgainstfstate"
        container_name       = "tfstate"
        key                  = "prod.terraform.tfstate"
      }
    }

With a remote backend, state lives in one place (Azure Blob, an S3 bucket + DynamoDB lock table, or Terraform Cloud), every team member and CI job reads the same truth, and a lock prevents concurrent applies from racing. This is the first thing to set up on any real project.
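For teams on AWS, the S3 bucket + DynamoDB lock table option looks roughly like this. The bucket, table, and region values are hypothetical, and both the bucket and the table are assumed to exist before terraform init runs. Newer Terraform releases also offer S3-native lock files, so check the backend documentation for your version.

```hcl
# backend.tf: a sketch of remote state on AWS. All names are hypothetical.
terraform {
  backend "s3" {
    bucket = "devgains-tfstate"       # pre-existing S3 bucket
    key    = "prod/terraform.tfstate" # path of the state object in the bucket
    region = "eu-west-1"

    # Pre-existing DynamoDB table used for state locking.
    # It needs a string partition key named LockID.
    dynamodb_table = "devgains-tf-locks"

    # Ask S3 to encrypt the state object at rest.
    encrypt = true
  }
}
```

After adding or changing a backend block, run terraform init again so Terraform can switch to the new backend and, if you confirm, migrate existing state into it.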
Declarative vs imperative: how Terraform compares
Terraform isn't the only way to manage infrastructure. Here's where it sits:
| Approach | Model | You specify | Drift handling | Example |
|---|---|---|---|---|
| Terraform | Declarative | The end state | Detected via plan against state | resource "aws_instance" |
| Shell / CLI scripts | Imperative | Each step, in order | None — you script it yourself | az vm create ... |
| Ansible | Mostly imperative (procedural) | Tasks to run | Idempotent modules, no state file | - name: create VM |
| CloudFormation / ARM/Bicep | Declarative | The end state | Managed by the cloud, single-cloud | AWS/Azure-native templates |
Terraform's edge is being declarative and cloud-agnostic: one language and one workflow across providers, with an explicit state file that makes drift visible. The trade-off is the state file itself: it brings both power and responsibility, which is why the mistakes below almost all trace back to it.
Best practices
- Always run plan before apply. Read the diff. In CI, run plan on the pull request and apply only after a human approves the merge.
- Use a remote backend with locking from day one. Local state on a shared project is a data-loss incident waiting to happen.
- Pin provider and module versions. Use ~> constraints and commit the .terraform.lock.hcl lock file so every machine resolves identical provider versions, the same discipline as a language lockfile.
- Never edit infrastructure by hand. Manual console changes create drift that the next apply will try to undo. If Terraform manages it, change it only through Terraform.
- Keep secrets out of state and config. State stores resource attributes in plaintext. Pull secrets from a vault/key manager at apply time and encrypt the backend at rest. Values Terraform reads or writes can still end up in state, so treat the state itself as sensitive and restrict who can read it.
- Modularize. Wrap repeated patterns (a "web service", a "database") in reusable modules with input variables, so environments differ only by their inputs.
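The modularize advice can be sketched as follows. The layout (a local module under ./modules/resource_group) and every variable name are assumptions for illustration; two files are shown in one listing, separated by comments.

```hcl
# --- modules/resource_group/main.tf (hypothetical module) ---
variable "environment" {
  type        = string
  description = "Environment name, for example staging or production."
}

variable "location" {
  type    = string
  default = "westeurope"
}

resource "azurerm_resource_group" "this" {
  name     = "devgains-${var.environment}-rg"
  location = var.location

  tags = {
    environment = var.environment
    managed_by  = "terraform"
  }
}

output "name" {
  value = azurerm_resource_group.this.name
}

# --- main.tf in the root configuration ---
# The same module is instantiated twice; only the inputs differ.
module "rg_staging" {
  source      = "./modules/resource_group"
  environment = "staging"
}

module "rg_production" {
  source      = "./modules/resource_group"
  environment = "production"
}
```

Both instances sit in one root configuration here only to keep the sketch short. In practice each environment would usually get its own root configuration and its own state, calling the same module.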
Common mistakes
- Committing terraform.tfstate to Git. It can contain secrets and will cause merge conflicts that corrupt state. Use a remote backend and add local state files to .gitignore.
- Ignoring the plan output. Blindly typing yes is how a one-line change to an attribute that forces replacement turns into a -/+ that recreates your database. The diff told you; you didn't read it.
- Editing state by hand. Hand-editing terraform.tfstate is almost never right. Use terraform state mv, terraform state rm, or terraform import instead.
- No locking. Two concurrent applies with no lock will interleave writes and corrupt state.
- Giant monolithic state. One state file for the whole company means every change locks everything and blast radius is huge. Split state by environment and by bounded context.
- Using terraform destroy casually. Destroy deletes real resources and their data, and Terraform cannot undo it. Guard critical resources with prevent_destroy lifecycle rules.
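The prevent_destroy guard from the last item might look like this, shown on the resource group from the earlier example. Treat it as a sketch: the choice of resource is only illustrative.

```hcl
resource "azurerm_resource_group" "app" {
  name     = "devgains-prod-rg"
  location = "westeurope"

  lifecycle {
    # Any plan that would destroy this resource fails with an error
    # instead of proceeding. That covers terraform destroy as well as
    # changes that would force a replacement.
    prevent_destroy = true
  }
}
```

The guard lives in configuration, so removing the whole resource block also removes the protection. It catches accidents, not deliberate deletions.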
Key takeaways
- Terraform is a declarative diff engine for infrastructure: you declare the end state, it reconciles reality to match.
- Providers talk to platforms, state records what exists, and plan/apply is the loop that turns config into infrastructure.
- The plan diff is your safety net: read it, especially -/+ (replace) lines.
- Use a remote backend with locking on any team project; never commit local state.
- Pin versions, avoid manual changes (drift), and keep secrets out of state.
FAQ
How does Terraform work?
Terraform reads your declarative .tf configuration, compares it against a state file that records
what it has already created, refreshes the real-world status through provider APIs, and produces a
plan of create/update/delete actions. When you approve, it applies that plan by calling the cloud
APIs and writes the result back to state.
What is Terraform state and why does it matter?
State is a JSON file mapping your configuration to real resource IDs. It's how Terraform knows a
given resource block already corresponds to an existing VM, so it can update rather than recreate
it. Without state, Terraform couldn't tell "create new" from "modify existing."
What is the difference between terraform plan and terraform apply?
plan is a dry run: it computes and shows the diff without changing anything. apply executes that
diff against your infrastructure. Always plan first, then apply.
Is Terraform better than Ansible or CloudFormation?
They solve overlapping but different problems. Terraform is declarative and cloud-agnostic with an explicit state file, ideal for provisioning infrastructure. Ansible excels at configuring servers. CloudFormation/Bicep are cloud-native alternatives locked to AWS/Azure. Many teams use Terraform to provision and Ansible to configure.
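Is Terraform free?
The Terraform CLI is free to download and use. HashiCorp also sells Terraform Cloud (since renamed HCP Terraform) and Terraform Enterprise for teams (remote state, policy, run management). Note that in 2023 HashiCorp relicensed Terraform from the open-source MPL 2.0 to the Business Source License (BSL), so newer releases are source-available rather than open source. That change prompted the community OpenTofu fork, an MPL-licensed alternative designed as a drop-in replacement.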
Conclusion
Terraform turns infrastructure into something you can review, version, and reproduce — the same leverage version control gave application code. Once the three ideas click — providers talk to platforms, state remembers what exists, and plan/apply reconciles the two — the rest is detail. Set up a remote backend, read every plan, keep changes in Git, and your infrastructure stops being a fragile artifact of who clicked what and becomes just more code. From here the cloud cluster goes deeper into Terraform modules, managing multiple environments, and CI/CD for infrastructure. Browse the full cloud category and the broader DevOps guides to continue.

