Terraform Explained: State, Providers, and the Plan/Apply Workflow

Terraform is the tool that lets you describe your cloud infrastructure as code and have it created, changed, and destroyed on demand: no clicking through a console, and a reliable way to catch drift between what you think is running and what actually is. This guide is Terraform explained from the ground up for developers: what problem it solves, how its three core ideas (state, providers, and the plan/apply workflow) fit together, and how to run it without blowing up production. It is the pillar page for the Devgains cloud cluster, and it pairs naturally with how you deploy containers to production and run them on Kubernetes.

Quick answer: what is Terraform?

Terraform is an Infrastructure as Code (IaC) tool from HashiCorp that provisions and manages infrastructure through declarative configuration files. You write what you want (a virtual machine, a database, a DNS record) in a language called HCL (HashiCorp Configuration Language), and Terraform works out how to make reality match, calling the underlying cloud APIs for you.

Three concepts do all the work:

Providers — plugins that teach Terraform how to talk to a platform (AWS, Azure, GCP, Kubernetes, Cloudflare, GitHub, and hundreds more).
State — a JSON file that records what Terraform has already created, mapping your config to real resource IDs.
The plan/apply workflow — Terraform compares your desired config against state, shows you the exact diff (plan), then executes only that diff (apply).

The one-line mental model: Terraform is a diff engine for infrastructure. You declare the end state; it calculates and applies the difference.

Why Terraform matters

Before IaC, infrastructure lived in people's heads and in consoles. Someone clicked a database into existence at 2am during an incident, nobody wrote it down, and six months later no one could reproduce the environment. Terraform makes infrastructure versioned, reviewable, and repeatable:

Version control. Your infrastructure lives in Git next to your app. Every change is a diff, reviewed in a pull request like any other code.
Reproducibility. The same config spins up identical dev, staging, and prod environments. "It works on my cluster" stops being a mystery.
Multi-cloud with one workflow. The same plan/apply loop provisions AWS, Azure, a Kubernetes cluster, and a Cloudflare DNS record — you learn one tool, not five consoles.
Auditability. State plus Git history tells you what exists and when it changed.

This is the same instinct behind putting your CI pipeline in code and your deployments in a repeatable pipeline: if it isn't in version control, it doesn't really exist.

How Terraform works: the architecture

Terraform's core is a loop between three things — your configuration, the state file, and the real world (the cloud provider's API).

Configuration (desired state). Your .tf files declare the resources you want.
State (known state). terraform.tfstate records what Terraform believes it has already created, including real resource IDs and attributes.
Refresh + plan. Terraform reads the current real-world state via provider APIs, compares it to your config, and produces a plan: the set of create/update/delete actions needed to reconcile the two.
Apply. Terraform executes the plan through providers, then writes the new reality back into state.

This is a reconciliation loop, the same idea that powers the Kubernetes control plane: declare the target, let the tool converge on it. The difference is that Terraform runs the loop on-demand (when you run apply) rather than continuously.

Providers are the plugins that make this real. Each provider wraps a platform's API and exposes it as Terraform resources (aws_instance, azurerm_resource_group, kubernetes_deployment) and data sources (read-only lookups). Terraform downloads the providers your config needs into .terraform/ on terraform init.
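
To make the resource versus data source distinction concrete, here is a minimal sketch. It assumes the azurerm provider is already configured (as in the main.tf example later in this guide), and both resource group names are hypothetical placeholders, not names from a real environment.

```hcl
# Data source: a read-only lookup of a resource group that already exists.
# Terraform never creates, changes, or deletes it.
# "devgains-shared-rg" is a hypothetical name used for illustration.
data "azurerm_resource_group" "shared" {
  name = "devgains-shared-rg"
}

# Resource: Terraform creates this object, records it in state, and owns its lifecycle.
# It reuses the region of the looked-up group instead of hard-coding one.
resource "azurerm_resource_group" "lookup_demo" {
  name     = "devgains-lookup-demo-rg" # hypothetical name
  location = data.azurerm_resource_group.shared.location
}
```

The reference data.azurerm_resource_group.shared.location also tells Terraform about ordering: the lookup has to complete before the new resource group can be planned.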

Step-by-step: your first Terraform workflow

Here is the full loop on a minimal example. First, declare a provider and a resource. This config creates an Azure resource group — the simplest thing to provision:

# main.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}
 
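provider "azurerm" {
  features {}
  # azurerm 4.x requires a subscription ID: set subscription_id here
  # or export ARM_SUBSCRIPTION_ID before running terraform plan.
}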
 
resource "azurerm_resource_group" "app" {
  name     = "devgains-prod-rg"
  location = "westeurope"
  tags = {
    environment = "production"
    managed_by  = "terraform"
  }
}

Now run the workflow. Each command maps to one stage of the loop above:

# 1. Download the azurerm provider and set up the working dir.
terraform init
 
# 2. Show the diff: what will change, without touching anything.
terraform plan
 
# 3. Apply the diff after you approve it.
terraform apply
 
# ...later, tear it all down.
terraform destroy

terraform plan prints a color-coded diff — + to create, ~ to change in place, - to destroy, and -/+ to replace (destroy then recreate). Read this diff every time. It is the single most important safety feature Terraform gives you: a -/+ on your production database is a warning you do not want to skip. Once you approve, apply calls the Azure API, creates the resource group, and records its ID in terraform.tfstate.
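
As a rough illustration of how edits map to those symbols, consider this edited version of the resource block from main.tf (it replaces the original block rather than sitting next to it). The owner tag is a hypothetical addition. In the azurerm provider, a resource group's name and location cannot be changed in place, so editing either one should show up as a replacement, while a tag edit should show up as an in-place update.

```hcl
resource "azurerm_resource_group" "app" {
  # Changing name or location forces a new resource group:
  # the plan marks the resource with -/+ (destroy, then recreate).
  name     = "devgains-prod-rg"
  location = "westeurope"

  tags = {
    environment = "production"
    managed_by  = "terraform"
    # Hypothetical new tag: tags can be updated in place,
    # so the plan marks the resource with ~ and nothing is recreated.
    owner = "platform-team"
  }
}
```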

Managing state safely: use a remote backend

The default is a local terraform.tfstate file, which is fine for a solo tutorial and dangerous for a team. Two people running apply from separate local state files each hold a different picture of what exists, so they end up creating duplicate resources or overwriting each other's changes. The fix is a remote backend with state locking, so state is shared and only one apply can run at a time:

# backend.tf — store state in Azure Blob Storage with automatic locking.
terraform {
  backend "azurerm" {
    resource_group_name  = "devgains-tfstate-rg"
    storage_account_name = "devgainstfstate"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
  }
}

With a remote backend, state lives in one place (Azure Blob Storage, an S3 bucket with a DynamoDB lock table, or HCP Terraform, formerly Terraform Cloud), every team member and CI job reads the same truth, and a lock prevents concurrent applies from racing. This is the first thing to set up on any real project.
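
For teams on AWS, the S3 option mentioned above might look like the following sketch. The bucket, key, region, and table names are hypothetical placeholders, and both the bucket and the lock table are assumed to exist already.

```hcl
# backend.tf, S3 variant. All names below are hypothetical placeholders.
terraform {
  backend "s3" {
    bucket  = "devgains-tfstate"       # existing S3 bucket that holds the state object
    key     = "prod/terraform.tfstate" # path of the state object inside the bucket
    region  = "eu-west-1"
    encrypt = true                     # server-side encryption of the state object

    # DynamoDB table used for state locking; it needs a string partition key named "LockID".
    dynamodb_table = "devgains-tfstate-lock"
  }
}
```

Recent Terraform releases also offer S3-native locking through the use_lockfile argument, so check the S3 backend documentation for the version you run. After adding or changing a backend block, rerun terraform init so Terraform can configure the new backend.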

Declarative vs imperative: how Terraform compares

Terraform isn't the only way to manage infrastructure. Here's where it sits:

Approach	Model	You specify	Drift handling	Example
Terraform	Declarative	The end state	Detected via `plan` against state	`resource "aws_instance"`
Shell / CLI scripts	Imperative	Each step, in order	None — you script it yourself	`az vm create ...`
Ansible	Mostly imperative (procedural)	Tasks to run	Idempotent modules, no state file	`- name: create VM`
CloudFormation / ARM/Bicep	Declarative	The end state	Cloud-native tooling (e.g. CloudFormation drift detection), single-cloud	AWS/Azure-native templates

Terraform's edge is being declarative and cloud-agnostic: one language and one workflow across providers, with an explicit state file that makes drift visible. The trade-off is that state file — it's power and responsibility, which is why the mistakes below almost all trace back to it.

Best practices

Always run plan before apply. Read the diff. In CI, run plan on the pull request and apply only after a human approves the merge.
Use a remote backend with locking from day one. Local state on a shared project is a data-loss incident waiting to happen.
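Pin provider and module versions. Use ~> constraints and commit the .terraform.lock.hcl lock file so every machine resolves identical provider versions, the same discipline as a language lockfile. The lock file covers providers only, so pin modules separately through the version argument (registry modules) or a Git ref in the source address.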
Never edit infrastructure by hand. Manual console changes create drift that the next apply will try to undo. If Terraform manages it, change it only through Terraform.
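Keep secrets out of config and treat state as sensitive. State stores resource attributes in plaintext, and that includes secrets Terraform reads from a vault or key manager through a data source. Pull secrets from a vault/key manager at apply time rather than hard-coding them, encrypt the backend at rest, and restrict who can read it.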
Modularize. Wrap repeated patterns (a "web service", a "database") in reusable modules with input variables, so environments differ only by their inputs.
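
The modularization advice above can be sketched as a small module whose only environment-specific parts are its inputs. The module path modules/resource_group/ and the variable names are hypothetical, chosen for illustration only.

```hcl
# modules/resource_group/main.tf (hypothetical module path)

variable "environment" {
  type        = string
  description = "Environment name: dev, staging, or prod."

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment value must be dev, staging, or prod."
  }
}

variable "location" {
  type        = string
  description = "Azure region for the resource group."
  default     = "westeurope"
}

resource "azurerm_resource_group" "this" {
  name     = "devgains-${var.environment}-rg"
  location = var.location

  tags = {
    environment = var.environment
    managed_by  = "terraform"
  }
}
```

A root configuration would then call the module once per environment and pass only the inputs that differ:

```hcl
# Root module (caller). The source path matches the hypothetical module above.
module "resource_group" {
  source      = "./modules/resource_group"
  environment = "staging"
}
```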

Common mistakes

Committing terraform.tfstate to Git. It can contain secrets and will cause merge conflicts that corrupt state. Use a remote backend and .gitignore local state.
Ignoring the plan output. Blindly typing yes is how a one-line change to an immutable attribute turns into a -/+ that recreates your database. The diff told you; you didn't read it.
Editing state by hand. Hand-editing terraform.tfstate is almost never right. Use terraform state mv, terraform state rm, or terraform import instead.
No locking. Two concurrent applies with no lock will interleave writes and corrupt state.
Giant monolithic state. One state file for the whole company means every change locks everything and blast radius is huge. Split state by environment and by bounded context.
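Using terraform destroy casually. destroy deletes every resource tracked in the state, and for data-bearing resources that is irreversible. Guard critical resources with prevent_destroy lifecycle rules, which make any plan that would destroy them fail. The guard only works while the resource block stays in the configuration; removing the block removes the protection with it.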
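
Two of the fixes above can be expressed directly in configuration. The sketch below assumes Terraform 1.1 or later (for the moved block) and uses a hypothetical rename of the resource group from "app" to "main" to show a declarative alternative to terraform state mv, together with a prevent_destroy guard.

```hcl
# Renamed from "app" to "main" (a hypothetical rename, for illustration).
resource "azurerm_resource_group" "main" {
  name     = "devgains-prod-rg"
  location = "westeurope"

  lifecycle {
    # Any plan that would destroy this resource fails with an error
    # for as long as this block stays in the configuration.
    prevent_destroy = true
  }
}

# Tells Terraform that the object tracked as azurerm_resource_group.app is now
# azurerm_resource_group.main, so the rename is planned as a move
# instead of a destroy followed by a create.
moved {
  from = azurerm_resource_group.app
  to   = azurerm_resource_group.main
}
```

Because the move is recorded in code, it goes through the same pull request review and plan output as any other change, which is harder to guarantee with one-off state commands.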

Key takeaways

Terraform is a declarative diff engine for infrastructure: you declare the end state, it reconciles reality to match.
Providers talk to platforms, state records what exists, and plan/apply is the loop that turns config into infrastructure.
The plan diff is your safety net — read it, especially -/+ (replace) lines.
Use a remote backend with locking on any team project; never commit local state.
Pin versions, avoid manual changes (drift), and keep secrets out of state.

FAQ

How does Terraform work? Terraform reads your declarative .tf configuration, compares it against a state file that records what it has already created, refreshes the real-world status through provider APIs, and produces a plan of create/update/delete actions. When you approve, it applies that plan by calling the cloud APIs and writes the result back to state.

What is Terraform state and why does it matter? State is a JSON file mapping your configuration to real resource IDs. It's how Terraform knows a given resource block already corresponds to an existing VM, so it can update rather than recreate it. Without state, Terraform couldn't tell "create new" from "modify existing."

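What is the difference between terraform plan and terraform apply? plan is a dry run: it computes and shows the diff without changing anything. apply executes a diff against your infrastructure. Run on its own, apply computes a fresh plan and asks for approval before acting; to apply exactly the plan you reviewed, save it with terraform plan -out=tfplan and then run terraform apply tfplan. Either way, review the plan before anything is applied.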

Is Terraform better than Ansible or CloudFormation? They solve overlapping but different problems. Terraform is declarative and cloud-agnostic with an explicit state file, ideal for provisioning infrastructure. Ansible excels at configuring servers. CloudFormation/Bicep are cloud-native alternatives locked to AWS/Azure. Many teams use Terraform to provision and Ansible to configure.

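Is Terraform free? The Terraform CLI is free to download and use. It was open source (MPL 2.0) until August 2023, when HashiCorp relicensed it under the Business Source License (BSL), which is source-available rather than open source. That change prompted the community OpenTofu fork, an MPL-licensed alternative designed as a drop-in replacement. HashiCorp also sells HCP Terraform (formerly Terraform Cloud) and Terraform Enterprise for teams (remote state, policy, run management).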

Conclusion

Terraform turns infrastructure into something you can review, version, and reproduce — the same leverage version control gave application code. Once the three ideas click — providers talk to platforms, state remembers what exists, and plan/apply reconciles the two — the rest is detail. Set up a remote backend, read every plan, keep changes in Git, and your infrastructure stops being a fragile artifact of who clicked what and becomes just more code. From here the cloud cluster goes deeper into Terraform modules, managing multiple environments, and CI/CD for infrastructure. Browse the full cloud category and the broader DevOps guides to continue.
