Kubernetes Architecture Explained: Control Plane, Nodes, and Pods

11 min read · Updated Jul 1, 2026

Kubernetes architecture is a control loop. You declare the state you want — "run three replicas of this image, expose it on port 8080" — and a set of cooperating components works continuously to make reality match that declaration. Everything else, from self-healing to rolling updates to autoscaling, falls out of that one idea. Understand the loop and the pieces that run it, and the rest of Kubernetes stops feeling like magic. This guide is the hub for the Devgains Kubernetes cluster: it gives you the whole system in one place, then points to the deep dives for the parts you'll touch most.

Most people meet Kubernetes as a pile of YAML and a kubectl command that either works or prints a wall of red. That framing hides the design. Underneath, Kubernetes is two planes — a control plane that decides what should run, and a set of worker nodes that actually run it — connected by a single source of truth and a handful of reconciliation loops. Once you can name the components and trace a request through them, debugging and capacity planning become concrete instead of superstitious.

Quick answer: what is the architecture of Kubernetes?

Kubernetes architecture has two layers:

  1. The control plane — the cluster's brain. It holds the desired state, makes scheduling decisions, and runs the reconciliation loops. Its main components are the API server, etcd, the scheduler, and the controller manager.
  2. The worker nodes — the machines that run your workloads. Each node runs a kubelet (the node agent), a kube-proxy (networking), and a container runtime such as containerd.

The smallest thing you deploy is a pod — one or more containers that share a network address and storage. You almost never create pods directly; higher-level objects like a Deployment create and manage them for you. The API server is the only component that talks to etcd; everything else talks to the API server.

Why it matters

You cannot operate what you cannot picture. When a pod is stuck in Pending, the answer is almost always "the scheduler couldn't place it" — but you only know to look there if you know the scheduler exists and what it does. When a rollout stalls, it's a controller waiting on readiness. When the whole cluster goes read-only, etcd is usually the story. The architecture is the troubleshooting map.

It also explains the properties teams actually buy Kubernetes for. Self-healing isn't a feature someone coded per-app; it's the reconciliation loop noticing a pod died and creating a replacement. Zero-downtime releases aren't luck; they're a controller adding ready pods before removing old ones. Because container orchestration, cloud infrastructure, and the observability tooling around it are among the highest-value topics in developer advertising, they're also some of the best-documented — nearly every component below has an authoritative reference.

Architecture: the two planes

Every Kubernetes cluster is the same shape, whether it's a laptop kind cluster or a 5,000-node fleet:

                 ┌──────────────── Control Plane ────────────────┐
   kubectl ─────▶│  API server ─▶ etcd (state)                    │
                 │      ▲          ▲                               │
                 │      │          │                               │
                 │  scheduler   controller-manager                │
                 └──────┬────────────────────────────────────────┘
                        │ (API server is the only front door)
        ┌───────────────┼───────────────┐
        ▼               ▼               ▼
   ┌─────────┐     ┌─────────┐     ┌─────────┐   Worker Nodes
   │ kubelet │     │ kubelet │     │ kubelet │
   │ runtime │     │ runtime │     │ runtime │
   │  Pods   │     │  Pods   │     │  Pods   │
   └─────────┘     └─────────┘     └─────────┘

Control plane components:

  • API server (kube-apiserver) — the front door. Every read and write goes through its REST API, which validates requests, enforces auth, and persists state. It is the only component that reads or writes etcd, which keeps the data model consistent.
  • etcd — a distributed key-value store holding the entire cluster state: every object, its spec, and its status. It's the single source of truth; lose it and you lose the cluster's memory.
  • Scheduler (kube-scheduler) — watches for pods with no assigned node and picks one, filtering on resource requests, taints/tolerations, affinity rules, and available capacity.
  • Controller manager (kube-controller-manager) — runs the reconciliation controllers (Deployment, ReplicaSet, Node, Job, and more). Each is a loop comparing desired vs. actual and acting to close the gap.
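
The scheduler's filter-then-score flow can be sketched in a few lines of Python. This is a toy model under stated assumptions, not kube-scheduler's code: the `Node` and `Pod` classes, their field names, and the "most CPU left over wins" scoring rule are all invented for illustration, and taints are modeled as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    free_cpu_m: int                      # unreserved CPU, in millicores
    free_mem_mi: int                     # unreserved memory, in MiB
    taints: Set[str] = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu_m: int                           # CPU request, in millicores
    mem_mi: int                          # memory request, in MiB
    tolerations: Set[str] = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Filter step: requests must fit and every taint must be tolerated."""
    fits = pod.cpu_m <= node.free_cpu_m and pod.mem_mi <= node.free_mem_mi
    tolerated = node.taints <= pod.tolerations
    return fits and tolerated


def schedule(pod: Pod, nodes: List[Node]) -> Optional[str]:
    """Score step: among feasible nodes, pick the one with the most CPU left over."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None                      # no node fits: the pod stays Pending
    best = max(candidates, key=lambda n: n.free_cpu_m - pod.cpu_m)
    best.free_cpu_m -= pod.cpu_m         # reserve the requested resources
    best.free_mem_mi -= pod.mem_mi
    return best.name


nodes = [
    Node("node-1", free_cpu_m=150, free_mem_mi=512),
    Node("node-2", free_cpu_m=400, free_mem_mi=512, taints={"gpu"}),
    Node("node-3", free_cpu_m=300, free_mem_mi=256),
]
web = Pod("web", cpu_m=100, mem_mi=128)
placements = [schedule(web, nodes) for _ in range(5)]
print(placements)  # ['node-3', 'node-3', 'node-1', None, None]
```

The last two placements are `None` even though node-2 has plenty of room: the pod does not tolerate its taint, so the filter step rejects it. That is the toy version of a pod stuck in Pending.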

Worker node components:

  • kubelet — the agent on every node. It watches the API server for pods assigned to its node and tells the container runtime to start them, then reports status back.
  • kube-proxy — programs the node's networking rules so a stable Service IP routes to the right pod IPs.
  • Container runtime (containerd or CRI-O) — the software that actually pulls images and runs containers.
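
To make the kube-proxy bullet concrete, here is a rough Python model of the mapping it keeps current: one stable Service IP in front of whichever pod IPs are ready right now. The `Pod` class, the `endpoints` helper, and every IP address are hypothetical. In a real cluster the control plane computes this list and kube-proxy turns it into node-level networking rules.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pod:
    ip: str
    labels: Dict[str, str]
    ready: bool


def endpoints(selector: Dict[str, str], pods: List[Pod]) -> List[str]:
    """IPs of ready pods whose labels include every selector key/value pair."""
    return [
        p.ip
        for p in pods
        if p.ready and all(p.labels.get(k) == v for k, v in selector.items())
    ]


SERVICE_IP = "10.96.0.10"                # hypothetical stable virtual IP
selector = {"app": "web"}

pods = [
    Pod("10.1.0.4", {"app": "web"}, ready=True),
    Pod("10.1.1.7", {"app": "web"}, ready=False),   # still starting up
    Pod("10.1.2.9", {"app": "db"}, ready=True),     # different app, never selected
]
before = endpoints(selector, pods)

# The first pod is replaced (new IP, same labels) and the second becomes ready.
pods[0] = Pod("10.1.3.2", {"app": "web"}, ready=True)
pods[1].ready = True
after = endpoints(selector, pods)

print(SERVICE_IP, before, after)
# 10.96.0.10 ['10.1.0.4'] ['10.1.3.2', '10.1.1.7']
```

Clients only ever dial `SERVICE_IP`; the backing list changed completely between the two calls without them noticing.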

The key insight: nothing pushes work onto a node. Components watch the API server and react. The scheduler writes "pod X belongs on node 3" to the API server; node 3's kubelet is watching, sees the assignment, and starts the container. This watch-and-reconcile design is why Kubernetes is resilient — there's no fragile chain of direct commands to break.
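
A minimal sketch of that watch-and-react pattern, assuming an in-memory stand-in for the API server. `ApiServer`, `make_scheduler`, and `make_kubelet` are hypothetical names, and plain callbacks replace the long-lived watch connections real components open. Note that no component ever calls another directly; each one only reads from and writes to the shared store.

```python
from typing import Callable, Dict, List


class ApiServer:
    """Toy stand-in for the API server: stores pods and notifies watchers."""

    def __init__(self) -> None:
        self.pods: Dict[str, dict] = {}
        self.watchers: List[Callable[[dict], None]] = []

    def watch(self, callback: Callable[[dict], None]) -> None:
        self.watchers.append(callback)

    def write(self, pod: dict) -> None:
        # Persist first, then hand every watcher its own copy of the object.
        self.pods[pod["name"]] = pod
        for callback in self.watchers:
            callback(dict(pod))


def make_scheduler(api: ApiServer, target_node: str, log: List[str]):
    def on_pod(pod: dict) -> None:
        if pod["node"] is None:          # only unscheduled pods are interesting
            pod["node"] = target_node    # assumption: always picks the same node
            log.append(f"scheduler: {pod['name']} -> {target_node}")
            api.write(pod)
    return on_pod


def make_kubelet(api: ApiServer, node_name: str, log: List[str]):
    def on_pod(pod: dict) -> None:
        if pod["node"] == node_name and pod["phase"] == "Pending":
            pod["phase"] = "Running"     # stands in for starting the container
            log.append(f"kubelet@{node_name}: started {pod['name']}")
            api.write(pod)
    return on_pod


log: List[str] = []
api = ApiServer()
api.watch(make_scheduler(api, "node-3", log))
api.watch(make_kubelet(api, "node-3", log))
api.watch(make_kubelet(api, "node-1", log))

api.write({"name": "pod-x", "node": None, "phase": "Pending"})
print(log)
print(api.pods["pod-x"])
```

One write of a Pending pod is enough: the scheduler reacts by recording an assignment, node 3's kubelet reacts to that assignment, and node 1's kubelet sees the same events and ignores them.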

Step-by-step: what happens when you run kubectl apply

Trace a single deploy through the architecture and every component earns its place. Start with a minimal Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: ghcr.io/acme/web:1a2b3c4
          ports: [{ containerPort: 8080 }]
          resources:
            requests: { cpu: "100m", memory: "128Mi" }

Apply it and watch the loop turn:

kubectl apply -f deployment.yaml
# deployment.apps/web created
 
kubectl get pods -w        # watch pods appear and go Ready
kubectl rollout status deployment/web

Here is the chain of events, component by component:

  1. kubectl → API server. Your YAML is sent to the API server, which authenticates you, validates the object against its schema, and writes a Deployment into etcd.
  2. Deployment controller reconciles. It's watching for Deployments. Seeing one that wants 3 replicas with 0 existing, it creates a ReplicaSet (again, just a write to the API server).
  3. ReplicaSet controller reconciles. It wants 3 pods, sees 0, and creates 3 Pod objects — each with no node assigned yet.
  4. Scheduler assigns nodes. It sees 3 unscheduled pods, filters nodes by the pod's resource requests and constraints, scores the survivors, and writes a node name onto each pod.
  5. kubelet starts containers. On each chosen node, the kubelet sees a pod assigned to it, asks the container runtime to pull the image and start the container, then reports the pod's status back to the API server.
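  6. The loop keeps running. If a pod is later lost (deleted, evicted, or stranded on a failed node), actual state no longer matches desired state, the ReplicaSet controller notices, and it creates a replacement with no human involved. A container that merely crashes is restarted in place by the kubelet.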
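
Steps 3 and 6 are the same comparison run at different times. As a hedged sketch, here is one pass of a ReplicaSet-style loop in Python. The `reconcile` function and the `web-N` naming scheme are made up for this example; a real controller creates and deletes objects through the API server rather than editing a list.

```python
import itertools
from typing import Iterator, List


def reconcile(desired: int, pods: List[str], new_names: Iterator[str]) -> List[str]:
    """One pass of the loop: compare desired vs. actual and close the gap."""
    pods = list(pods)
    while len(pods) < desired:
        pods.append(next(new_names))     # too few: create a replacement
    while len(pods) > desired:
        pods.pop()                       # too many: remove the surplus
    return pods


names = (f"web-{i}" for i in itertools.count(1))

first = reconcile(3, [], names)          # first apply: 0 pods exist, 3 wanted
print(first)                             # ['web-1', 'web-2', 'web-3']

survivors = [p for p in first if p != "web-2"]   # simulate losing one pod
healed = reconcile(3, survivors, names)  # the next pass sees 2 != 3 and adds one
print(healed)                            # ['web-1', 'web-3', 'web-4']

scaled_down = reconcile(2, healed, names)
print(scaled_down)                       # ['web-1', 'web-3']
```

Running `reconcile` again on an already-correct list changes nothing, which is why the loop can safely run forever.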

That's the entire system in one request: declare → persist → reconcile → schedule → run → observe → reconcile again. For the deploy-side mechanics of doing this safely in production, see the guide on deploying Docker containers to production.

Cluster components at a glance

Component          | Runs on       | Job                                                          | If it fails
-------------------|---------------|--------------------------------------------------------------|-------------------------------------------------
API server         | Control plane | Validates & serves all reads/writes; the only client of etcd | Cluster becomes unreachable; no changes possible
etcd               | Control plane | Stores all cluster state (source of truth)                   | Cluster loses its memory; often read-only
Scheduler          | Control plane | Places unscheduled pods onto nodes                           | New pods stay Pending
Controller manager | Control plane | Runs reconciliation loops (Deployment, ReplicaSet, Node…)    | Desired state stops being enforced
kubelet            | Every node    | Starts/monitors pods via the runtime; reports status         | That node's pods aren't managed
kube-proxy         | Every node    | Programs Service networking rules                            | Service traffic on that node breaks
Container runtime  | Every node    | Pulls images and runs containers                             | No containers start on that node

Best practices

  • Set resource requests on every pod. The scheduler places pods using requests. Omit them and it can't reason about capacity, so it overpacks nodes and workloads get evicted.
  • Never depend on a specific pod IP. Pods are cattle — they come and go with new IPs. Talk to a Service, whose stable virtual IP kube-proxy keeps pointed at healthy pods.
  • Protect etcd. Back it up, run it with an odd number of members (3 or 5) for quorum, and give it fast disks. Its health is the cluster's health.
  • Use higher-level objects, not bare pods. A pod you create by hand is never recreated if its node dies, whereas a pod owned by a Deployment is replaced by its ReplicaSet controller. Let controllers own pods, and pick the right one: see Deployment vs StatefulSet vs DaemonSet.
  • Make readiness probes honest. A Service only routes traffic to pods whose readiness probe passes, and a rollout only moves forward as new pods report ready. Get this wrong and rollouts serve errors. See liveness vs. readiness probes.
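
The "odd number of members" advice for etcd comes down to majority arithmetic, which is quick to check in Python. The function names here are just for illustration.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


table = {n: (quorum(n), tolerated_failures(n)) for n in range(1, 7)}
for n, (q, f) in table.items():
    print(f"{n} members: quorum {q}, survives {f} failure(s)")
# 3 members: quorum 2, survives 1 failure(s)
# 4 members: quorum 3, survives 1 failure(s)
# 5 members: quorum 3, survives 2 failure(s)
```

Going from 3 to 4 members raises the quorum without tolerating any extra failure, so the even sizes add cost and no resilience. That is why 3 or 5 is the usual choice.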

Common mistakes

  • Treating pods as durable. Pods are disposable by design. Store state in volumes or external systems, not on the pod's local disk.
  • Ignoring Pending pods. Pending almost always means the scheduler can't place the pod — usually insufficient CPU/memory or a taint. kubectl describe pod shows the reason.
  • One giant node vs. right-sized many. A single huge node removes Kubernetes' ability to reschedule around failure. Spread capacity across several nodes.
  • Editing pods instead of their controller. Change the Deployment's pod template; the Deployment controller rolls the change out through a new ReplicaSet. Edits made to a live pod are lost as soon as that pod is replaced.
  • Skipping resource limits. Requests drive scheduling; limits stop one noisy pod from starving its neighbors. Set both.

Takeaways

  • Kubernetes architecture is a control loop: declare desired state, and reconciliation controllers work continuously to make actual state match.
  • Two planes: a control plane (API server, etcd, scheduler, controller manager) that decides what runs, and worker nodes (kubelet, kube-proxy, runtime) that run it.
  • The API server is the only front door, and etcd is the single source of truth — every other component watches the API server and reacts.
  • Pods are the unit of deployment, but you manage them through controllers like Deployments, which is what gives you self-healing and safe rollouts.
  • The architecture is your debugging map: Pending → scheduler, stalled rollout → controller/readiness, cluster-wide read-only → etcd.

Ready to go deeper? Browse the rest of the Kubernetes article cluster and the related DevOps guides: how to roll out new versions with zero downtime, configure liveness and readiness probes, and build lean images with multi-stage Docker builds that pull fast onto every node.

FAQ

What is Kubernetes? Kubernetes is an open-source container orchestrator that runs and manages containerized applications across a cluster of machines. You declare the desired state — which images to run, how many replicas, how to expose them — and Kubernetes continuously reconciles the cluster to match it, handling scheduling, self-healing, and rolling updates.

Why use Kubernetes? Because it turns operational patterns you'd otherwise script by hand — restart on failure, scale on load, roll out without downtime, reschedule when a node dies — into declarative, built-in behavior. You describe what you want once, and the control loop maintains it.

How does Kubernetes work? It runs a reconciliation loop. The control plane stores your desired state in etcd, the scheduler places pods onto nodes, and controllers compare desired vs. actual state and act to close the gap. On each node, the kubelet starts the assigned containers and reports status back to the API server, which keeps the loop informed.

What is the difference between the control plane and worker nodes? The control plane is the cluster's brain — it decides what should run and where (API server, etcd, scheduler, controller manager). Worker nodes are the muscle — they actually run your containers via the kubelet and container runtime. In managed offerings (EKS, GKE, AKS) the cloud runs the control plane for you.

What is a pod in Kubernetes? A pod is the smallest deployable unit: one or more containers that share a network namespace (one IP) and storage. Containers in a pod are always scheduled together on the same node. You typically create pods indirectly through a Deployment rather than by hand.

When should you use Kubernetes? When you run multiple services that need scaling, self-healing, and controlled rollouts across more than one machine. For a single small app on one server, a simpler runtime is often enough — Kubernetes' power comes with real operational overhead.

Conclusion

Kubernetes stops being intimidating the moment you see it as one idea repeated: a control loop reconciling desired state against reality. The control plane remembers what you asked for and decides how to achieve it; the nodes carry it out; and the API server sits in the middle as the single front door, with etcd behind it as the source of truth. Every advanced feature (autoscaling, rolling updates, self-healing) is that same loop wearing a different hat. Keep this map in your head and the next Pending pod or stalled rollout becomes a question with an obvious place to look. From here, follow the deep dives above to turn the mental model into production-grade operations.
