ReplicaSet ≠ High Availability (Until You Test This)
Pods fail, nodes go down: this walkthrough shows what actually happens, and how to fix it.
The 30-second summary:
Running your app in Kubernetes doesn’t mean it’s highly available. This walkthrough shows how ReplicaSets restore failed pods, handle node loss, and work with probes, all backed by real command-line examples. Adapted from Packt’s The Kubernetes Bible (Chapter 10).
8-minute read
Hands-on commands included
The Problem: One Dead Pod, and Your App Stalls
Let’s say you’ve got a stateless NGINX app deployed in a multi-node Kubernetes cluster using a ReplicaSet. You think you’re covered because there are 4 replicas. But then you:
delete a pod manually
drain one of the nodes
simulate a container failure
In all three cases, you’re expecting automatic recovery. But it’s not magic: it’s the ReplicaSet (and sometimes liveness probes) doing the heavy lifting.
Let’s walk through all three failure modes and see what Kubernetes does.
1. Pod Deletion? No Problem.
This scenario demonstrates how a ReplicaSet restores deleted pods to maintain the desired number of replicas.
Here's a step-by-step walkthrough:
Define the ReplicaSet manifest: Save the following YAML as nginx-replicaset-example.yaml:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset-example
  namespace: rs-ns
spec:
  replicas: 4
  selector:
    matchLabels:
      app: nginx
      environment: test
  template:
    metadata:
      labels:
        app: nginx
        environment: test
    spec:
      containers:
        - name: nginx
          image: nginx:1.17
          ports:
            - containerPort: 80
Create the namespace: The manifest above targets the rs-ns namespace, so create it before anything else.
kubectl create -f ns-rs.yaml
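The contents of ns-rs.yaml aren’t shown in this excerpt. A minimal manifest for the rs-ns namespace referenced above would look like the sketch below (alternatively, kubectl create namespace rs-ns achieves the same thing):
# ns-rs.yaml (assumed contents): creates the rs-ns namespace used by the ReplicaSet
apiVersion: v1
kind: Namespace
metadata:
  name: rs-ns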
Deploy the ReplicaSet: The manifest defines a ReplicaSet with 4 NGINX pods.
kubectl apply -f nginx-replicaset-example.yaml
Delete a pod manually: Simulate a pod failure by deleting one of the running pods.
kubectl delete pod <pod-name> -n rs-ns
Verify that the ReplicaSet restores the pod: The controller detects the change and automatically spins up a new pod to maintain the desired count.
kubectl get pods -n rs-ns
kubectl describe rs/nginx-replicaset-example -n rs-ns
Within seconds, the ReplicaSet controller notices the missing pod and recreates it to meet the declared replica count.
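One optional extra, not part of the original walkthrough: keep a watch running in a second terminal before deleting the pod, and you can see the replacement pod appear within seconds:
kubectl get pods -n rs-ns --watch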
Takeaway: ReplicaSets automatically maintain the number of desired pods, making recovery from manual deletions fast and hands-free.
2. Node Failure? Here's What Actually Happens
This scenario demonstrates how a ReplicaSet maintains high availability when a node goes down, by recreating its pods on the remaining healthy nodes.
Here's a step-by-step walkthrough:
Expose your app with a Service:
kubectl apply -f nginx-service.yaml
This creates a Service that gives you a single, stable endpoint in front of the ReplicaSet’s pods (a minimal manifest is sketched below).
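The nginx-service.yaml file isn’t reproduced in this excerpt. A minimal Service that matches the ReplicaSet’s pod labels and the nginx-service name used in the port-forward command below might look like this sketch:
# nginx-service.yaml (assumed contents): selects the ReplicaSet's pods by label
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: rs-ns
spec:
  selector:
    app: nginx
    environment: test
  ports:
    - port: 80
      targetPort: 80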
Forward traffic from your local machine to the Kubernetes Service:
kubectl port-forward svc/nginx-service 8080:80 -n rs-ns
curl localhost:8080
This confirms your service is working and traffic is flowing to the pods.
Check where the pods are currently running:
kubectl get pods -n rs-ns -o wide
This shows which node each pod is scheduled on.
Simulate node failure by cordoning and draining the node:
kubectl cordon kind-worker
Prevents new pods from being scheduled on this node.
kubectl drain kind-worker --ignore-daemonsets
Evicts all running pods from the node while ignoring daemonsets.
kubectl delete node kind-worker
Removes the node from the cluster to simulate a full node failure.
Within moments, the ReplicaSet detects the missing pods and spins up new ones on the remaining healthy nodes. Your Service automatically reroutes traffic to these new pods.
Verify that everything is still working:
kubectl get pods -n rs-ns -o wide
curl localhost:8080
You’ll see that traffic still flows and the app remains accessible. (If the port-forward session dropped because the pod it was bound to got evicted, simply re-run the kubectl port-forward command.)
Takeaway: The ReplicaSet ensures that the desired number of pod replicas is always maintained — even when a node goes offline. It handles pod rescheduling automatically, as long as there's sufficient capacity in your cluster.
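A note on “spread”: the manifest above doesn’t control pod placement, so all four replicas could land on the same node. If you want to nudge the scheduler to spread them, one standard option (an addition beyond the chapter’s example) is a topologySpreadConstraints block in the pod template spec, roughly like this:
# Illustrative addition to spec.template.spec: spread replicas across nodes by hostname
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: nginx
        environment: test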
3. Unhealthy Container? Probes Save the Day
Let’s see how Kubernetes handles an unhealthy container using liveness probes.
Here's a step-by-step walkthrough:
Add the following liveness probe to your ReplicaSet’s pod spec. It tells the kubelet to start checking container health 2 seconds after the container starts and to repeat the check every 2 seconds:
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2
  periodSeconds: 2
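For context, here’s roughly where the probe sits inside the pod template’s container spec. The cleanup section below refers to a ReplicaSet named nginx-replicaset-livenessprobe-example, so this sketch assumes that’s the manifest being edited; apart from the probe, it mirrors the earlier example:
# Fragment of spec.template.spec in nginx-replicaset-livenessprobe-example (sketch)
containers:
  - name: nginx
    image: nginx:1.17
    ports:
      - containerPort: 80
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 2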
Apply your updated ReplicaSet manifest and wait for the pod to be up and running.
Simulate a container failure by deleting the default NGINX index file:
kubectl exec -it <pod-name> -n rs-ns -- rm /usr/share/nginx/html/index.html
Check what happens by describing the pod:
kubectl describe pod <pod-name> -n rs-ns
You’ll see Liveness probe failed events, followed by automatic container restarts.
Takeaway: The kubelet, not the ReplicaSet, manages container health. But when used with ReplicaSets, probes help create a resilient system that self-heals when a container goes bad.
Cleanup
You can delete the ReplicaSet and its pods:
kubectl delete rs/nginx-replicaset-livenessprobe-example -n rs-ns
Or just delete the controller, leaving pods untouched:
kubectl delete rs/nginx-replicaset-livenessprobe-example --cascade=orphan -n rs-ns
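If you created the rs-ns namespace only for this exercise, deleting it removes the Service and any remaining pods as well:
kubectl delete namespace rs-ns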
Key Takeaways
ReplicaSets guarantee pod replication and replacement—not health checking
Liveness probes enable kubelet to restart broken containers
Node failure recovery works if your cluster has enough capacity and replicas are spread
HA = ReplicaSets + Probes + Services, working in tandem
Based on Chapter 10 of The Kubernetes Bible, Second Edition