How to really make use of `kubectl debug`
So I had CockroachDB deployed on my kind cluster. The thing is, it's deployed as a StatefulSet, and I don't really know the internal workings of CockroachDB.
I was going through the CockroachDB Community Forum, specifically the threads on Client Connection Issues, collecting the different things that need to be tested/checked.
```
❯ kg svc
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)              AGE
my-release-cockroachdb          ClusterIP   None           <none>        26257/TCP,8080/TCP   4h44m
my-release-cockroachdb-public   ClusterIP   10.96.85.228   <none>        26257/TCP,8080/TCP   4h44m
```
In fact, CockroachDB is exposed as a headless service (`ClusterIP = None`). Pods identify each other using FQDNs (e.g., `cockroachdb-0.cockroachdb.default.svc.cluster.local`). If DNS or the headless Service is misconfigured, they can't see each other, the cluster never "initializes," and the health check returns 503.
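The discovery Service behind this looks roughly like the sketch below. I haven't copied this from the chart, so the selector and port names are illustrative, but `clusterIP: None` plus `publishNotReadyAddresses: true` is the pattern the CockroachDB Helm chart uses (worth verifying on your version) so that pods show up in DNS before they pass readiness; otherwise bootstrap would deadlock on the readiness probe.

```yaml
# Sketch of a headless discovery Service; field values are illustrative
apiVersion: v1
kind: Service
metadata:
  name: my-release-cockroachdb
  namespace: ckdb
spec:
  clusterIP: None                  # headless: DNS returns Pod IPs directly
  publishNotReadyAddresses: true   # expose Pods in DNS before they're Ready,
                                   # so peers can find each other during bootstrap
  selector:
    app.kubernetes.io/name: cockroachdb
  ports:
    - name: grpc
      port: 26257
    - name: http
      port: 8080
```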
```
# Discovery: check if one pod can see the other pod
kubectl exec -it my-release-cockroachdb-0 -- nslookup my-release-cockroachdb-1.my-release-cockroachdb

;; Got recursion not available from 10.96.0.10
Server:   10.96.0.10
Address:  10.96.0.10#53

Name:    my-release-cockroachdb-1.my-release-cockroachdb.ckdb.svc.cluster.local
Address: 10.244.1.14
;; Got recursion not available from 10.96.0.10
```
- If the above works but you still get an `i/o timeout`: you likely have a NetworkPolicy or firewall blocking port 26257 (internal gossip) or 8080 (HTTP health).
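A quick way to check for that (the `ckdb` namespace is from my setup; adjust to yours):

```bash
# List NetworkPolicies that could be selecting the CockroachDB pods
kubectl get networkpolicy -n ckdb

# If any exist, inspect which peers/ports they actually allow;
# a policy that doesn't allow 26257 and 8080 between pods is your culprit
kubectl describe networkpolicy <name> -n ckdb
```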
Standard database images like CockroachDB are often "distroless" or heavily hardened, meaning they lack a shell (`sh`, `bash`) and package managers (`apt`, `yum`) to reduce the attack surface.
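To be fair, the image in my setup clearly had `nslookup` (the exec above worked), but on a truly distroless image even that much fails. An illustration of what it looks like; the exact error text depends on the image and runtime:

```bash
kubectl exec -it my-release-cockroachdb-0 -c db -- bash
# On a hardened image this typically fails with something like:
#   exec: "bash": executable file not found in $PATH
```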
Run this command to drop a “Netshoot” container (the Swiss Army knife of networking) into any failing pod:
```bash
k debug -it my-release-cockroachdb-0 \
  --image=nicolaka/netshoot \
  --target=db \
  --profile=general \
  -- sh
```
Once inside that shell, try these three tests:
- DNS Test: Check if it can find its neighbor.

  ```bash
  nslookup my-release-cockroachdb-1.my-release-cockroachdb
  ```

- Port Test: See if the neighbor is listening.

  ```bash
  nc -zv my-release-cockroachdb-1.my-release-cockroachdb 26257
  ```

- Local Check: See what CockroachDB is doing on the local ports.

  ```bash
  ss -tulpn
  ```
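For the local check, a healthy node should show the `cockroach` process listening on both ports. The output below is illustrative (columns trimmed), not captured from my cluster:

```
$ ss -tulpn
Netid  State   Local Address:Port   Process
tcp    LISTEN  *:26257              users:(("cockroach",pid=1,...))
tcp    LISTEN  *:8080               users:(("cockroach",pid=1,...))
```

If 26257 is missing here, the neighbors' `i/o timeout` is a symptom rather than the cause: the local process never bound the port in the first place.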
When you set `clusterIP: None`, the FQDN `my-svc.namespace.svc.cluster.local` resolves to a list of all the individual Pod IPs currently backing that service. With a regular ClusterIP (e.g. `clusterIP: 10.96.1.12`), the same stable FQDN `my-svc.namespace.svc.cluster.local` resolves to a single virtual IP (the ClusterIP).
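You can see both behaviors from the netshoot shell. `my-release-cockroachdb-public` and its ClusterIP are from the `kg svc` output earlier; the extra Pod IPs in the headless answer are made up for illustration:

```bash
# Headless Service: one A record per backing Pod
dig +short my-release-cockroachdb.ckdb.svc.cluster.local
# 10.244.1.14
# 10.244.2.9
# 10.244.3.11

# Regular Service: the single virtual ClusterIP
dig +short my-release-cockroachdb-public.ckdb.svc.cluster.local
# 10.96.85.228
```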
Cheat Sheet
- Ephemeral container debugging
  - API call made: `PATCH /pods/<name>/ephemeralcontainers`
  - No PodSpec mutation
  - Use when you need the same namespaces (network, PID)
  - You can access the target container's filesystem at `/proc/<PID>/root` (see the verification sketch after this cheat sheet)

  ```bash
  k debug -it my-release-cockroachdb-0 \
    --image=nicolaka/netshoot \
    --target=db \
    --profile=general \
    -- sh
  ```
- Pod-copy debugging
  - API call made: `POST /pods` (creates a brand-new Pod)
  - Copies: Volumes, Env vars, Security context

  ```bash
  kubectl debug pod/<pod-name> \
    --copy-to=<new-pod-name> \
    --set-image=*=busybox \
    --share-processes \
    -it
  ```
- Node debugging
  - Use when: kubelet is alive, node networking/storage is broken, CNI/CSI/kube-proxy issues.
  - Under the hood, creates a Pod with `hostPID: true`, `hostNetwork: true`, and the node's `/` mounted at `/host`.

  ```bash
  kubectl debug node/<node-name> \
    -it \
    --image=busybox
  ```
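To see that the ephemeral-container route really is just a `PATCH` against the `ephemeralcontainers` subresource, inspect the pod after running `k debug` (standard kubectl; the pod name is from my setup):

```bash
# The injected container lands in the pod spec...
kubectl get pod my-release-cockroachdb-0 \
  -o jsonpath='{.spec.ephemeralContainers[*].name}'

# ...and its state is tracked in a dedicated status field
kubectl get pod my-release-cockroachdb-0 \
  -o jsonpath='{.status.ephemeralContainerStatuses[*].state}'
```

The table below summarizes which technique fits which pod state.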
| Pod State | Ephemeral container | Pod copy | Node debug | Why |
|---|---|---|---|---|
| Running + Ready | ✅ Best | ⚠️ Overkill | ❌ | Everything already works |
| Running + NotReady | ✅ | ⚠️ | ❌ | Debug probes / app health |
| CrashLoopBackOff | ⚠️ Sometimes | ✅ Best | ❌ | Container restarts too fast |
| OOMKilled | ❌ Mostly useless | ✅ Required | ❌ | Container already dead |
| ImagePullBackOff | ❌ | ❌ | ❌ | Pod never runs |
| Pending | ❌ | ❌ | ⚠️ | Scheduling problem |
| InitContainerCrashLoop | ❌ | ✅ | ❌ | Ephemeral containers attach only after init |
| Completed (Succeeded) | ⚠️ | ✅ | ❌ | App already exited |
| Evicted | ❌ | ❌ | ⚠️ | Pod gone |
| PodDeleted | ❌ | ❌ | ⚠️ | No object to patch |
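As the table says, pod-copy is the usual answer to CrashLoopBackOff: the copy can override the crashing entrypoint so the container stays up for inspection. A sketch, assuming the chart's `db` container name; `crdb-debug` is just a name I picked:

```bash
# Copy the pod and replace the db container's command with a shell,
# so the copy doesn't crash-loop like the original
kubectl debug my-release-cockroachdb-0 -it \
  --copy-to=crdb-debug \
  --container=db \
  -- sh

# If the original image has no shell, also swap the image:
#   --set-image=db=busybox

# Remember to clean up the copy afterwards
kubectl delete pod crdb-debug
```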
Common Errors Encountered
- `i/o timeout`: This means the internal CockroachDB process is trying to reach other pods and failing.
- `503` Error: This is the kubelet telling you: "I tried to ping the CockroachDB health endpoint (usually `/_admin/v1/health`), but the database told me it's not ready to take traffic."
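You can reproduce what the kubelet sees from the netshoot shell, since the ephemeral container shares the pod's network namespace. The paths are CockroachDB's documented health endpoints; switch to `https://` plus `-k` if your cluster runs in secure mode:

```bash
# -i prints the status line; a 503 here mirrors the failing readiness probe
curl -i "http://localhost:8080/health?ready=1"

# The admin API health endpoint mentioned above
curl -i http://localhost:8080/_admin/v1/health
```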