Debugging disk pressure on a k8s node
A hands-on approach to diagnosing and fixing disk pressure on a Kubernetes node using simple Linux commands

Cloud and DevOps professional with a passion for automation, containers, and cloud-native practices, committed to sharing lessons from the trenches while always seeking new challenges. Combining hands-on expertise with an open mind, I write to demystify the complexities of DevOps and grow alongside the tech community.
A while back, I ran into disk pressure on a node. It was a bit like opening your fridge and finding it packed, but having no idea what was taking up all the room. Our system started complaining about low disk space. I needed to figure out what was hogging all that storage before things got out of hand.
Investigation, aka Tracking Down What Filled the Disk
I started with a simple method to see which directories were using the most space. The tool of choice was du paired with sort. Here’s the command I used:
sudo du -ahx --max-depth=1 / | sort -k1 -rh
This breaks down as follows:
sudo gives me the access needed to scan every directory.
du means disk usage. It checks files and folders to see how much space they take.
-a includes both files and directories.
-h shows sizes in human-readable format, like MB or GB.
-x keeps things limited to the current filesystem.
--max-depth=1 only shows the top level in the directory tree, making results easier to digest.
/ starts the scan from the root.
The pipe | takes what comes before it and passes it on.
sort -k1 -rh puts the biggest results at the top, sorting by size.
My approach was straightforward: run the command at root, check which folder is biggest, then repeat inside that folder. This narrows down the possibilities fast.
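The repeat-inside-the-biggest-folder loop can also be scripted. This is only a rough sketch, not the exact procedure I ran: biggest_child and drill_down are hypothetical helper names, it follows directories only (no -a), and it assumes paths without newlines or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: automate "find the biggest folder, then repeat inside it".
# biggest_child and drill_down are hypothetical helpers, not standard tools.

# Print the largest immediate subdirectory of $1 (nothing if there is none).
biggest_child() {
  local dir="$1"
  # -x stays on one filesystem, -k reports KiB so a numeric sort works,
  # -d 1 is the short form of --max-depth=1.
  du -xk -d 1 "$dir" 2>/dev/null \
    | sort -rn \
    | awk -F'\t' -v d="$dir" '$2 != d && !found { print $2; found = 1 }'
}

# Follow the largest subdirectory from $1 (default /) for at most $2 levels
# (default 5), printing each directory visited.
drill_down() {
  local dir="${1:-/}" depth="${2:-5}" next
  while [ "$depth" -gt 0 ]; do
    next="$(biggest_child "$dir")"
    if [ -z "$next" ]; then
      break
    fi
    printf '%s\n' "$next"
    dir="$next"
    depth=$((depth - 1))
  done
}
```

Run as root so du can read every directory, for example by calling drill_down / 4 at the end of the script and starting it with sudo. The last line printed is a good candidate for a closer look with the original du command.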
The Main Offender
After just a few rounds, I found the culprit. It was this directory:
/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs
If you use containers a lot, you might have seen this before. containerd's overlayfs snapshotter keeps unpacked image layers and each container's writable layer in that directory. Over time, especially with lots of deployments or heavy usage, it can quietly eat away at disk space.
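To see which snapshots weigh the most, something like the following might help. It is a sketch under assumptions: it uses the default containerd path shown above, it assumes the snapshot data sits in a snapshots subdirectory (check the layout on your own nodes), and top_entries is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: default containerd snapshotter path; override SNAPSHOT_ROOT if yours differs.
SNAPSHOT_ROOT="${SNAPSHOT_ROOT:-/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs}"

# top_entries DIR [N]: print the N largest immediate subdirectories of DIR,
# largest first, one per line as "<size in KiB><TAB><path>".
top_entries() {
  local dir="$1" count="${2:-10}"
  du -xk -d 1 "$dir" 2>/dev/null \
    | sort -rn \
    | awk -F'\t' -v d="$dir" -v n="$count" '$2 != d && shown < n { print; shown++ }'
}

# Example, run as root on the node:
#   top_entries "$SNAPSHOT_ROOT/snapshots" 5
#   crictl imagefsinfo   # image filesystem usage as reported by the runtime
#   crictl images        # images currently stored on the node
```

The crictl commands in the comments assume crictl is installed and pointed at the containerd socket. Comparing their output with the du numbers shows whether the space is held by images or by something else.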
What to Do Next
Once the problem is identified:
Check for unused images and containers.
Check whether you need to tune imageGCHighThresholdPercent or imageGCLowThresholdPercent, the kubelet settings that control garbage collection of unused container images. See the Kubernetes garbage collection documentation for more.
Consider setting up monitoring or alerts if this happens often.
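The three steps above could look roughly like this in a shell. Treat it as a sketch rather than a finished tool: every function name is a hypothetical helper, the 80% default limit is an arbitrary example value, and it assumes crictl and kubectl are installed and configured for the node and cluster in question.

```shell
#!/usr/bin/env bash
# All function names below are hypothetical helpers for illustration.

# 1. Unused images: list what is stored, then remove images no container uses.
#    This deletes data, so review the list first.
prune_unused_images() {
  crictl images
  crictl rmi --prune
}

# 2. Image GC thresholds: pick the two settings out of kubelet configz JSON on stdin.
extract_image_gc() {
  grep -oE '"imageGC(High|Low)ThresholdPercent":[[:space:]]*[0-9]+'
}

# Read the live kubelet configuration of a node through the API server proxy.
show_image_gc() {
  local node="$1"
  kubectl get --raw "/api/v1/nodes/${node}/proxy/configz" | extract_image_gc
}

# 3. Monitoring: print the used percentage of the filesystem that holds $1.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $(NF-1)); print $(NF-1) }'
}

# Warn on stderr and return non-zero when usage is above the limit
# (second argument, default 80; an example value, not a recommendation).
check_disk() {
  local path="$1" limit="${2:-80}" used
  used="$(disk_usage_percent "$path")"
  if [ "$used" -gt "$limit" ]; then
    echo "WARNING: ${path} is ${used}% full (limit ${limit}%)" >&2
    return 1
  fi
}
```

For example, show_image_gc my-node (with a placeholder node name) prints the thresholds the kubelet is actually running with, and check_disk /var/lib/containerd 75 can be dropped into a cron job or a node health script until proper alerting is in place.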
Wrapping Up
Disk pressure feels frustrating, but a methodical approach using simple tools like du can make troubleshooting much less painful. Running this command, iterating into each large directory, and staying logical pointed me straight to the biggest space hog.
When working with containers, keep an eye on overlayfs snapshots; they build up quicker than expected. Regular checks can prevent bigger headaches down the line.




