The 2 AM Crisis: When df -h Hits 99%
My phone rang incessantly. I jolted awake, bleary-eyed, and logged in to find the production server throwing a flood of 500 errors. SSH was sluggish. When I ran df -h, the /var/lib/docker partition was glowing red at 99%. The culprit wasn’t the database or the application logs, but Docker itself.
After several months of continuous CI/CD, hundreds of old images and “orphan” volumes had silently swallowed 150GB of my disk space. If you’re in this situation, let’s “spring clean” your system safely and professionally.
Why is Docker So Storage-Hungry?
Docker keeps everything unless you explicitly tell it to delete. Every time you run docker build, new layers are created. When a build fails, or you rebuild under the same tag, the previous image loses its tag and becomes a dangling image (an image with no name or tag). These linger like ghosts, occupying storage.
Additionally, stopped (Exited) containers still take up space for their writable layer. The most dangerous culprits are volumes: when you delete a container, the volume containing the actual data doesn’t disappear; it just sits there, accumulating for years.
Step 1: Locating the “Storage Thief”
Don’t just start deleting blindly. Check how many resources Docker is using with this command:
docker system df
Actual results from a server I once handled:
TYPE            TOTAL   ACTIVE  SIZE    RECLAIMABLE
Images          24      5       12.5GB  10.2GB (81%)
Containers      15      2       1.2GB   1.1GB (91%)
Local Volumes   40      10      25GB    15GB (60%)
Build Cache     102     0       5.4GB   5.4GB
Look at the RECLAIMABLE column; you’ll see how much space you can “take back.” In the example above, I could immediately free up over 30GB of junk.
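You can total the RECLAIMABLE column yourself with a little awk. A minimal sketch, using the sample table above as input (on a live server you would pipe the real command in instead of the heredoc):

```shell
#!/bin/sh
# Sum the RECLAIMABLE column of `docker system df` output.
# The heredoc below is the sample table from this article; on a real box
# you would pipe the live output: docker system df | awk '...'
result=$(awk 'NR > 1 {
    # Reclaimable is the last field, unless a trailing "(NN%)" pushes it back one.
    r = ($NF ~ /%/) ? $(NF - 1) : $NF
    sub(/GB/, "", r)
    total += r
} END { printf "Total reclaimable: %.1fGB", total }' <<'EOF'
TYPE            TOTAL   ACTIVE  SIZE    RECLAIMABLE
Images          24      5       12.5GB  10.2GB (81%)
Containers      15      2       1.2GB   1.1GB (91%)
Local Volumes   40      10      25GB    15GB (60%)
Build Cache     102     0       5.4GB   5.4GB
EOF
)
echo "$result"
```

For this sample the script reports 31.7GB, which is where the "over 30GB" figure comes from.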
Step 2: Clearing Stopped Containers
Many people have the habit of running test containers and then leaving them behind. Even when not running, these containers still take up disk space. To remove all stopped containers, use:
docker container prune
If you want to be more cautious, only remove containers created more than 24 hours ago (the until filter keys on creation time) to avoid affecting recent work:
docker container prune --filter "until=24h"
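A cautious way to script this is to print what would be removed before actually removing anything. A sketch of that pattern (the DRY_RUN toggle and run wrapper are my own convention, not Docker flags; with DRY_RUN=1 the commands are only echoed):

```shell
#!/bin/sh
# Dry-run wrapper: print each command instead of executing it unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Inspect the candidates first: every container in the Exited state.
run docker ps -a --filter status=exited
# Then remove only containers created more than 24 hours ago, skipping the prompt.
run docker container prune --force --filter until=24h
```

Run it once as-is to review, then rerun with DRY_RUN=0 to execute for real.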
Step 3: Handling Redundant Docker Images
Images are usually the biggest disk space hogs. You need to distinguish between two types:
- Dangling images: Untagged images, usually created by overwriting an old version during a build.
- Unused images: Tagged images that are not being used by any container.
Quickly delete dangling images with:
docker image prune
To clean up all unused images (including images you just pulled but haven’t run yet), add the -a flag. In my experience on production, I always pair it with a time filter for safety; note that until matches on when the image was created:
docker image prune -a --filter "until=72h"
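Before pulling the trigger on -a, it helps to see what is on the chopping block. A small sketch that composes the command and prints it for review rather than executing it (remove the echo once the filter looks right):

```shell
#!/bin/sh
# Compose the image-prune command, but print it for review instead of running it.
FILTER="until=72h"
PRUNE_CMD="docker image prune -a --force --filter $FILTER"
echo "review dangling images first: docker images --filter dangling=true"
echo "then: $PRUNE_CMD"
# $PRUNE_CMD   # uncomment to execute once you have reviewed the list
```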
Step 4: Dealing with the Volume “Black Hole”
Volumes contain important data like databases or uploaded files. Docker is extremely cautious and will never delete a volume automatically. Volumes not attached to any container are called dangling volumes.
List orphan volumes before making a decision:
docker volume ls -f dangling=true
If you determine that the data is no longer valuable, clean it up (note that on Docker 23.0 and newer, this removes only anonymous volumes unless you add --all):
docker volume prune
Warning: Always back up your data first. Accidentally deleting a database volume is an irreversible disaster.
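Backing up a volume is easy to script: mount it read-only into a throwaway container and tar it out to the host. A sketch, where the volume name app_db_data is hypothetical and the command is printed for review rather than executed:

```shell
#!/bin/sh
# Archive a named volume to a tarball before any prune.
VOLUME="app_db_data"               # hypothetical example name; use your own
BACKUP_DIR="$(mktemp -d)"          # a real script would use a durable path
# A throwaway alpine container mounts the volume read-only and tars it to the host.
BACKUP_CMD="docker run --rm -v ${VOLUME}:/data:ro -v ${BACKUP_DIR}:/backup alpine tar czf /backup/${VOLUME}.tar.gz -C /data ."
echo "backup target: ${BACKUP_DIR}/${VOLUME}.tar.gz"
echo "run: $BACKUP_CMD"
# eval "$BACKUP_CMD"   # uncomment once the volume name and paths look right
```

The read-only mount (:ro) guarantees the backup itself cannot corrupt the data.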
The “Heavy Duty” Command: Docker System Prune
Want to clean everything up in one go? Docker provides the following “destruction” command:
docker system prune
This command will wipe out stopped containers, unused networks, and dangling images. However, it does not delete volumes by default. For a thorough cleanup of both volumes and old images, use:
docker system prune -a --volumes
In staging environments, I usually run this command weekly. The results are often impressive; I once reclaimed 40% of disk space in just 5 seconds.
Workflow for Keeping Your Server “Clean”
Don’t wait for the server to hit the red zone before cleaning up. Apply these 4 golden rules:
- Use the --rm flag: When running temporary containers for testing or debugging, add --rm so Docker automatically deletes the container upon exit: docker run --rm alpine echo "Automatic cleanup"
- Integrate into CI/CD: Add a docker image prune step at the end of your deployment script to remove old images as soon as the new version goes live.
- Configure log rotation: Docker logs can swell to dozens of GB. Limit log size in /etc/docker/daemon.json: "log-driver": "json-file", "log-opts": {"max-size": "10m", "max-file": "3"}
- Set up a cronjob: Run docker system prune periodically on weekends for servers where uptime requirements aren’t extremely strict.
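The log-rotation and cronjob rules take a couple of minutes to set up. A sketch of both (it stages the config in a temp file so nothing on your system is touched; copy it to /etc/docker/daemon.json and restart the daemon yourself, and treat the cron schedule as an example):

```shell
#!/bin/sh
# 1) Log rotation: cap each container at 3 x 10MB of JSON logs.
#    Staged in a temp file here; the real location is /etc/docker/daemon.json.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
EOF
echo "daemon config staged at $CONF"

# 2) Weekly cleanup, Sunday 03:00 -- paste this line via 'crontab -e'.
CRON_LINE='0 3 * * 0 docker system prune -f --filter "until=168h" >/var/log/docker-prune.log 2>&1'
echo "$CRON_LINE"
```

Keep in mind that log-opts only apply to containers created after the daemon restart; existing containers keep their old logging settings.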
I hope these hands-on experiences help you manage Docker more effectively. Wishing you a good night’s sleep without being woken up by disk space alerts!
