The big six categories
- Model caches — Hugging Face, llama.cpp, Ollama. Easily 30–80 GB on a researcher's machine.
- Docker overlay layers — accumulating from `docker pull`, `docker build`, and stopped-but-not-removed containers. 20–60 GB common.
- Build artifacts — node_modules, .next, .nuxt, .gradle, target/, build/, dist/. 10–40 GB across projects.
- Dependency stores — conda envs, pyenv installs, nvm versions, Cargo registry. 10–30 GB.
- IDE caches — VSCode, JetBrains, Xcode DerivedData. 5–25 GB.
- System and app caches — what traditional cleaners find. 5–15 GB.
The first five are largely invisible to traditional cleaners like CleanMyMac because they look like user data: they live in dot-directories under the home directory, inside project folders, or inside Docker's disk image, not in the standard cache locations those tools scan.
Hugging Face caches in detail
Hugging Face stores model weights under `~/.cache/huggingface/hub/` by default. A single 7B-parameter model at 16-bit precision is roughly 14 GB; a 70B model is 140+ GB. Researchers who try multiple models can accumulate dozens of these.
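The sizes quoted above are consistent with a simple rule of thumb: parameter count times bytes per parameter. The sketch below assumes 2 bytes per parameter (16-bit weights) as the default; the function name and the 4-bit example are illustrative, and real files add some overhead for metadata and tokenizers.

```python
def weights_size_gb(num_params, bytes_per_param=2.0):
    """Approximate size of model weights in GB (1 GB = 10**9 bytes).

    Assumption: every parameter is stored at the same precision and
    file-format overhead is ignored.
    """
    return num_params * bytes_per_param / 1e9


print(weights_size_gb(7e9))        # 7B model, 16-bit weights
print(weights_size_gb(70e9))       # 70B model, 16-bit weights
print(weights_size_gb(7e9, 0.5))   # 7B model, 4-bit quantized (assumed 0.5 bytes/param)
```

Quantized formats, such as those commonly used with llama.cpp and Ollama, shrink the per-model figure, but the cache still grows with every model you try.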
Cleanup pattern: identify models you haven't loaded in 30+ days. One way is to check file access timestamps in the cache yourself; the `huggingface_hub` library's `scan_cache_dir()` function also reports each cached repo's size and last-accessed time. DreamCleanr does this automatically and shows model-by-model size and last-access time.
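A minimal, standard-library-only sketch of the timestamp check described above. It treats each top-level directory in a cache folder as one model and uses the newest file access time inside it as the "last loaded" time. The function name and the 30-day default are illustrative, not part of any tool, and access times are only as reliable as the filesystem's settings allow.

```python
import os
import time
from pathlib import Path


def idle_models(cache_dir, min_idle_days=30, now=None):
    """List (name, size_bytes, idle_days) for cache entries idle too long.

    Assumption: each top-level directory under cache_dir is one model.
    """
    now = time.time() if now is None else now
    cache = Path(cache_dir).expanduser()
    results = []
    if not cache.is_dir():
        return results
    for entry in sorted(cache.iterdir()):
        if not entry.is_dir():
            continue
        size = 0
        last_access = None
        for root, _dirs, files in os.walk(entry):
            for name in files:
                # lstat does not follow symlinks, so a blob that is linked
                # from a snapshot directory is only counted once.
                st = os.lstat(os.path.join(root, name))
                size += st.st_size
                if last_access is None or st.st_atime > last_access:
                    last_access = st.st_atime
        if last_access is None:
            continue  # empty directory: nothing to reclaim
        idle_days = (now - last_access) / 86400
        if idle_days >= min_idle_days:
            results.append((entry.name, size, idle_days))
    # Largest first, so the biggest wins are at the top.
    results.sort(key=lambda row: row[1], reverse=True)
    return results
```

Calling `idle_models("~/.cache/huggingface/hub")` only reports candidates; it deletes nothing. Some systems update access times lazily or not at all, so treat the idle figure as a hint rather than proof.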
Docker overlay layers in detail
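On Linux, Docker stores image layers in `/var/lib/docker/overlay2/`. On a Mac, Docker Desktop keeps that same directory inside the disk image of its Linux VM, so the space shows up as one large file rather than as individual layers. The store grows with every `docker pull` and `docker build`. By default, `docker system prune` removes only part of it (stopped containers, unused networks, dangling images, and build cache that is no longer in use) and leaves tagged images alone, including ones you haven't used in months.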
Better cleanup: identify images by last-pulled time and image name patterns. Old test images, old base images you've moved past, ephemeral debugging containers — all reclaimable. DreamCleanr's container intelligence module surfaces these by age.
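A hedged sketch of the age-based listing. Docker's image list reports when an image was built, not when it was last pulled or used, so this uses creation time as a proxy. It assumes the text comes from `docker image ls --format '{{json .}}'` (one JSON object per line) and that `CreatedAt` looks like `2024-01-15 10:30:45 -0800 PST`; the function names and the 90-day default are illustrative.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_created_at(value):
    """Parse Docker's CreatedAt string, ignoring the trailing zone name."""
    date, clock, offset = value.split()[:3]
    return datetime.strptime(f"{date} {clock} {offset}", "%Y-%m-%d %H:%M:%S %z")


def old_images(json_lines, max_age_days=90, now=None):
    """Return (name, size, created) for images built before the cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    found = []
    for line in json_lines.splitlines():
        line = line.strip()
        if not line:
            continue
        image = json.loads(line)
        created = parse_created_at(image["CreatedAt"])
        if created < cutoff:
            name = f'{image["Repository"]}:{image["Tag"]}'
            found.append((name, image["Size"], created))
    # Oldest first.
    found.sort(key=lambda item: item[2])
    return found
```

To feed it real data, capture the command's output with `subprocess.run(["docker", "image", "ls", "--format", "{{json .}}"], capture_output=True, text=True, check=True).stdout`. The result is a review list: an old base image may still be in daily use, so check names before removing anything.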
Build artifacts and node_modules
Almost every JavaScript project has a `node_modules` directory, typically 200 MB to 2 GB. A developer with 20 projects accumulates 4–40 GB in `node_modules` alone. Many of those projects haven't been touched in 90+ days, but their `node_modules` directories persist.
Cleanup pattern: scan project directories for the timestamp of the last git commit. Projects untouched for 90+ days are candidates for deleting `node_modules`, which is easy to recreate with `npm install` if you return to the project.
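The scan described above can be sketched as follows. It asks git for the last commit time and, as an assumption of this sketch, falls back to the modification time of `package.json` when the project is not a git repository. Function names and the 90-day default are illustrative; the code only lists candidates and deletes nothing.

```python
import os
import subprocess
import time
from pathlib import Path


def last_commit_time(project_dir):
    """Unix time of the latest commit, or None if git cannot provide one."""
    try:
        out = subprocess.run(
            ["git", "-C", str(project_dir), "log", "-1", "--format=%ct"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None
    return float(out) if out else None


def stale_node_modules(root, max_age_days=90, now=None, activity=last_commit_time):
    """Return node_modules paths whose project has been idle past the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, dirnames, _files in os.walk(Path(root).expanduser()):
        if "node_modules" not in dirnames:
            continue
        # Do not descend into node_modules; nested packages are not projects.
        dirnames.remove("node_modules")
        project = Path(dirpath)
        manifest = project / "package.json"
        if not manifest.is_file():
            continue  # no manifest means npm install cannot recreate it
        last = activity(project)
        if last is None:
            last = manifest.stat().st_mtime  # fallback, an assumption of this sketch
        if last < cutoff:
            stale.append(project / "node_modules")
    return stale
```

A call such as `stale_node_modules("~/code")` (a hypothetical projects folder) returns paths to review. Uncommitted work does not move the commit timestamp, so a project you are actively editing without committing can still look stale.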
Dependency store accumulation
Python developers accumulate conda envs, pyenv versions, and pipx installs. Node developers accumulate nvm versions. Rust developers accumulate Cargo registry caches. Each tool keeps its data in its own default location, and much of it is reclaimable.
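A small sketch for seeing which environments and versions take the most space. The paths in `DEFAULT_STORES` are the usual defaults as far as this sketch assumes; installs can be relocated, so adjust them for your machine. Sizes are apparent file sizes, which can differ from actual disk usage.

```python
import os
from pathlib import Path

# Assumed default locations; any of these may differ on your machine.
DEFAULT_STORES = {
    "conda envs (miniconda)": "~/miniconda3/envs",
    "conda envs (anaconda)": "~/anaconda3/envs",
    "pyenv versions": "~/.pyenv/versions",
    "nvm versions": "~/.nvm/versions",
    "cargo registry": "~/.cargo/registry",
}


def dir_size(path):
    """Sum of file sizes under path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def store_report(stores=None):
    """Return (store, entry, size_bytes) rows, largest entries first."""
    stores = DEFAULT_STORES if stores is None else stores
    rows = []
    for label, location in stores.items():
        base = Path(location).expanduser()
        if not base.is_dir():
            continue  # tool not installed, or installed somewhere else
        for child in base.iterdir():
            if child.is_dir():
                rows.append((label, child.name, dir_size(child)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

Printing the first few rows of `store_report()` shows where to look first. Removal itself is best done through each tool's own commands rather than by deleting directories by hand.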
The right cleanup order
- Hugging Face / Ollama caches — biggest single wins
- Docker images older than 90 days
- node_modules in projects untouched 90+ days
- Old conda envs / pyenv versions you don't use
- IDE caches (DerivedData, JetBrains caches)
- System caches that traditional cleaners handle
Following this order, a typical AI developer's Mac reclaims 50–150 GB in the first sweep. Subsequent sweeps maintain the recovered headroom.
Where DreamCleanr fits
The patterns above are what DreamCleanr automates. Pre-built scan profiles for AI Developer / Web Developer / Data Scientist surface findings in priority order with last-access timestamps. Cleanup is reversible (30-day staging) so the cost of an aggressive scan is bounded.
Frequently asked questions
- Why doesn't `du -sh ~/.cache` find all of this?
- It does for anything under ~/.cache, which includes the Hugging Face cache at ~/.cache/huggingface/. But Ollama uses ~/.ollama, Docker uses /var/lib/docker/ (inside Docker Desktop's VM disk image on a Mac), conda uses ~/anaconda3 or ~/miniconda3, and build artifacts live in project directories. Each tool picks its own location, so the data ends up scattered across the filesystem.
- Is `docker system prune` enough for Docker cleanup?
- By default it handles dangling images and stopped containers. It does NOT touch tagged images that aren't dangling, however long ago you pulled them, unless you pass `-a`, which removes every image not used by a container. For a developer who pulls many images over time, the un-pruned old images are often the largest reclaimable category.
- How risky is deleting node_modules?
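- Very low risk if the project has a package.json and a lockfile: you can recreate node_modules with `npm install`. The cost is needing internet access and a few minutes when you next return to the project. Without a lockfile, a fresh install may resolve to newer dependency versions than the ones you had.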
- Will cleanup break my running development environment?
- Not if you follow the right order and use a tool that respects active work. DreamCleanr's safe and balanced modes deliberately skip anything that's currently in use; max mode is more aggressive and warns before touching active artifacts.
- How often should I run cleanup?
- Monthly is the right cadence for most AI developers. Weekly if you're heavily experimenting with models. Quarterly for less active projects.