Finding Hidden Space Hogs: Identifying Deleted But Open Files And Other Disk Usage Oddities

Locating Open but Deleted Files

Open deleted files are files that have been removed from the file system but are still held open by running processes. Although the directory entry is gone, the storage space cannot be freed until every open file handle is closed. Identifying these files helps troubleshoot apparent disk space exhaustion, the classic symptom being df reporting far more usage than du can account for.

Using lsof to find open deleted files

The lsof command lists detailed information about files opened by processes. The -n flag inhibits DNS lookups for network files, which speeds up the output considerably. On Linux, lsof appends "(deleted)" to the NAME column for files that have been unlinked, so grepping for that string reveals files that are still held open by processes but deleted from disk.

$ lsof -n | grep '(deleted)'

This outputs the process ID (PID), the file descriptor, and the path of the deleted file each process still holds open. The offending process can then be restarted or signaled to release the space.
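On Linux, the space can sometimes be reclaimed without killing the process by truncating the deleted file through its /proc file descriptor entry. A minimal sketch, assuming a hypothetical PID 1234 holding the file on descriptor 4:

$ ls -l /proc/1234/fd | grep deleted
$ : > /proc/1234/fd/4

Truncation frees the blocks immediately, but restarting the owning process or service remains the safer, more conventional fix.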

Listing process IDs with open deleted files

To narrow down just the PIDs of processes keeping deleted files open, we pipe the lsof output through awk.

$ lsof -n | grep '(deleted)' | awk '{print $2}'

The second field of lsof output contains the PID. grep filters to just the deleted files, then awk prints the PID column. Any PID shown still holds open handles to now-deleted files consuming space.
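Since a single process often holds several deleted files, deduplicating the PIDs and then looking each one up makes the list more actionable. A quick sketch, using a hypothetical PID 1234 from the previous output:

$ lsof -n | grep '(deleted)' | awk '{print $2}' | sort -u
$ ps -p 1234 -o pid,comm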

Getting file paths still held open

If interested specifically in the paths of deleted files still open instead of the associated PIDs, a slight tweak to the awk command will extract that information.

$ lsof -n | grep '(deleted)' | awk '{print $9}'

This pulls out the ninth column of lsof's default output, the NAME field, which contains the file path a process keeps open even though it has already been deleted from disk.
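To rank the deleted files by how much space they pin down, the SIZE/OFF field (column 7 in the same default layout) can be printed alongside the path. A rough sketch; note that SIZE/OFF may hold an offset rather than a size for some descriptor types:

$ lsof -n | grep '(deleted)' | awk '{print $7, $9}' | sort -rn | head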

Identifying Large Hidden Temporary Files

Many applications create temporary working files that can grow very large, yet fail to clean them up properly afterwards. These temporary files build up over time, consuming ever more disk space.

Finding files created by applications

The best place to hunt down unexpectedly large temporary files written by apps is in /tmp. This directory contains volatile working files meant to be deleted, but some have managed to escape cleanup.

$ du -sh /tmp/*

The du command will measure space usage of files and directories in /tmp, identifying any large temporary files created by applications.
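To surface the worst offenders first, pipe du through sort, and use find to flag files that are both large and stale. The 100M and 7-day thresholds below are arbitrary starting points, not recommendations:

$ du -sh /tmp/* 2>/dev/null | sort -rh | head
$ find /tmp -type f -size +100M -mtime +7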

Temp files from web browsers and office apps

Common productivity software often utilizes /tmp and per-user support directories for temporarily storing working files, but fails to remove them later. Firefox caches and Office autosaves, for example, build up over time.

$ rm -rf /tmp/firefox*
$ rm -rf ~/Library/Application\ Support/Microsoft/Office/Preferences/Autosave\ Files/*

Explicitly deleting the temporary files left behind by Firefox browser caches and Microsoft Office autosaves clears the space consumed by aging, unneeded temporary files. Adjust the paths appropriately for the browser and office suite in use.

Cleaning up unneeded temp files

For a more automated, encompassing approach to clearing temporary file build-up, dedicated system cleaner applications target /tmp contents. On Linux, BleachBit sweeps away application working files forgotten in /tmp; CCleaner fills the same role on Windows and macOS.

$ bleachbit

As an added bonus, these tools cover several other temporary storage locations prone to uncontrolled growth when applications mismanage transient working data.
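BleachBit also exposes a command-line mode suited to scheduled cleanup. Cleaner names vary by version, so list them first; system.tmp is assumed here to be the /tmp cleaner on the installed version:

$ bleachbit --list-cleaners
$ bleachbit --preview system.tmp
$ bleachbit --clean system.tmp

Previewing first shows what would be removed without deleting anything.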

Revealing Big Log Files

Log files provide crucial troubleshooting information, but can swell immensely over time. Periodic inspection and cleanup prevent log files from consuming more storage than intended due to verbosity or lack of rotation.

Finding verbose app and system logs

The main logging directories are /var/log for system logs on Linux and ~/Library/Logs for per-user application logging on macOS. Looking at file sizes identifies particularly voluminous logs.

$ ls -lhS /var/log
$ ls -lhS ~/Library/Logs

The -lhS options list files by size with human readable formatting, sorted by largest first. Any exceptionally large log files bubble to the top for inspection. Web server, database, and system utility logging tend to be the most extreme offenders.
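A flat directory listing misses logs nested in subdirectories such as /var/log/apache2, so a recursive search for oversized files is a useful complement:

$ sudo find /var/log -type f -size +50M -exec ls -lh {} \;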

Rotating and cleaning log files

Log rotation automatically archives logs into dated files, keeping current logs small and manageable while still preserving history. Log cleaning takes this a step further by deleting older logs.

$ sudo logrotate -f /etc/logrotate.conf
$ find ~/Library/Logs -name '*.log' -mtime +30 -delete

Configure logrotate rules in /etc/logrotate.conf or /etc/logrotate.d/, then force a rotation with the -f flag. For per-user application logs, find can delete log files that have not been modified in 30 days.
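For reference, a minimal logrotate stanza for a hypothetical application logging under /var/log/myapp/ would typically live in /etc/logrotate.d/myapp:

/var/log/myapp/*.log {
    weekly
    rotate 4
    compress
    delaycompress
    missingok
    notifempty
}

This keeps four weeks of compressed history while tolerating missing or empty logs.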

Log file management tips

Preventing logs from excessive growth comes down to three main techniques:

  • Log Rotation – Regularly archive log contents and start fresh logs
  • Log Cleaning – Automatically delete older archive logs after X days
  • Rate Limiting – Throttle verbose services so they log less aggressively

Taking a tiered approach combining all three helps keep log files from hogging space.
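On systemd-based systems, journald covers all three in one place via /etc/systemd/journald.conf; the values below are illustrative rather than recommendations:

[Journal]
SystemMaxUse=500M
RateLimitIntervalSec=30s
RateLimitBurst=1000

$ sudo systemctl restart systemd-journald

SystemMaxUse caps total journal disk usage, while the rate-limit pair drops messages from any service exceeding the burst count within the interval.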

Inspecting Mounted File Systems

Mounted drives, network shares, and virtual disk images can be overlooked sources of utilized disk space. Periodically inspecting mounts identifies any unnecessary storage locations still attached to the system.

Listing all mounted file systems

The findmnt command comprehensively discovers and displays information on all mounted file systems across media types on Linux.

$ findmnt
$ findmnt -D

Passing -D additionally provides a usage summary for each mounted file system, similar to df output.
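To focus on network shares specifically, findmnt can filter by file system type:

$ findmnt -t nfs,nfs4,cifs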

Spotting old mounts that are no longer needed

Looking through the full list of mounted file systems, old network storage shares, attached drives, and disk image files may stand out as no longer necessary. Evaluate whether each of these mounts is still needed.

$ mount | grep /media/backup_drive1
/dev/sdb2 on /media/backup_drive1 type ext4 (rw,relatime)

Any mount that can no longer be justified should be considered for unmounting. Temporary NAS shares and leftover attached storage are common candidates for reclaiming capacity.

Unmounting unused file systems

File systems identified as unnecessary can be unmounted manually using the umount command.

$ umount /media/backup_drive1

Providing the directory of the mount point as the argument detaches that file system from the directory tree, so its usage no longer appears among the system's mounted storage.
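If umount refuses because the target is busy, identify the processes holding it open before retrying; as a last resort, a lazy unmount detaches the mount point immediately and completes once it is no longer in use:

$ fuser -vm /media/backup_drive1
$ sudo umount -l /media/backup_drive1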

Pinpointing Large Cached Files

File system caches retain commonly accessed data in memory for improved performance, and applications additionally maintain their own cache files on disk. Those on-disk caches can swell immensely over time, consuming substantial storage capacity.

Viewing cached files for applications

File caches are located under the /var/folders directory on macOS and can be inspected for unexpected growth in specific application caches.

$ sudo du -sh /var/folders/* 

The sudo du command measures the size of each folder, enabling identification of overgrown caches. Specific apps like browsers may have dedicated caches worth targeting under /var/folders.
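On macOS the per-user cache directory inside /var/folders can be located directly rather than scanning every randomized subfolder. A sketch using the standard getconf key:

$ du -sh "$(getconf DARWIN_USER_CACHE_DIR)"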

Cleaning browser and app caches

Directly deleting cache files clears up space consumed by stale cached application data. Firefox profile folders containing cache data provide a prime target.

$ rm -rf ~/Library/Caches/Firefox/Profiles/*

Adjust cache directories appropriately for the browser, apps, and platform in use. Clearing a cache forces it to repopulate with only fresh, relevant data on next use.

Managing the system file cache

The kernel's unified buffer/page cache holds file system data in memory for performance. Note that it consumes RAM rather than disk space, so dropping it frees memory, not storage, though it is often confused with on-disk caches when investigating resource usage.

$ sudo sysctl -w vm.drop_caches=3
$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

Either command (they are equivalent) drops the page cache plus reclaimable dentries and inodes, with a performance impact until the cache is rebuilt. Tunables such as vm.vfs_cache_pressure in sysctl.conf adjust how aggressively the kernel reclaims cache memory under pressure.
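To observe what dropping the caches actually reclaims, sample memory usage before and after; free reports RAM, underscoring that this frees memory rather than disk space:

$ free -h
$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
$ free -h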
