Locating Deleted Files Still Open in /proc on Linux

Recovering Accidentally Deleted Open Files

When a file is deleted on Linux while a process still has it open, its data is not immediately removed from disk. Deleting (unlinking) the file removes its directory entry from the parent directory, but the inode and its data blocks are kept for as long as any process holds the file open. The process using the file can therefore keep reading from and writing to it as normal. This creates an opportunity to recover accidentally deleted files that are still in use, by reaching them through their open file handles before the data is freed.
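The behaviour described above can be reproduced safely with a throwaway file. The following is a minimal sketch, assuming a Linux system with /proc mounted; the descriptor number 3 and the variable names are arbitrary choices for illustration.

```shell
# Sketch: a file's data outlives its directory entry while a descriptor is open.
tmpfile=$(mktemp)                  # throwaway file standing in for the "lost" file
echo "important data" > "$tmpfile"

exec 3< "$tmpfile"                 # open descriptor 3 on it in this shell
rm -- "$tmpfile"                   # unlink: the directory entry is gone

[ -e "$tmpfile" ] || echo "path no longer exists"
readlink "/proc/$$/fd/3"           # link target now ends in " (deleted)"

recovered=$(cat "/proc/$$/fd/3")   # the data is still readable via /proc
echo "$recovered"

exec 3<&-                          # last descriptor closed: the kernel may now free the data
```

After the final line nothing holds the file open any more, so the same cat command would fail.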

Searching the /proc Filesystem

The /proc filesystem contains virtual files and directories that dynamically expose information about the system, including attributes of running processes. This view into process internals allows an administrator to inspect the file handles opened by running processes and determine whether any of them refer to files that have been unlinked but whose data has not yet been freed.

Scanning /proc/{pid}/fd

Each running process on a Linux system has a directory /proc/{pid} containing details of the resources in use by that process. One entry is the directory /proc/{pid}/fd, which holds one symbolic link per open file descriptor, named after the descriptor number and pointing at the file it refers to. Reading these links for processes owned by other users generally requires root privileges. Scanning the fd directories for links to deleted files turns up recovery candidates.

For example, running ls -l on the /proc/{pid}/fd directory of a bash shell whose output was redirected to a since-deleted /tmp/test.txt could output:

lr-x------ 1 user user 64 Feb 29 08:12 0 -> /dev/pts/2
l-wx------ 1 user user 64 Feb 29 08:12 1 -> /tmp/test.txt (deleted)
lr-x------ 1 user user 64 Feb 29 08:12 2 -> /dev/pts/2

Here file descriptor 1 still refers to /tmp/test.txt even though test.txt has been unlinked from the /tmp directory. The kernel appends (deleted) to the link target to mark this.
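The scan itself can be sketched as a short shell function. This is an illustration rather than a hardened tool: the function name list_deleted_fds is made up for this example, and the match relies on the " (deleted)" suffix the kernel adds to the link target, so it will also report anonymous memory files and any file whose real name happens to end that way.

```shell
# list_deleted_fds: print "PID FD TARGET" for every descriptor on the system
# whose /proc link target ends in " (deleted)". Hypothetical helper name.
list_deleted_fds() {
    local fd target pid
    for fd in /proc/[0-9]*/fd/*; do
        # Processes may exit, or be unreadable, between the glob and the readlink.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                pid=${fd#/proc/}       # strip the leading "/proc/"
                pid=${pid%%/*}         # keep only the PID component
                printf '%s %s %s\n' "$pid" "${fd##*/}" "$target"
                ;;
        esac
    done
}

list_deleted_fds
```

Run as an ordinary user it only sees that user's processes; run as root it covers the whole system.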

Understanding File Descriptors

File descriptors are small integer handles, unique within a process, assigned to each file the process opens. They allow the process to access the file or other resource through system calls like read() and write() by passing the descriptor number. A descriptor stays bound to its underlying file or pipe until the process closes it or exits. Unlinking the filesystem path of an open file does not close the descriptor, so the data remains reachable and recoverable.

Matching Descriptors to Deleted Files

By iterating through all processes and inspecting their open file handles, descriptors can be matched to recently deleted files whose inodes are still allocated. Unlinked but open files can be recognised by the (deleted) marker and original path in the link target, and confirmed by their size or timestamps. This allows recovery by copying data from the deleted but open file before the last descriptor is closed and the data blocks are freed.
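One way to confirm a candidate is to stat the file behind the descriptor: an unlinked file reports a link count of zero. The sketch below assumes a stat that supports -c format strings (GNU coreutils or BusyBox); describe_fd is a hypothetical helper name.

```shell
# describe_fd PID FD: show link count, size, inode and mtime of the file
# behind a descriptor. -L follows the /proc link to the real inode.
describe_fd() {
    stat -L -c 'links=%h size=%s inode=%i mtime=%y' "/proc/$1/fd/$2"
}
```

A result starting with links=0 indicates that no directory entry refers to the inode any more, and the size can be checked against what the lost file was expected to contain.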

Tracing Open File Handles with lsof

Manually inspecting the /proc filesystem for every running process is time-consuming and error-prone. A more convenient option is the lsof utility, which aggregates and filters information on all open files across every process on the system.

Listing All Open Files with lsof

The lsof command lists detailed information on open files matching specified criteria. Running lsof with no arguments will output all open files across all active processes. This full listing can be filtered to narrow down candidates for recovery of deleted but still open files.

# lsof
COMMAND     PID     USER   FD      TYPE    DEVICE   SIZE/OFF       NODE NAME
bash      1208     user    1w      REG    253,1    157120 34532523573 /tmp/test.txt (deleted)
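The column layout of the default listing varies between lsof versions, so for scripting it is usually safer to parse the field output produced by -F. The sketch below assumes lsof emits one field per line prefixed with p (PID), f (descriptor) and n (name) when called with -F pfn; the function name deleted_from_lsof_fields is hypothetical.

```shell
# deleted_from_lsof_fields: read `lsof -F pfn` output on stdin and print
# "PID FD NAME" for each file whose name ends in " (deleted)".
deleted_from_lsof_fields() {
    awk '
        /^p/ { pid = substr($0, 2) }    # start of a new process section
        /^f/ { fd  = substr($0, 2) }    # start of a new descriptor within it
        /^n/ && / \(deleted\)$/ { print pid, fd, substr($0, 2) }
    '
}
```

Typical use would be lsof -nP -F pfn | deleted_from_lsof_fields, where -n and -P only suppress host and port name lookups to speed up the listing.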

Filtering Output for Deleted Files

Adding filters to the lsof command allows finding candidates for file recovery quickly. Specifically, the +L1 option restricts output to open files with a link count below one, that is, files that have been unlinked but still have an allocated inode. Combined with -c to filter by command name, and -a to require both conditions at once, administrators can home in on specific deleted files.

# lsof -a +L1 -c bash
COMMAND     PID     USER   FD      TYPE    DEVICE   SIZE/OFF NLINK       NODE NAME
bash      1208     user    1w      REG    253,1    157120     0 34532523573 /tmp/test.txt (deleted)

Identifying Process Owners

For each open but deleted file, lsof includes both the source process and owner. This assists in tracking down which application or user still requires access to the unlinked file to prevent premature data loss on close. The process name can identify the associated application, while the user id points to which user context opened the file originally.

Using fuser to Locate Open File Handles

The fuser utility complements lsof for locating file handles opened by active processes. Where lsof is usually driven by process or user filters, fuser starts from a file, directory, or mounted filesystem and lists the processes that have it open.

fuser Basics

A deleted file no longer has a path to pass to fuser, so the practical approach is to invoke fuser -m with the mount point of the filesystem that held the file. This lists every process with files open on that filesystem, including unlinked ones, which narrows down the processes that may still hold the deleted file's data.

# fuser -m /mnt
/mnt:         788c

Finding Who Has a File Open

Running fuser -m against a mount point shows the process ID (PID) of every process holding a file open on that filesystem, including unlinked files. The letter after each PID gives the type of access, for example c for a current directory. This maps open files back to running processes so they can be inspected, or sent signals to make them release their handles if recovery is not necessary.

Identifying Deleted Files in Use

fuser reports only PIDs, so its output has to be combined with the /proc/{pid}/fd links of those processes to tie unlinked inodes back to their original filename and path. Known file sizes and process owners can also help distinguish genuinely deleted files from false positives. This helps limit recovery to the data that is actually needed before the underlying inode is freed.
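Tying the PIDs back to filenames can be sketched by feeding them into a small filter that inspects each process's fd directory. The function name deleted_fds_of_pids is hypothetical, and the sketch relies on fuser writing bare PIDs to standard output while the path and access letters go to standard error.

```shell
# deleted_fds_of_pids: read PIDs on stdin (one per line) and print
# "PID FD TARGET" for each of their descriptors that points at a deleted file.
deleted_fds_of_pids() {
    local pid fd target
    while read -r pid; do
        [ -n "$pid" ] || continue      # skip blank lines
        for fd in "/proc/$pid/fd/"*; do
            target=$(readlink "$fd" 2>/dev/null) || continue
            case "$target" in
                *' (deleted)') printf '%s %s %s\n' "$pid" "${fd##*/}" "$target" ;;
            esac
        done
    done
}
```

With the mount point from the earlier example, this could be used as fuser -m /mnt 2>/dev/null | tr -s ' ' '\n' | deleted_fds_of_pids.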

Recovering Data from Unlinked Open Files

Once a deleted but still open file has been reliably identified through /proc, lsof, or fuser, the data can be copied out before the last file descriptor is closed and the kernel frees the inode and its data blocks. Care must be taken to keep the owning process running, and its handle open, until the copy is complete.

Using cp to Retrieve Data

The most straightforward way to recover an unlinked but open file is the cp utility. By passing cp the /proc/{pid}/fd/{n} path of the descriptor rather than the deleted filename, the full data can be copied out without disturbing existing open handles. The neighbouring fdinfo directory holds only metadata such as the file offset and flags, not the file contents.

# cp /proc/3462/fd/4 recovered_file
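A small wrapper can guard against typing the wrong PID or descriptor before copying. This is a sketch under the assumption that the caller is allowed to read the target process's descriptors; recover_fd is a hypothetical name.

```shell
# recover_fd PID FD DEST: copy the file behind a descriptor to DEST.
recover_fd() {
    local src
    src="/proc/$1/fd/$2"               # fd/, not fdinfo/ (which holds only metadata)
    if [ ! -e "$src" ]; then
        echo "no such descriptor: $src" >&2
        return 1
    fi
    cp -- "$src" "$3"
}
```

For the example above this would be recover_fd 3462 4 recovered_file. If the process is still writing to the file, the copy is only a snapshot of its contents at that moment.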

Dumping File Contents with dd

The dd tool provides another option for data recovery by reading full or partial file contents through the same /proc/{pid}/fd path. Like cp, it opens the file independently and leaves the owning process's handle untouched, so the process can keep using the file during the dump.

# dd if=/proc/3462/fd/4 of=recovered_data.bin
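Because the source disappears once the process exits, it can be worth verifying the dump straight away. The following sketch wraps dd and compares the result with cmp; the function name dump_fd and the 64k block size are arbitrary choices for illustration.

```shell
# dump_fd PID FD DEST: dump the file behind a descriptor with dd, then verify.
dump_fd() {
    local src
    src="/proc/$1/fd/$2"
    dd if="$src" of="$3" bs=64k 2>/dev/null || return 1
    # Succeeds only if the copy is byte-for-byte identical to the source.
    cmp -s "$src" "$3"
}
```

A failed comparison does not necessarily mean the dump is corrupt: if the process is still appending to the file, the source may simply have grown since it was read.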

Closing File Handles Carefully

Once data has been recovered from an open but deleted file, the original handles may still need to stay open for the owning processes to continue operating normally. Handles opened by recovery tools such as cp and dd are closed automatically when those tools exit. Descriptors held by the applications themselves should only be closed, by stopping or restarting the application, once the recovered copy has been verified.

Avoiding Unintended File Deletions

While recovering data from accidentally deleted open files is possible in Linux, prevention is still the best approach. Techniques like trash cans, backups, and user education help avoid situations that require rescuing deleted files via /proc, lsof, or fuser.

Enabling Trash Can Functionality

Instead of immediately unlinking files, Linux desktop environments can be configured to use a trash can model on file deletion. This creates a holding area for mistakenly deleted data to be easily recovered without resorting to searching raw /proc handles.
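On the command line, where rm unlinks immediately, a similar holding area can be approximated with a small shell function. This is a deliberately simplified sketch: the function name trash, the TRASH_DIR variable and the ~/.trash default are all hypothetical, it does not follow the desktop trash specification, and a file with the same name already in the holding area would be overwritten.

```shell
# trash FILE...: move files into a holding directory instead of unlinking them.
trash() {
    local trash_dir
    trash_dir=${TRASH_DIR:-$HOME/.trash}    # hypothetical default location
    mkdir -p -- "$trash_dir" || return 1
    mv -- "$@" "$trash_dir"/
}
```

Files moved this way keep their directory entry under the holding directory, so they can be restored with a plain mv.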

Setting up Automated Backups

Regular file backups provide redundancy for restoring important data following accidental deletion. Combined with snapshotting capabilities, robust backup schemes minimize risk and recovery effort around unintended permanent file removal.

Exercising Caution When Deleting

Training users to double-check what they are about to delete reduces the mistakes that lead to this kind of recovery in the first place. Regularly verifying that trash can and backup systems are working properly improves overall data hygiene as well.
