Why File Is Accessible After Deleting in Unix

Why file is accessible after deleting in unix?

You have basically 2 or 3 unrelated questions there. Text editors like to read the whole file into memory at the start of the editing session. Imagine every character you type being saved to disk immediately, with all characters after it in the file being rewritten one place further along to make room. That would be awful. Much better that the thing you're actually editing is a memory representation of the file (array of pointers to lines, probably with some metadata attached) which only gets converted back into a linear stream when you explicitly save.

Any relatively recent version of vim will notify you if the file you are editing is deleted from its original location with the message

E211: File "filename" no longer available

This warning is not just for unix. gvim on Windows will give it to you if you delete the file being edited. It serves as a reminder that you need to save the version you're working on before you exit, if you don't want the file to be gone.

(Note: the warning doesn't appear instantly - vim only checks for the original file's existence when you bring it back into the foreground after having switched away from it.)

So that's question 1, the behavior of text editors - there's no reason for them to keep the file open for the whole session because they aren't actually using it except at startup and during a save operation.

Question 2, why do some Windows editors keep the file open and locked - I don't know, Windows people are nuts.

Question 3, the one that's actually about unix, why do open files stay accessible after they're deleted - this is the most interesting one. The answer, guaranteed to shock you when presented directly:

There is no command, function, syscall, or any other method which actually requests deletion of a file.

Underlying rm and any other command that may appear to delete a file there is the system call unlink. And it's called unlink, not remove or deletefile or anything similar, because it doesn't remove a file. It removes a link (a.k.a. directory entry) which is an association between a file and a name in a directory. (Note: ANSI C added remove as a more generic function to appease non-unix people who had no intention of implementing unix filesystem semantics, but on unix, remove is just a rmdir if the target is a directory, and unlink for everything else.)

A file can have multiple links (see the ln command for how they are created), which means that the same file is known by multiple names. If you rm one of them, the others stick around and the file is not deleted. What happens when you remove the last link? Well, now you have a file with no name. But names are only one kind of reference to a file. There are at least 2 others: file descriptors and mmap regions. When the last reference to a file goes away, that's when the file is deleted.
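To make the "file with no name" idea concrete, here is a minimal sketch in Python, whose os module exposes the underlying calls fairly directly. The function name read_after_unlink is made up for illustration, and the link count of 0 assumes a local Unix filesystem (network filesystems such as NFS can behave differently).

```python
import os
import tempfile


def read_after_unlink(payload: bytes):
    """Remove a file's only name, then keep using it through the descriptor."""
    fd, path = tempfile.mkstemp()        # creates the file, returns an open descriptor
    try:
        os.write(fd, payload)
        os.unlink(path)                  # removes the last name (directory entry)
        name_exists = os.path.exists(path)
        link_count = os.fstat(fd).st_nlink   # names still pointing at the file
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))     # contents are still there
    finally:
        os.close(fd)                     # last reference gone, file can be freed
    return name_exists, link_count, data
```

Calling read_after_unlink(b"still here") should report that the name is gone and the link count is 0, yet the bytes written earlier are still readable.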

Since references come in several forms, there are many kinds of events that can cause a file to be deleted. Here are some examples:

  • unlink (rm, etc.)
  • close file descriptor
    • dup2 (can implicitly close a file descriptor before replacing it with a copy of a different file descriptor)
    • exec (can cause file descriptors to be closed via close-on-exec flag)
  • munmap (unmap memory region)
    • mmap (if you create a new memory map at an address that's already mapped, the old mapping is unmapped)
  • process death (which closes all file descriptors and unmaps all memory mappings of the process)
    • normal exit
    • fatal signal generated by the kernel (^C, segfault)
    • fatal signal sent from another process (kill)

I won't call that a complete list. And I don't encourage anyone to try to build a complete list. Just know that rm is "remove name", not "remove file", and files go away as soon as they're not in use.
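As a rough illustration of "the last reference" in the list above, the sketch below unlinks a file and then closes one of two descriptors that refer to it. The function name is hypothetical; os.dup is used here only as a convenient way to get a second descriptor.

```python
import os
import tempfile


def survives_closing_one_descriptor(payload: bytes) -> bytes:
    """Show that a nameless file lives on while any descriptor remains."""
    fd, path = tempfile.mkstemp()
    os.write(fd, payload)
    os.unlink(path)              # no names left, only descriptors refer to the file
    second = os.dup(fd)          # a second descriptor for the same open file
    os.close(fd)                 # one reference closed, one remains
    try:
        os.lseek(second, 0, os.SEEK_SET)
        return os.read(second, len(payload))
    finally:
        os.close(second)         # the last reference is released here
```

Only the final close leaves the file with nothing referring to it, which is the point at which the system can actually delete it.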

If you want to destroy the contents of a file immediately, truncate it. All processes already using it will find that its size has suddenly become 0. (This is destruction as far as the normal file access methods are concerned. To destroy it more thoroughly so that even someone with raw disk access can't read what used to be there, you need to overwrite it. There's a tool called shred for that.)
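A small sketch of the truncation effect, assuming one descriptor is already open when something else truncates the file by name. The function name truncate_under_reader is invented for this example.

```python
import os
import tempfile


def truncate_under_reader(payload: bytes):
    """Truncate a file by name while a descriptor on it is already open."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        size_before = os.fstat(fd).st_size
        os.truncate(path, 0)                 # truncate via the name, as another process might
        size_after = os.fstat(fd).st_size    # the open descriptor sees the new size
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 1024)             # nothing left to read
    finally:
        os.close(fd)
        os.unlink(path)
    return size_before, size_after, data
```

For a 5-byte payload this returns (5, 0, b""): the descriptor stays valid, but the contents are gone for everyone at once.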

What happens to an open file handle on Linux if the pointed file gets moved or deleted

If the file is moved (in the same filesystem) or renamed, then the file handle remains open and can still be used to read and write the file.

If the file is deleted, the file handle remains open and can still be used (this is not what some people expect). The file will not really be deleted until the last handle is closed.

If the file is replaced by a new file, it depends exactly how. If the file's contents are overwritten in place, the file handle will still be valid and will access the new content. If the existing file is unlinked and a new one is created with the same name, or if a new file is moved onto the existing file using rename(), it's the same as deletion (see above): the file handle will continue to refer to the original version of the file.

In general, once the file is open, the file is open, and nobody changing the directory structure can change that - they can move, rename the file, or put something else in its place, it simply remains open.
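The three cases above (rename, overwrite in place, replacement via rename()) can be sketched roughly as follows. The file names and helper functions are made up for the example, and raw descriptors are used instead of buffered file objects so that no userspace cache hides what the kernel returns.

```python
import os
import shutil
import tempfile


def read_all(fd: int) -> bytes:
    """Read from the start of the file through an existing descriptor."""
    os.lseek(fd, 0, os.SEEK_SET)
    return os.read(fd, 4096)


def write_file(path: str, data: bytes) -> None:
    """Create the file, or truncate and rewrite it in place if it exists."""
    with open(path, "wb") as f:
        f.write(data)


def observe_replacements():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "data.txt")
    moved = os.path.join(workdir, "moved.txt")
    fresh = os.path.join(workdir, "new.txt")
    write_file(path, b"original")
    fd = os.open(path, os.O_RDONLY)          # handle opened before any change
    try:
        os.rename(path, moved)               # rename: handle keeps working
        after_rename = read_all(fd)
        write_file(moved, b"overwritten")    # same file, new contents
        after_overwrite = read_all(fd)
        write_file(fresh, b"replacement")
        os.replace(fresh, moved)             # rename() onto the existing name
        after_replace = read_all(fd)         # handle still sees the old file
        with open(moved, "rb") as f:
            by_name = f.read()               # the name now leads to the new file
    finally:
        os.close(fd)
        shutil.rmtree(workdir)
    return after_rename, after_overwrite, after_replace, by_name
```

The result is (b"original", b"overwritten", b"overwritten", b"replacement"): only the in-place overwrite is visible through the old handle, while the rename()-based replacement is visible only to code that opens the name again.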

In Unix there is no delete, only unlink(), which makes sense as it doesn't necessarily delete the file - just removes the link from the directory.


If on the other hand the underlying device disappears (e.g. USB unplug) then the file handle won't be valid any more and is likely to give an I/O error on any operation. You still have to close it though. This is going to be true even if the device is plugged back in, as it's not sensible to keep a file open in this case.

Cronjob not ending when deleting files

That means that the files are still in use by some other process. In Unix, when you delete a file, the inode and the space are not reclaimed as long as some process still has a file descriptor open on it.

You can easily find which processes are holding it using lsof /var/log
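If lsof isn't handy, a process can inspect its own descriptors on Linux through /proc. This is a Linux-specific sketch (the /proc/self/fd layout is an assumption that doesn't hold on other Unix systems), and the function name is hypothetical.

```python
import os
import tempfile


def deleted_fd_target():
    """Return where /proc says an unlinked-but-open descriptor points.

    Returns None when /proc/self/fd is not available (non-Linux systems).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.unlink(path)                      # delete the name, keep the descriptor
        link = f"/proc/self/fd/{fd}"
        if not os.path.islink(link):
            return None                      # no Linux-style /proc here
        return os.readlink(link)             # original path plus a deletion marker
    finally:
        os.close(fd)
```

On Linux the returned string is typically the old path followed by " (deleted)", which is the same kind of marker lsof shows for files that are unlinked but still held open.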

Delete file while another process still writes to it

From The Linux Programming Interface by Michael Kerrisk

In addition to maintaining a link count for each i-node, the kernel also counts open file descriptions for the file (see Figure 5-2, on page 95). If the last link to a file is removed and any processes hold open descriptors referring to the file, the file won’t actually be deleted until all of the descriptors are closed.

Your script has opened the file, so it holds an open descriptor referring to the file. The system won't delete the file right after you remove it, e.g. with the rm command, but it will delete it once your script closes the descriptor.

I found a reference in the man pages, from man remove:

remove() deletes a name from the filesystem. It calls unlink(2) for files, and rmdir(2) for directories.

If the removed name was the last link to a file and no processes have the file open, the file is deleted and the space it was using is made available for reuse.

If the name was the last link to a file, but any processes still have the file open, the file will remain in existence until the last file descriptor referring to it is closed.
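A rough sketch of the writer's side of this, assuming the writer opened the file before it was removed. The function name write_after_unlink is invented for the example.

```python
import os
import tempfile


def write_after_unlink(chunks):
    """Keep writing to a file after its name has been removed."""
    fd, path = tempfile.mkstemp()
    try:
        os.unlink(path)                  # the "rm" happens while the writer has it open
        for chunk in chunks:
            os.write(fd, chunk)          # writes still succeed
        size = os.fstat(fd).st_size      # the file keeps growing and using space
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, size)
    finally:
        os.close(fd)                     # only now can the space be reclaimed
    return size, data
```

With two 7-byte chunks the function returns a size of 14 and both chunks back, so nothing written after the removal is lost until the descriptor is closed.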

Just as @oglu mentioned in his answer, this is not portable behavior. On Windows, you may choose whether it should be possible to remove a file that you have opened.

From the CreateFile function documentation on MSDN

FILE_SHARE_DELETE (0x00000004)

Enables subsequent open operations on a file or device to request delete access.

Otherwise, other processes cannot open the file or device if they request delete access.

If this flag is not specified, but the file or device has been opened for delete access, the function fails.

Note Delete access allows both delete and rename operations.

Can a hard link ever point to a deleted file?

Consider an example:

$ touch aFile.txt
$ ls -i aFile.txt # -i is the option to look at the inode of a file
2621520 aFile.txt

$ ln aFile.txt 2File.txt # Hardlink the file to another one
$ ls -i 2File.txt
2621520 2File.txt # inode is pointing to the same location

$ rm aFile.txt # Original file gets deleted
$ ls 2File.txt
2File.txt

$ ls -i 2File.txt # inode survives and still pointing to the same location
2621520 2File.txt


EDIT:
stat can show you the number of hardlinks of a file. You can use -c '%h' option to view that:

# after the hardlink to 2File.txt
$ stat -c '%h' aFile.txt
2
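The same experiment can be scripted. The sketch below mirrors the shell session using the same file names inside a temporary directory; the function name link_counts is hypothetical, and it assumes the temporary directory lives on a filesystem that supports hard links.

```python
import os
import shutil
import tempfile


def link_counts():
    """Track the inode and hard-link count while names are added and removed."""
    workdir = tempfile.mkdtemp()
    first = os.path.join(workdir, "aFile.txt")
    second = os.path.join(workdir, "2File.txt")
    try:
        open(first, "w").close()                    # touch aFile.txt
        counts = [os.stat(first).st_nlink]          # one name so far
        os.link(first, second)                      # ln aFile.txt 2File.txt
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        counts.append(os.stat(first).st_nlink)      # two names, one file
        os.unlink(first)                            # rm aFile.txt
        counts.append(os.stat(second).st_nlink)     # the file survives under one name
    finally:
        shutil.rmtree(workdir)
    return same_inode, counts
```

This returns (True, [1, 2, 1]): both names share one inode, and removing either name only decrements the count.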

