Relinking an Anonymous (Unlinked But Open) File

A patch for a proposed Linux flink() system call was submitted several years ago, but when Linus stated "there is no way in HELL we can do this securely without major other incursions", that pretty much ended the debate on whether to add this.

Update: As of Linux 3.11, it is now possible to create a file with no directory entry using open() with the new O_TMPFILE flag, and link it into the filesystem once it is fully formed using linkat() on /proc/self/fd/fd with the AT_SYMLINK_FOLLOW flag.

The following example is provided on the open() manual page:

    char path[PATH_MAX];
    fd = open("/path/to/dir", O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);

    /* File I/O on 'fd'... */

    snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
    linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file", AT_SYMLINK_FOLLOW);

Note that linkat() will not allow open files to be re-attached after the last link is removed with unlink().

Perl: Undo unlink with file open

These answers say: 'No, not in general, and definitely not on all Unices':

Relinking an anonymous (unlinked but open) file

https://serverfault.com/questions/168909/relinking-a-deleted-file

Copying the content from the open filehandle may work.
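If copying is the route taken, a minimal sketch might look like the following. It is written in C rather than Perl, and it assumes a POSIX system and a descriptor that is still open and supports pread() (a regular file, not a pipe). The helper name copy_fd_to_path is made up for illustration. Note that the result is a new file with a new inode, not the original file relinked.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything readable from 'fd' into a new file
 * at 'dest'. Returns 0 on success, -1 on error (errno is set).
 * pread() is used so the descriptor's own file offset is left untouched. */
int copy_fd_to_path(int fd, const char *dest)
{
    char buf[65536];
    off_t off = 0;
    int out = open(dest, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (out < 0)
        return -1;

    for (;;) {
        ssize_t n = pread(fd, buf, sizeof buf, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the read */
            goto fail;
        }
        if (n == 0)
            break;              /* end of file */
        off += n;

        /* write() may be partial, so loop until the chunk is written */
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;   /* interrupted, retry the write */
                goto fail;
            }
            done += w;
        }
    }
    return close(out);

fail:
    {
        int saved = errno;      /* keep the original error for the caller */
        close(out);
        unlink(dest);           /* do not leave a partial copy behind */
        errno = saved;
    }
    return -1;
}
```

O_EXCL is used so that an existing file at the destination is never overwritten by accident; drop it if overwriting is what you want.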

Is it possible to create an unlinked file on a selected filesystem?

The ability to do so is OS-specific, since the relevant POSIX function calls all result in a link being generated. Linux in particular has allowed, since version 3.11, the use of O_TMPFILE in the flags argument of open(2) in order to create an anonymous file in a given directory.

What Exactly are Anonymous Files

My C is very rusty, so hopefully more experienced people can correct me, but I think the answer to your question "What exactly is this anonymous file? Does it exist on disk, or only in memory?" is "It exists on disk".

Here is what happens at C level (I'm looking at the source code at http://cran.r-project.org/src/base/R-3/R-3.0.2.tar.gz):

A. Function file_open, defined in src/main/connections.c:554, has the following logic for an anonymous file (one with an empty description) at lines 565-568:

    if(strlen(con->description) == 0) {
        temp = TRUE;
        name = R_tmpnam("Rf", R_TempDir);
    } else name = R_ExpandFileName(con->description);

So a new temporary filename is generated if no file name was supplied to file.

B. If the name of the file is not equal to stdin, the call R_fopen(name, con->mode) happens at line 585 (there are some subtleties with Win32 and UTF-8 names, but we can ignore them for now).

C. Next, the file name is unlinked at line 607. The documentation for unlink says:

The unlink() function removes the link named by path from its directory and decrements the link count of the file which was referenced by the link. If that decrement reduces the link count of the file to zero, and no process has the file open, then all resources associated with the file are reclaimed. If one or more process have the file open when the last link is removed, the link is removed, but the removal of the file is delayed until all references to it have been closed.

So in effect the directory entry is removed, but the file continues to exist for as long as the R process keeps it open.

D. Finally, R_fopen is defined in src/main/sysutils.c:135 and just calls fopen internally.
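The effect of steps B and C can be reproduced outside R with a few lines of C. This is only a sketch, assuming a local POSIX filesystem; open_anonymous and link_count are hypothetical helper names and do not come from the R sources.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create 'path', unlink it immediately, and return the
 * still-open descriptor (or -1 on error). After this call the file has no
 * directory entry, but its storage stays allocated until close(). */
int open_anonymous(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Link count of the open file as reported by fstat(), or -1 on error. */
long link_count(int fd)
{
    struct stat st;
    if (fstat(fd, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}
```

After open_anonymous() returns, the path no longer exists and fstat() typically reports a link count of 0 on local filesystems, yet reads and writes through the descriptor keep working until it is closed.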

Is there a way to completely remove an inode when the Link count is 2?

No, there isn't anything that does what you want out of the box.

It might be useful to do the deletion when unlinking the hard link and noticing that the link count is 1, since at that point the inode should still be cached in memory; this of course depends on knowing the name of the file in the cache directory.

Check the validity of a file pointer in C while the file is open

Well, usually an open file will prevent unmounting the filesystem, so it shouldn't just disappear under you. Though with USB disks etc, there is of course the possibility of the user pulling the device without asking the system.

But it would be nice of the process not to prevent clean unmounting. That requires two things:

  1. Don't keep the file open.
  2. Don't keep the containing directory as the process's working directory.

Running stat(2) periodically on the path would be the way to do this. You can detect changes to the file from changes in mtime, ctime, or the file size. Errors, or changes in the inode number (st_ino) or the containing device (st_dev), might indicate that the file is no longer accessible or isn't the same file any more. React depending on application requirements.

(That is, assuming you're interested in the file currently pointed to by that name, and not in the same inode you opened.)
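A rough sketch of that polling approach in C is below. The enum, check_file, and poll_file are hypothetical names chosen for illustration, and the set of fields compared is an assumption you would adjust to your own requirements. Note that st_mtime and st_ctime have one-second granularity here, so very quick edits that keep the size unchanged can be missed.

```c
#include <sys/stat.h>
#include <unistd.h>

enum file_change { FILE_SAME, FILE_MODIFIED, FILE_REPLACED, FILE_GONE };

/* Hypothetical helper: compare the current state of 'path' against an
 * earlier snapshot taken with stat(). */
enum file_change check_file(const char *path, const struct stat *before)
{
    struct stat now;
    if (stat(path, &now) < 0)
        return FILE_GONE;           /* ENOENT, EIO, device removed, ... */
    if (now.st_dev != before->st_dev || now.st_ino != before->st_ino)
        return FILE_REPLACED;       /* the name now refers to another file */
    if (now.st_size != before->st_size ||
        now.st_mtime != before->st_mtime ||
        now.st_ctime != before->st_ctime)
        return FILE_MODIFIED;
    return FILE_SAME;
}

/* Poll 'path' up to 'max_polls' times, sleeping 'interval' seconds between
 * checks. Returns the first state that is not FILE_SAME, or FILE_SAME if
 * nothing changed during the whole run. */
enum file_change poll_file(const char *path, const struct stat *before,
                           unsigned interval, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        enum file_change c = check_file(path, before);
        if (c != FILE_SAME)
            return c;
        sleep(interval);
    }
    return FILE_SAME;
}
```

Because the snapshot is taken by name and the file is not kept open, this does not block a clean unmount between polls.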

As for I/O, it's likely that periodically calling stat() on something would keep the inode cached in memory, so the issue would be more about memory use than I/O. (Until you do this on enough files that they can't all be cached, which leads to thrashing, an issue of both memory and I/O...) Seeking to the end of the file would similarly require loading the length of the file; I can't see why that would cause any significant I/O.

Another choice would be to use inotify(7) on the file or the whole directory to detect changes without polling. It can also detect unmount events.
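For the inotify route, something along these lines could serve as a starting point on Linux. The helper names watch_directory and wait_for_event are invented for this sketch, and the event mask is just one plausible choice. Watching the directory rather than the file itself means no descriptor for the file has to stay open.

```c
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: start watching 'dir' for deletions, renames, writes
 * and attribute changes of its entries. Returns the inotify descriptor,
 * or -1 on error. */
int watch_directory(const char *dir)
{
    int ifd = inotify_init1(IN_CLOEXEC);
    if (ifd < 0)
        return -1;
    if (inotify_add_watch(ifd, dir,
                          IN_DELETE | IN_MOVED_FROM | IN_MODIFY |
                          IN_ATTRIB | IN_DELETE_SELF) < 0) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/* Block until an event arrives whose mask intersects 'wanted' and, if
 * 'name' is not NULL, whose entry name equals 'name'. Returns that event's
 * full mask, or 0 on read error. Events that do not match are discarded,
 * as are any events that follow the match in the same read() buffer.
 * Pass IN_UNMOUNT in 'wanted' to react to the filesystem going away; the
 * kernel reports it even though it was not requested in the watch mask. */
uint32_t wait_for_event(int ifd, uint32_t wanted, const char *name)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));

    for (;;) {
        ssize_t n = read(ifd, buf, sizeof buf);
        if (n <= 0)
            return 0;

        /* one read() can return several variable-length events */
        for (char *p = buf; p < buf + n; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            int name_ok = (name == NULL) ||
                          (ev->len > 0 && strcmp(ev->name, name) == 0);
            if ((ev->mask & wanted) && name_ok)
                return ev->mask;
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
}
```

A real program would more likely add the inotify descriptor to its poll() or epoll loop instead of blocking in read().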


