Does Each Unix File Description Have Its Own Read/Write Buffers

Does each Unix file description have its own read/write buffers?

This depends a bit on whether you are talking about sockets or actual files.

Strictly speaking, a descriptor never has its own buffers; it's just a handle to a deeper abstraction.

File system objects have their "own" buffers, at least when they are required. That is, if a program writes less than the file system block size, the kernel has no choice but to read a FS block and merge the write with the existing data.

This buffer is attached to the vnode and at a lower level, possibly an inode. It's owned by the file and not the descriptor. It may be kept around for a long time if memory is available.
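One way to see that the cached data belongs to the file rather than to a descriptor is to open the same path twice and pass data between the two descriptors without ever forcing it to disk. The following is only a minimal sketch: the function name `visible_through_other_fd` and the path argument are made up for illustration, and error handling is reduced to a single failure code.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Writes msg through one descriptor and reads it back through a second,
 * independently opened descriptor for the same path.
 * Returns 0 if the second descriptor sees the data, -1 otherwise.
 * (Hypothetical helper, for illustration only.) */
int visible_through_other_fd(const char *path, const char *msg)
{
    char buf[64];
    size_t len = strlen(msg);
    int result = -1;

    if (len > sizeof buf)
        return -1;

    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd == -1)
        return -1;
    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        close(wfd);
        return -1;
    }

    /* No fsync() here: the data only has to reach the kernel's buffers,
     * which both descriptors share because they refer to the same file. */
    if (write(wfd, msg, len) == (ssize_t)len &&
        read(rfd, buf, sizeof buf) == (ssize_t)len &&
        memcmp(buf, msg, len) == 0)
        result = 0;

    close(rfd);
    close(wfd);
    return result;
}
```

Calling `visible_through_other_fd("/tmp/demo.txt", "hello")` should return 0 on an ordinary local file system.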

In the case of a socket, the stream (though not any single descriptor) does have buffers that it owns.

Are Unix reads and writes to a single file atomically serialized?

Separate write() calls are processed separately, not as a single atomic write transaction, and interleaving is entirely possible when multiple processes/threads are writing to the same file. The order of the actual writes is determined by the schedulers (both kernel process scheduler, and for "green" threads the thread library's scheduler).

Unless you specify otherwise (O_DIRECT open flag or similar, if supported), read() and write() operate on kernel buffers and read() will use a loaded buffer in preference to reading the disk again.

Note that this may be complicated by local file buffering; for example, stdio and iostreams will read file data by blocks into a buffer in the process which is independent of kernel buffers, so a write() from elsewhere to data that are already buffered in stdio won't be seen. Likewise, with output buffering there won't be any actual kernel-level output until the output buffer is flushed, either automatically because it has filled up or manually due to fflush() or C++'s endl (which implicitly flushes the output buffer).
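The stale-read effect described above can be reproduced with a three-byte file. This is a sketch under assumptions: the helper name `stale_stdio_read` is invented, and it relies on stdio reading the whole (tiny) file into its buffer on the first `fgetc()`, which is what common implementations such as glibc do but which the C standard does not promise.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Creates path with the contents "abc", reads the first byte through stdio,
 * overwrites the second byte with 'X' through a raw descriptor, and then
 * reports what stdio and read() each return for the second byte.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int stale_stdio_read(const char *path, int *stdio_byte, int *raw_byte)
{
    unsigned char c;
    int rc = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (write(fd, "abc", 3) != 3) {
        close(fd);
        return -1;
    }
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        close(fd);
        return -1;
    }

    /* The first fgetc() usually makes stdio read a whole block, so "abc"
     * now sits in a buffer inside this process. */
    if (fgetc(fp) == 'a' &&
        lseek(fd, 1, SEEK_SET) == 1 &&
        write(fd, "X", 1) == 1) {
        *stdio_byte = fgetc(fp);        /* served from the stdio buffer */
        if (lseek(fd, 1, SEEK_SET) == 1 && read(fd, &c, 1) == 1) {
            *raw_byte = c;              /* served from the kernel's buffers */
            rc = 0;
        }
    }

    fclose(fp);
    close(fd);
    return rc;
}
```

With a typical stdio implementation, `*stdio_byte` comes back as the old `'b'` while `*raw_byte` is the new `'X'`.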

unix open file for write given file descriptor

Is there any other way to achieve the same, i.e., write to a file given a file descriptor?

You can write directly using the system call write(2).

write(fd, "\n", 1);

Write to file descriptor and immediately read from it

The send() and recv() functions are for use with sockets (send: send a message on a socket — recv: receive a message from a connected socket). See also the POSIX description of Sockets in general.

Socket file descriptors are bi-directional — you can read and write on them. You can't read what you wrote, unlike with pipe file descriptors. With pipes, the process writing to the write end of a pipe can read what it wrote from the read end of the pipe — if another process didn't read it first. When a process writes on a socket, that information goes to the peer process and cannot be read by the writer.
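The difference between the two descriptor types can be checked in a single process. The sketch below uses `pipe()` and a Unix-domain `socketpair()` as a stand-in for a connected socket. Both function names are hypothetical, and the socket end is switched to non-blocking mode only so that the "nothing to read" case returns immediately instead of hanging.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if data written to a pipe's write end can be read back from
 * its read end by the same process, -1 otherwise. */
int pipe_reads_own_write(void)
{
    int fds[2];
    char buf[8];

    if (pipe(fds) == -1)
        return -1;
    int ok = write(fds[1], "ping", 4) == 4 &&
             read(fds[0], buf, sizeof buf) == 4 &&
             memcmp(buf, "ping", 4) == 0;
    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}

/* Returns 0 if data written on one end of a socket pair arrives at the
 * peer end and is NOT readable on the end that wrote it, -1 otherwise. */
int socket_delivers_to_peer(void)
{
    int sv[2];
    char buf[8];
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Non-blocking, so that reading on the writer's own end fails at once
     * instead of waiting for the peer to send something. */
    int flags = fcntl(sv[0], F_GETFL);
    if (flags != -1 &&
        fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != -1 &&
        write(sv[0], "ping", 4) == 4) {
        ssize_t own = read(sv[0], buf, sizeof buf);
        int nothing_for_writer =
            own == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
        ssize_t peer = read(sv[1], buf, sizeof buf);
        ok = nothing_for_writer && peer == 4 && memcmp(buf, "ping", 4) == 0;
    }

    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Both functions return 0 when the behaviour matches the description: the pipe hands the writer its own data back, while the socket delivers it only to the other end.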

Why do unbuffered read()/write() operations use buffer cache?

Basically, the term "buffering" here means "a place where data is stored on its way to or from the kernel": to avoid making one system call for each I/O call, the buffered functions put a buffer of their own in between.

What the kernel does with the data is not something the standard library can do much about.

It would be possible to do a 1:1 mapping of read/write calls at the standard library's level (i.e. fread() and friends) to read()/write() calls on the underlying file descriptor; the term buffering is telling you that is not what you can expect.
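A quick way to observe that a stdio call does not map 1:1 onto a write() is to compare the file size the kernel reports before and after a flush. This is a sketch with invented helper names (`file_size`, `sizes_around_fflush`); it assumes the stream is fully buffered, which is the usual default for regular files.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns the size of path in bytes as the kernel sees it, or -1 on error. */
long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Writes "hello" with fputs() and records the kernel-visible file size
 * before and after fflush(). Returns 0 on success, -1 on error.
 * (Hypothetical helper.) */
int sizes_around_fflush(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (fputs("hello", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    *before = file_size(path);  /* the 5 bytes are still in the stdio buffer */
    if (fflush(fp) == EOF) {
        fclose(fp);
        return -1;
    }
    *after = file_size(path);   /* fflush() has handed them to write() */
    return fclose(fp) == 0 ? 0 : -1;
}
```

On a typical system `*before` is 0 and `*after` is 5: the `fputs()` call alone never reached the kernel.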

When does the actual write() take place in C?

The write() system call (in fact, any system call) is nothing more than a contract between the application program and the OS.

  • for "normal" files, the write() only puts the data on a buffer, and marks that buffer as "dirty"
  • at some time in the future, these dirty buffers will be collected and actually written to disk. This can be forced by fsync()
  • this is done by the .write() "method" in the mounted-filesystem-table
  • and this will invoke the hardware's .write() method. (which could involve another level of buffering, such as DMA)
  • modern hard disks have their own buffers, which may or may not have actually been written to the physical disk, even if the OS->controller told them to.
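The first two steps of the list above can be sketched as follows. The helper name `write_durably` is made up, and a short write is simply treated as a failure to keep the example small. As the last bullet notes, even a successful fsync() only covers what the OS can control.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Writes len bytes to path and asks the kernel to push the resulting
 * dirty buffers to the storage device before returning.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int write_durably(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    /* After this call the data is normally only in a dirty kernel buffer. */
    ssize_t n = write(fd, data, len);

    /* fsync() blocks until the kernel has handed the data to the device. */
    if (n != (ssize_t)len || fsync(fd) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A call such as `write_durably("/tmp/demo.txt", "data", 4)` returns 0 once the write has been flushed.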

Now, some (abnormal) files don't have a write() method to support them. Imagine open()ing "/dev/null", and write()ing a buffer to it. The system could choose not to buffer it, since it will never be written anyway.

Also note that the behaviour of write() depends on the nature of the file. For network sockets, write(fd, buff, size) can return before size bytes have been sent (write() returns the number of bytes it actually accepted). But it is impossible to find out where they are once they have been sent. They could still be in a network buffer (e.g. waiting for Nagle ...), or a buffer inside the network interface, or a buffer in a router or switch somewhere on the wire.

When does the write() system call write all of the requested buffer versus just doing a partial write?

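Check the return value first. If write() returns -1, errno tells you why the call failed, for example because a signal interrupted it before any data was written. If it returns a positive count smaller than the one you requested, the write was partial; errno is not set in that case, and you should retry with the remaining bytes.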

From man 2 write

When using non-blocking I/O on objects such as sockets that are subject to flow control, write() and writev() may write fewer bytes than requested; the return value must be noted, and the remainder of the operation should be retried when possible.

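Basically, apart from non-blocking descriptors, a short write typically happens when a signal interrupts the call after some data has been written, when the underlying medium runs out of space, or when a file size limit is reached. A signal that arrives before any data has been written makes the call fail instead: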

[EINTR] A signal interrupted the write before it could be completed.

See the Errors section in the man page for more information on what can be returned, and when it will be returned. From there you need to figure out if the error is severe enough to log an error and quit, or if you can continue the operation at hand!
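The usual way to cope with short writes is a small retry loop. The version below is a minimal sketch for blocking descriptors; the name `write_all` is a common convention rather than a standard function, and on a non-blocking descriptor it would simply return -1 with errno set to EAGAIN instead of waiting.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Calls write() repeatedly until all len bytes have been written.
 * Returns 0 on success, -1 on error with errno left as set by write(). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written: retry */
            return -1;      /* a real error: let the caller decide what to do */
        }
        /* Partial write: skip what was accepted and retry with the rest. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller then only has to distinguish success from failure, for example `if (write_all(fd, msg, strlen(msg)) == -1) perror("write");`.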

This is all discussed in the book Advanced UNIX Programming by Marc J. Rochkind. I have written countless programs with the help of this book, and would recommend it to anyone programming for a UNIX-like OS.

Understanding `read, write` system calls in Unix

read() will read the bytes in without any interpretation (so "binary" mode).

Since the data is binary and you want to access the individual bytes, you should use a buffer of unsigned char:
unsigned char buffer[BUFFER]. You can regard char/unsigned char as bytes; they'll be 8 bits on Linux.

Now, since what you're dealing with is 8-bit ASCII compressed down to 7 bits per character, you'll have to convert those 7 bits into 8 bits again so you can make sense of the data.

To explain what's been done, consider the text Hey. That's 3 bytes. The bytes have 8 bits each, and in ASCII those are the bit patterns:

01001000 01100101 01111001

Now remove the most significant bit from each byte and shift the remaining bits to the left to close the gaps.

X1001000 X1100101 X1111001

Above, X is the bit to be removed. Removing those and shifting the others, you end up with bytes with this pattern:

10010001 10010111 11001000

The rightmost 3 bits are just filled in with 0. So far, no space is saved though: there are still 3 bytes.
With a string of 8 bytes, we'd have saved 1 byte, as that would compress down to 7 bytes.

Now you have to do the reverse on the bytes you've read back in.
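As a sketch of both directions, the functions below pack 7-bit characters exactly as in the Hey example (most significant bit first, zero padding at the end) and unpack them again. The names `pack7` and `unpack7` are made up, and the bit layout is an assumption taken from the worked example; the format of your actual file may differ.

```c
#include <stddef.h>
#include <string.h>

/* Packs n 7-bit ASCII characters from in into out, dropping the top bit of
 * each byte and storing the remaining bits back to back, MSB first.
 * out must have room for (n * 7 + 7) / 8 bytes. Returns the bytes used. */
size_t pack7(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t outlen = (n * 7 + 7) / 8;
    size_t bitpos = 0;                      /* next free bit in out */

    memset(out, 0, outlen);                 /* padding bits stay 0 */
    for (size_t i = 0; i < n; i++) {
        for (int b = 6; b >= 0; b--, bitpos++) {
            if (in[i] & (1u << b))
                out[bitpos / 8] |= (unsigned char)(0x80u >> (bitpos % 8));
        }
    }
    return outlen;
}

/* Reverses pack7(): rebuilds n characters from the packed bits in in,
 * which must hold (n * 7 + 7) / 8 bytes. The top bit of each result is 0. */
void unpack7(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t bitpos = 0;                      /* next unread bit in in */

    for (size_t i = 0; i < n; i++) {
        unsigned char c = 0;
        for (int b = 0; b < 7; b++, bitpos++) {
            c = (unsigned char)(c << 1);
            if (in[bitpos / 8] & (0x80u >> (bitpos % 8)))
                c |= 1;
        }
        out[i] = c;
    }
}
```

Packing "Hey" this way yields the bytes 0x91 0x97 0xC8, which are the three bit patterns shown above, and `unpack7()` turns them back into the original text.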


