Avoid Copying of Data Between User and Kernel Space and Vice-Versa


You should use UDP; it is already pretty fast. At least it was fast enough for W32/SQLSlammer to spread across the whole Internet.

About your initial question, see the (vm)splice and tee Linux system calls.

From the manpage:

The three system calls splice(2),
vmsplice(2), and tee(2), provide
userspace programs with full control
over an arbitrary kernel buffer,
implemented within the kernel using
the same type of buffer that is used
for a pipe. In overview, these system
calls perform the following tasks:

splice(2)
  moves data from the buffer to an arbitrary file descriptor, or vice
versa, or from one buffer to another.

tee(2)
  "copies" the data from one buffer to another.

vmsplice(2)
  "copies" data from user space into the buffer.

Though we talk of copying, actual
copies are generally avoided. The
kernel does this by implementing a
pipe buffer as a set of
reference-counted pointers to pages of
kernel memory. The kernel creates
"copies" of pages in a buffer by
creating new pointers (for the output
buffer) referring to the pages, and
increasing the reference counts for
the pages: only pointers are copied,
not the pages of the buffer.
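
To make the manpage description concrete, here is a minimal sketch of moving bytes between two descriptors through a pipe with splice(2), so the data never passes through a user-space buffer. The helper name copy_with_splice is hypothetical, and the sketch assumes Linux with glibc; at least one side of each splice() call must be a pipe, which is why an intermediate pipe is created.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc: exposes splice() in <fcntl.h> */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* copy_with_splice (hypothetical helper): move up to len bytes from in_fd
 * to out_fd via an intermediate pipe. Returns the number of bytes moved,
 * or -1 on error. */
ssize_t copy_with_splice(int in_fd, int out_fd, size_t len)
{
    int pipefd[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(pipefd) < 0)
        return -1;

    while (len > 0) {
        /* Stage 1: in_fd -> pipe. No user-space buffer is involved. */
        ssize_t in = splice(in_fd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
        if (in < 0) { total = -1; break; }
        if (in == 0) break;                /* end of input */

        /* Stage 2: drain what was just queued from the pipe into out_fd. */
        ssize_t left = in;
        while (left > 0) {
            ssize_t out = splice(pipefd[0], NULL, out_fd, NULL,
                                 (size_t)left, SPLICE_F_MOVE);
            if (out <= 0) { total = -1; goto done; }
            left -= out;
        }
        total += in;
        len -= (size_t)in;
    }
done:
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```

A typical call would pass an open file as in_fd and a socket as out_fd. Whether the kernel truly avoids a copy depends on the descriptor types involved; SPLICE_F_MOVE is only a hint.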

How is data copied from user space to kernel space and vice versa during I/O tasks?

Since fread calls read underneath, how many read function calls will be invoked respectively?

Because fread() is mostly just slapping a buffer (in user-space, likely in a shared library) in front of read(), the "best case number of read() system calls" will depend on the size of the buffer.

For example; with an 8 KiB buffer; if you read 6 bytes with a single fread(), or if you read 6 individual bytes with 6 fread() calls; then read() will probably be called once (to get up to 8 KiB of data into the buffer).

However; read() may return less data than was requested (and this is very common for some cases - e.g. stdin if the user doesn't type fast enough). This means that fread() might use read() to try to fill its buffer, but read() might only read a few bytes; so fread() needs to call read() again later when it needs more data in its buffer. For a worst case (where read() only happens to return 1 byte each time) reading 6 bytes with a single fread() may cause read() to be called 6 times.
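
The short-read behaviour described above is why code that calls read() directly usually wraps it in a loop. The following is a minimal sketch of such a loop; read_full is a hypothetical helper name, not a standard function, and this is not how any particular fread() is implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* read_full (hypothetical helper): call read() repeatedly until count bytes
 * have arrived, end of file is reached, or an error occurs.
 * Returns the number of bytes read (less than count only at EOF), or -1. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < count) {
        ssize_t n = read(fd, p + got, count - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;                 /* EOF: no more data will come */
        got += (size_t)n;          /* short read: loop for the rest */
    }
    return (ssize_t)got;
}
```

In the worst case from the text (one byte per read()), asking this helper for 6 bytes makes 6 read() calls; in the best case it makes one.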

Is data transfer, whether one single byte or 1mb between user space buffer and kernel space buffer all done by the kernel and no user/kernel mode switch involved during transferring?

Often, read() (in the C standard library) calls some kind of "sys_read()" function provided by the kernel. In this case there's a switch to kernel when "sys_read()" is called, then the kernel does whatever it needs to to obtain and transfer the data, then there's one switch back from kernel to user-space.

However; nothing says that's how a kernel must work. E.g. a kernel could only provide a "sys_mmap()" (and not provide any "sys_read()") and the read() (in the C standard library) could use "sys_mmap()". For another example; with an exo-kernel, file systems might be implemented as shared libraries (with "file system cache" in shared memory) so a read() done by the C library (of a file's data that is in the "file system cache") may not involve the kernel at all.

How many disk accesses are performed respectively? Won't the kernel buffer come into play during scenario two?

There are too many possibilities. E.g.:

a) If you're reading from a pipe (where the data is in a buffer in the kernel and was previously written by a different process) then there will be no disk accesses (because the data was never on any disk to begin with).

b) If you're reading from a file and the OS cached the file's data already; then there may be no disk accesses.

c) If you're reading from a file and the OS cached the file's data already; but the file system needs to update meta-data (e.g. an "accessed time" field in the file's directory entry) then there may be multiple disk accesses that have nothing to do with the file's data.

d) If you're reading from a file and the OS hasn't cached the file's data; then at least one disk access will be necessary. It doesn't matter if it's caused by fread() attempting to read a whole buffer, read() trying to read all 6 bytes at once, or the OS fetching a whole disk block because of the first "read() of one byte" in a series of six separate "read() of one byte" requests. If the OS does no caching at all, then six separate "read() of one byte" requests will be at least 6 separate disk accesses.

e) File system code may need to access some parts of the disk to determine where the file's data actually is before it can read the file's data; and the requested file data may be split between multiple blocks/sectors on the disk; so reading 2 or more bytes from a file (regardless of whether it was caused by fread() or "read() of 2 or more bytes") could cause several disk accesses.

f) With a RAID 5/6 array spread over several physical disks, if reading a "logical block" involves reading the block from one disk and also reading the parity info from a different disk (for example, while the array is degraded or being verified), the number of disk accesses can be doubled.

The read function ssize_t read(int fd, void *buf, size_t count) also has buffer and count parameters, can these replace the role of user space buffer?

Yes; but if you're using it to replace the role of a user space buffer then you're mostly just implementing your own duplicate of fread().

It's more common to use fread() when you want to treat the data as a stream of bytes, and read() (or maybe mmap()) when you do not want to treat the data as a stream of bytes.

For a random example; maybe you're working with a BMP file; so you read the "guaranteed to be 14 bytes by the file format's spec" header; then check/decode/process the header; then (after determining where it is in the file, how big it is and what format it's in) you might lseek() to the pixel data and read all of it into an array (then maybe spawn 8 threads to process the pixel data in the array).
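
As a rough illustration of that BMP case, the sketch below reads the 14-byte file header with read(), decodes the two fields needed to locate the pixel data, and positions the descriptor there with lseek(). The struct and function names are hypothetical, and the sketch assumes a single read() returns the whole header, which is usual for a regular file but not guaranteed in general.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* The two BMP file-header fields this sketch needs (hypothetical struct). */
struct bmp_file_header {
    uint32_t file_size;     /* bytes 2..5 of the header, little-endian */
    uint32_t pixel_offset;  /* bytes 10..13: where the pixel data starts */
};

/* Decode a little-endian 32-bit value from four bytes. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Check the "BM" signature and extract the fields. Returns 0 or -1. */
int parse_bmp_header(const unsigned char hdr[14], struct bmp_file_header *out)
{
    assert(out != NULL);
    if (hdr[0] != 'B' || hdr[1] != 'M')
        return -1;
    out->file_size = le32(hdr + 2);
    out->pixel_offset = le32(hdr + 10);
    return 0;
}

/* Read the header from fd, then seek to the pixel data.
 * Returns the new file offset, or -1 on any failure. */
off_t seek_to_pixels(int fd, struct bmp_file_header *out)
{
    unsigned char hdr[14];

    if (read(fd, hdr, sizeof hdr) != (ssize_t)sizeof hdr)
        return -1;
    if (parse_bmp_header(hdr, out) != 0)
        return -1;
    return lseek(fd, (off_t)out->pixel_offset, SEEK_SET);
}
```

After seek_to_pixels() succeeds, one large read() (or a read loop) can pull the pixel data straight into an array, with no stdio buffer in between.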

copy data from kernel space to user space

The function copy_to_user is used to copy data from the kernel address space to the address space of the user program. For example, to copy a buffer which has been allocated with kmalloc to the buffer provided by the user.

EDIT: Your example is a little bit more complex, because you pass an array of pointers to the system call. To access these pointers, you have to copy the array buf to kernel space first using copy_from_user.

Thus, your kernel code should look like this:

asmlinkage long sys_something(buffer __user * __user *buf, int size)
{
    /* Allocate buffers_in_kernel on the stack just for demonstration.
     * These buffers would normally be allocated with kmalloc, and size
     * must be validated first because it comes from user space.
     */
    buffer buffers_in_kernel[size];
    buffer __user *user_pointers[size];
    int i;

    /* Fill buffers_in_kernel with some data */
    for (i = 0; i < size; i++)
        buffers_in_kernel[i].n = i;  /* just some example data */

    /* Get the user pointers for access in kernel space.
     * This is a shallow copy, so the entries in user_pointers
     * still point to user space.
     * copy_from_user returns the number of bytes it could NOT copy.
     */
    if (copy_from_user(user_pointers, buf, sizeof(buffer *) * size))
        return -EFAULT;

    /* Now copy the data to user space, one buffer at a time. */
    for (i = 0; i < size; i++) {
        if (copy_to_user(user_pointers[i], &buffers_in_kernel[i], sizeof(buffer)))
            return -EFAULT;
    }

    return 0;
}

Last but not least, there is a mistake in your main function. At the first malloc call, it allocates only enough space for 1 pointer instead of 8. It should be:

#include <stdlib.h>  /* malloc */

int main(void)
{
  const int size = 8;
  /* Room for `size` pointers, not just one. */
  buffer **buf = malloc(sizeof(buffer *) * size);
  for (int i = 0; i < size; i++) buf[i] = malloc(sizeof(buffer));
  long int sys = systemcall(801, buf, size);
  //print out buf
  return 0;
}

In what situations is data read out of kernel space into user space?

The operating system's job is to allow a lot of components, both hardware and software, to play nice with each other. In general, userland programs can't directly manipulate peripherals or interfere with each other. I'm not familiar with the specific setup that you're citing, but it doesn't sound unusual.

The USB camera notifies the operating system that it has a new frame. When the kernel (driver) notices this, it will copy the frame into RAM using I/O commands. Since this RAM was allocated by the driver, userland programs won't be able to see or read it because of virtual memory. To summarise it quickly, the address 0x1000 in the kernel and the address 0x1000 in a program are physically distinct locations in RAM. The kernel will then copy the frame into the memory of any process that is expecting input from the camera (in this case catusb) and then notify it.

Likewise, since xform, detect and hdinput exist as separate processes, they must use inter-process communication. Since the operating system must ensure the isolation of the programs, each process will leverage the kernel to achieve this.
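
As a generic illustration of processes leaning on the kernel to exchange data (not the actual mechanism used by xform, detect or hdinput, which the source does not specify), here is a minimal sketch with a pipe between a parent and a child. The helper name send_via_pipe is hypothetical. The write() copies the message from the child's memory into a kernel buffer, and the read() copies it from that buffer into the parent's memory.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* send_via_pipe (hypothetical helper): a child process writes msg into a
 * pipe and the parent reads it into out. Returns the number of bytes the
 * parent received, or -1 on error. */
ssize_t send_via_pipe(const char *msg, char *out, size_t out_size)
{
    int fds[2];
    pid_t pid;
    ssize_t n;

    assert(msg != NULL && out != NULL);
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: user space -> kernel pipe buffer. */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);
    }

    /* Parent: kernel pipe buffer -> user space. */
    close(fds[1]);
    n = read(fds[0], out, out_size);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Every hop between two processes costs one copy in and one copy out, which is the overhead the original description is spelling out.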

There's nothing unusual here. I imagine they are just spelling it out because gesture recognition is time-critical and doing it this way has some overhead.

The implementation of copy_from_user()

"Before this it's better to know why copy_from_user() is used"

Because the kernel never allows a user space application to access kernel memory directly, and it must not blindly dereference a pointer handed in from user space either: if the memory pointed to is invalid, or a fault occurs while reading it, the kernel could panic, which means a user space application could crash the kernel just by passing a bad pointer.

"And that's why!!!!!!"

So with copy_from_user, the worst that can happen is an error returned to the user application; the kernel's own functionality is not affected.

Even though it is extra effort, it ensures the safe and secure operation of the kernel.

linux kernel space and user space communication with high efficiency

In my project, I eventually chose netlink for communication (transferring commands) and mmap for sending data. There are several ways to communicate between the Linux kernel and user space. Netlink was the best fit for my project. Signals were a bad fit, because a signal could only carry 4 bytes of data at a time!
