Interprocess Communication via File

Interprocess Communication via file

If you want to use a file to communicate with another process, you should have a look at man fifo.

I'll quote just the first few lines here:

NAME
fifo - first-in first-out special file, named pipe

DESCRIPTION
A FIFO special file (a named pipe) is similar to a pipe, except that it
is accessed as part of the file system. It can be opened by multiple
processes for reading or writing. When processes are exchanging data
via the FIFO, the kernel passes all data internally without writing it
to the file system. Thus, the FIFO special file has no contents on the
file system; the file system entry merely serves as a reference point
so that processes can access the pipe using a name in the file system.

I think this is what you need.

Just think of it as a buffer. It must be opened for reading and for writing by different processes. The reading process will block until the writing process writes to it. When the writing process finishes writing, it closes the file, and that is the green light for the reading process to start emptying the buffer. It's a FIFO, so the first line written will be the first line read. Then the writing process can open it again and the cycle starts over.

You can create a FIFO with mkfifo. Have a look at man mkfifo.
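
For illustration, here is a minimal sketch of that pattern in C (the FIFO path /tmp/myfifo and the writer/reader roles chosen by command-line argument are my own choices, not something prescribed by the man page):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
    const char *path = "/tmp/myfifo";   /* hypothetical FIFO name */

    if (mkfifo(path, 0666) == -1)       /* harmless if it already exists */
        perror("mkfifo");

    if (argc > 1 && strcmp(argv[1], "writer") == 0) {
        /* open() blocks until a reader opens the other end */
        int fd = open(path, O_WRONLY);
        write(fd, "hello through the FIFO\n", 23);
        close(fd);                      /* the reader will see EOF */
    } else {
        /* open() blocks until a writer opens the other end */
        int fd = open(path, O_RDONLY);
        char buf[128];
        ssize_t n;
        while ((n = read(fd, buf, sizeof(buf))) > 0)
            write(STDOUT_FILENO, buf, n);
        close(fd);
    }
    return 0;
}

Run ./a.out in one terminal and ./a.out writer in another; each open() blocks until the other end shows up, which is exactly the buffer-like behaviour described above.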

Interprocess communication via Pipes

The sending process can write until the pipe buffer is full (64k on Linux since 2.6.11). After that, write(2) will block.

The receiving process will block until data is available to read(2).

For a more detailed look into pipe buffering, look at https://unix.stackexchange.com/a/11954.

For example, this program

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    int pipefd[2];
    pid_t cpid;
    char wbuf[32768];
    char buf[16384];

    /* Initialize writer buffer with 012...89 sequence */
    for (int i = 0; i < sizeof(wbuf); i++)
        wbuf[i] = '0' + i % 10;

    if (pipe(pipefd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    cpid = fork();
    if (cpid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (cpid == 0) {            /* Child reads from pipe */
        close(pipefd[1]);       /* Close unused write end */
        while (read(pipefd[0], buf, sizeof(buf)) > 0)
            ;
        close(pipefd[0]);
        _exit(EXIT_SUCCESS);

    } else {                    /* Parent writes sequence to pipe */
        close(pipefd[0]);       /* Close unused read end */
        for (int i = 0; i < 5; i++)
            write(pipefd[1], wbuf, sizeof(wbuf));
        close(pipefd[1]);       /* Reader will see EOF */
        wait(NULL);             /* Wait for child */
        exit(EXIT_SUCCESS);
    }
}

will produce the following sequence when run with gcc pipes.c && strace -e trace=open,close,read,write,pipe,clone -f ./a.out:

open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
close(3) = 0
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\3\2\0\0\0\0\0"..., 832) = 832
close(3) = 0
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f32117489d0) = 21114
close(3) = 0
write(4, "01234567890123456789012345678901"..., 32768) = 32768
write(4, "01234567890123456789012345678901"..., 32768) = 32768
write(4, "01234567890123456789012345678901"..., 32768strace: Process 21114 attached
<unfinished ...>
[pid 21114] close(4) = 0
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, <unfinished ...>
[pid 21113] <... write resumed> ) = 32768
[pid 21114] <... read resumed> "45678901234567890123456789012345"..., 16384) = 16384
[pid 21113] write(4, "01234567890123456789012345678901"..., 32768 <unfinished ...>
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, <unfinished ...>
[pid 21113] <... write resumed> ) = 32768
[pid 21114] <... read resumed> "45678901234567890123456789012345"..., 16384) = 16384
[pid 21113] write(4, "01234567890123456789012345678901"..., 32768 <unfinished ...>
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, <unfinished ...>
[pid 21113] <... write resumed> ) = 32768
[pid 21114] <... read resumed> "45678901234567890123456789012345"..., 16384) = 16384
[pid 21113] close(4) = 0
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, "45678901234567890123456789012345"..., 16384) = 16384
[pid 21114] read(3, "01234567890123456789012345678901"..., 16384) = 16384
[pid 21114] read(3, "45678901234567890123456789012345"..., 16384) = 16384
[pid 21114] read(3, "", 16384) = 0
[pid 21114] close(3) = 0
[pid 21114] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21114, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++

You'll notice that the reads and writes are interleaved, and that the writing and reading processes each block a few times, either because the pipe is full or because not enough data is available for reading yet.

Inter-process communication using physical text files

Modern OSes support many different IPC mechanisms: pipes, named pipes, sockets, memory-mapped files, and so on. The choice of one solution over another depends heavily on your application. But broadly speaking, all of them should be "better" than using a plain old file.

As IPC mechanisms are objects managed by the OS, they do not depend on the language used to write the various processes. Some have file semantics (pipes, named pipes); others require dedicated system primitives (mmap). But C++ and Python (and many other languages) support the required system calls. In fact, IPC is a great way to let software written in different languages talk to each other.

Using files for shared memory IPC


Essentially, I'm trying to understand what happens when two processes have the same file open at the same time, and whether one could use this to safely and performantly offer communication between two processes.

If you are using regular files using read and write operations (i.e. not memory mapping them) then the two processes do not share any memory.

  • User-space memory in the Java Buffer objects associated with the file is NOT shared across address spaces.
  • When a write syscall is made, data is copied from pages in one process's address space to pages in kernel space. (These could be pages in the page cache. That is OS specific.)
  • When a read syscall is made, data is copied from pages in kernel space to pages in the reading process's address space.

It has to be done that way. If the operating system shared the pages associated with the reader and writer processes' buffers behind their backs, that would be a security / information leakage hole:

  • The reader would be able to see data in the writer's address space that had not yet been written via write(...), and maybe never would be.
  • The writer would be able to see data that the reader (hypothetically) wrote into its read buffer.
  • It would not be possible to address the problem by clever use of memory protection because the granularity of memory protection is a page versus the granularity of read(...) and write(...) which is as little as a single byte.

Sure: you can safely use file reads and writes to transfer data between two processes. But you would need to define a protocol that allows the reader to know how much data the writer has written. And knowing when the writer has written something could entail polling, e.g. checking whether the file has been modified.
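
As a rough illustration of such a protocol (my own sketch, not part of the quoted answer; the path, the length-prefix format, and the 100 ms polling interval are all arbitrary choices), the writer appends length-prefixed records and the reader polls the file size with stat(2) so that it only consumes records that have been completely written:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Writer side: append one record, preceded by its length. */
static int
append_record(const char *path, const void *data, uint32_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return -1;
    write(fd, &len, sizeof(len));   /* length prefix first */
    write(fd, data, len);           /* then the payload */
    close(fd);
    return 0;
}

/* Reader side: poll until the next complete record is available. */
static ssize_t
read_record(const char *path, off_t *offset, void *buf, uint32_t bufsz)
{
    for (;;) {
        struct stat st;
        uint32_t len;

        if (stat(path, &st) == 0 &&
            st.st_size >= *offset + (off_t)sizeof(len)) {
            int fd = open(path, O_RDONLY);
            pread(fd, &len, sizeof(len), *offset);
            if (len <= bufsz &&
                st.st_size >= *offset + (off_t)(sizeof(len) + len)) {
                pread(fd, buf, len, *offset + sizeof(len));
                close(fd);
                *offset += sizeof(len) + len;
                return (ssize_t)len;
            }
            close(fd);
        }
        usleep(100000);             /* poll every 100 ms */
    }
}

Because the reader never advances past data it has confirmed via the file size, it never consumes a half-written record; the obvious cost is the polling latency.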

If you look at this in terms of just the data copying in the communication "channel":

  • With memory mapped files you copy (serialize) the data from application heap objects to the mapped buffer, and a second time (deserialize) from the mapped buffer to application heap objects. (A minimal mapping sketch follows this list.)

  • With ordinary files there are two additional copies: 1) from the writing processes (non-mapped) buffer to kernel space pages (e.g. in the page cache), 2) from the kernel space pages to the reading processes (non-mapped) buffer.
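
To make the memory-mapped case concrete, here is a minimal sketch (mine, not from the answer) in which a parent and a child map the same regular file with mmap(MAP_SHARED), so both see the same pages and no extra copies through kernel buffers are needed; the file path and the one-page size are arbitrary:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
    const char *path = "/tmp/mmap-demo";    /* hypothetical backing file */
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    ftruncate(fd, 4096);                    /* give the file one page */

    char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    if (fork() == 0) {                      /* child writes into the mapping */
        strcpy(shared, "hello from the child");
        _exit(0);
    }
    wait(NULL);                             /* parent sees the same pages */
    printf("parent sees: %s\n", shared);

    munmap(shared, 4096);
    close(fd);
    return 0;
}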

The article below explains what is going on with conventional read / write and memory mapping. (It is in the context of copying a file and "zero-copy", but you can ignore that.)

Reference:

  • Zero Copy I: User-Mode Perspective

Process communication through file in bash

I was able to solve the problem with the following code:

Process 2 reading the file for updates:

filename="testfile.txt"
tail -n 0 -F $filename | \
while read LINE
do
echo "$LINE"
done

Process 1 writes to the file using a simple >>, for example echo "new data" >> "$filename".

The tail command is what solved my issue.

Interprocess Communication using Pipes and Files

Could "IPC using files" be just one process writing a file and another process reading it. Examples of this would be writing files in /tmp or in /var. In the /var directory there are logs, locks and running PIDs. You can also use the /proc file system to talk to the kernel or /sys to talk to device drivers. These are all "IPC using files".

Software development patterns for Files and Caches as Inter-Process Communication

To get notification of a change using IPC on a Unix or Linux system, I can think of a few techniques that avoid resorting to polling.

1st Blocking Read

One could use a file descriptor (a file, a Unix socket, a pipe or named pipe, an IPC queue, etc.). The consumer reads from this file descriptor in a blocking way (a timeout is perhaps advised). The producer writes something to this file descriptor once it has updated the shared memory. The consumer is thus woken up from the read when this happens and goes to read the shared memory. While waiting, the consumer is in a sleep state, so it does not consume resources (such as CPU).

One could even use select for the consumer to wait on several file descriptors at the same time.
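
A minimal sketch of this blocking-read approach (my own, not from the answer; the FIFO path /tmp/notify-fifo and the 5-second timeout are arbitrary): the consumer sleeps in select() on a named pipe until the producer writes a single notification byte, e.g. with echo x > /tmp/notify-fifo.

#include <fcntl.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(void)
{
    const char *path = "/tmp/notify-fifo";  /* hypothetical FIFO */
    mkfifo(path, 0666);                     /* harmless if it already exists */

    /* O_RDWR so open() does not block waiting for a writer */
    int fd = open(path, O_RDWR);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    for (;;) {
        fd_set readfds;
        struct timeval timeout = { .tv_sec = 5, .tv_usec = 0 };

        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        /* Sleep until the producer writes, or until the timeout expires */
        int ready = select(fd + 1, &readfds, NULL, NULL, &timeout);
        if (ready > 0 && FD_ISSET(fd, &readfds)) {
            char token;
            read(fd, &token, 1);            /* consume the notification */
            printf("producer signalled an update; go read the shared memory\n");
        } else if (ready == 0) {
            printf("timeout, no update yet\n");
        }
    }
}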

2nd Signals

The consumer is woken up by a signal sent by the producer. The producer sends a signal (SIGUSR1, for instance) to the consumer once it has updated the shared memory. The consumer has previously subscribed to the signal and can handle the request. It is a bit more complex to handle because the event can be triggered at any time, so the design of the consumer is more difficult.
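
A minimal sketch of the signal approach (again my own, not from the answer): the consumer installs a SIGUSR1 handler that only sets a flag, and the producer notifies it with kill(consumer_pid, SIGUSR1), or kill -USR1 <pid> from a shell, after updating the shared memory.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t updated = 0;

static void
on_usr1(int signo)
{
    (void)signo;
    updated = 1;                /* only set a flag inside the handler */
}

int
main(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    printf("consumer pid %d waiting for SIGUSR1...\n", (int)getpid());
    for (;;) {
        pause();                /* sleep until a signal arrives */
        if (updated) {
            updated = 0;
            printf("producer notified us; go read the shared memory\n");
        }
    }
}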


