Why Pid of a Process Is Represented by Opaque Data Type

Why is the PID of a process represented by an opaque data type?

That's not really an opaque type, but an alias for an integer type. For example, on my system, I find the following in different header files:

typedef __pid_t pid_t;
...
# define __STD_TYPE typedef
__STD_TYPE __PID_T_TYPE __pid_t; /* Type of process identifications. */
...
#define __PID_T_TYPE __S32_TYPE
...
#define __S32_TYPE int

Hence, you're right in that pid_t is just an int. However, I'd say there are a couple of reasons to do this:

  • Readability: it makes clear that a variable is going to be used as a pid.
  • Maintainability: make sure that the type of all pid variables can be changed in the future if needed. For example, if pids need a wider data type later (such as long int), you just need to change the typedef, recompile and everything should work fine. In fact, I believe this already happens for different architectures.

What are file descriptors, explained in simple terms?

In simple terms, when you open a file, the operating system creates an entry that represents the open file and stores information about it. So if 100 files are open on your system, there are 100 such entries somewhere in the kernel. Each process refers to its own open files by small non-negative integers (0, 1, 2, ...), and that number is the file descriptor.
So a file descriptor is just an integer that uniquely identifies an open file within a process.
If your process opens 10 files, its file descriptor table will have 10 entries for them.

Similarly, when you open a network socket, it is also represented by an integer, which is called a socket descriptor.

Is there any reasonable initial/invalid value for a pid_t?

pid_t is not opaque. Zero and all negative values are explicitly exceptional (used, e.g., by waitpid to select particular classes of processes to wait for). Arguably 1 is exceptional too: because -1 has a special meaning, 1 cannot be used normally as a process group ID (traditionally, pid 1 is the init process).

For your purpose, 0 seems like the most reasonable choice.

Linux process ID and thread ID

If you want to store pid_t and pthread_t anywhere, you should use their respective types (i.e. "pid_t" and "pthread_t"). So if you want to store them in shared memory somewhere, do a memcpy() to get them there.

As far as identifying specific threads by combinations of PID and TID, see Nemo's comment.

If you do make the assumption that they will exist, you can have your program look in /proc for the appropriate pid's directory, and then in /proc/<pid>/task for the threads.

What is the value range of thread and process IDs?

The pthread_t type is completely opaque. You can only compare it for equality with the pthread_equal function, and there is no reserved value distinct from any valid thread id, though such a value will probably be added to the next version of the POSIX standard. As such, you'll need to store a second field alongside the thread id to track whether it's valid or not.

Difference between PID and TID

It is complicated: pid is process identifier; tid is thread identifier.

But as it happens, the kernel doesn't make a real distinction between them: threads are just like processes but they share some things (memory, fds...) with other instances of the same group.

So, a tid is actually the identifier of the schedulable object in the kernel (thread), while the pid is the identifier of the group of schedulable objects that share memory and fds (process).

But to make things more interesting, when a process has only one thread (the initial situation and in the good old times the only one) the pid and the tid are always the same. So any function that works with a tid will automatically work with a pid.

It is worth noting that many functions/system calls/command line utilities documented to work with pid actually use tids. But if the effect is process-wide you will simply not notice the difference.

Erlang distributed message sending - what is the meaning of the first atom in the tuple?

The grammar is a bit ambiguous in the sentence you're citing. The three options are:

  • A process ID, which is an opaque data type returned from certain Erlang functions, primarily spawn and spawn_link.
  • A registered name on the local node (i.e., the local VM). An example of where this would be needed would be a long-running server application, where you want processes to be able to communicate with a key utility service, such as a DNS cache.
  • A tuple containing both a registered name and the name of the node it lives on (if another VM, potentially on a different host).

The first is by far the most common. Registered names are intended to be used judiciously.

I'd recommend starting with the concurrency chapter from Learn You Some Erlang, and backtracking as necessary to earlier chapters:
http://learnyousomeerlang.com/the-hitchhikers-guide-to-concurrency#dont-panic

Data Type - socklen_t, sa_family_t

By declaring specific types for these fields, it decouples them from a particular representation like unsigned int.

Different architectures can be free to define different sizes for these fields, and code that uses these specific types doesn't need to worry about how big an int is on a given machine.


