Distinguishing Between Java Threads and Os Threads

Distinguishing between Java threads and OS threads?

On Linux, Java threads are implemented with native threads, so a Java program using threads is no different from a native program using threads. A "Java thread" is just a thread belonging to a JVM process.

On a modern Linux system (one using NPTL), all threads belonging to a process have the same process ID and parent process ID, but different thread IDs. You can see these IDs by running ps -eLf. The PID column is the process ID, the PPID column is the parent process ID, and the LWP column is the thread (LightWeight Process) ID. The "main" thread has a thread ID that's the same as the process ID, and additional threads will have different thread ID values.

Older Linux systems may use the "linuxthreads" threading implementation, which is not fully POSIX-compliant, instead of NPTL. On a linuxthreads system, threads have different process IDs.

You can check whether your system is using NPTL or linuxthreads by running the system's C library (libc) as a standalone program and looking under "Available extensions" in its output. It should mention either "Native POSIX Threads Library" or linuxthreads. The path to the C library varies from system to system: it may be /lib/libc.so.6, /lib64/libc.so.6 (on 64-bit RedHat-based systems), or something like /lib/x86_64-linux-gnu/libc.so.6 (on modern Debian-based systems such as Ubuntu).

At the OS level, theads don't have names; those exist only within the JVM.

The pthread_kill() C function can be used to send a signal to a specific thread, which you could use to try to kill that specific thread from outside the JVM, but I don't know how the JVM would respond to it. It might just kill the whole JVM.

Java Threads vs OS Threads

Each Thread will be given a native OS
Thread to RUN which can run on a
different CPU but since Java is
interpreted these threads will require
to interact with the JVM again and
again to convert the byte code to
machine instructions ? Am I right ?

You are mixing two different things; JIT done by the VM and the threading support offered by the VM. Deep down inside, everything you do translates to some sort of native code. A byte-code instruction which uses thread is no different than a JIT'ed code which accesses threads.

If yes, than for smaller programs Java
Threads wont be a big advantage ?

Define small here. For short lived processes, yes, threading doesn't make that big a difference since your sequential execution is fast enough. Note that this again depends on the problem being solved. For UI toolkits, no matter how small the application, some sort of threading/asynchronous execution is required to keep the UI responsive.

Threading also makes sense when you have things which can be run in parallel. A typical example would be doing heavy IO in on thread and computation in another. You really wouldn't want to block your processing just because your main thread is blocked doing IO.

Once the Hotspot compiles both these
execution paths both can be as good as
native Threads ? Am I right ?

See my first point.

Threading really isn't a silver bullet, esp when it comes to the common misconception of "use threads to make this code go faster". A bit of reading and experience will be your best bet. Can I recommend getting a copy of this awesome book? :-)

@Sanjay: Infact now I can reframe my
question. If I have a Thread whose
code has not been JIT'd how does the
OS Thread execute it ?

Again I'll say it, threading is a completely different concept from JIT. Let's try to look at the execution of a program in simple terms:

java pkg.MyClass -> VM locates method
to be run -> Start executing the
byte-code for method line by line ->
convert each byte-code instruction to
its native counterpart -> instruction
executed by OS -> instruction executed
by machine

When JIT has kicked in:

java pkg.MyClass -> VM locates method
to be run which has been JIT'ed ->
locate the associated native code
for that method -> instruction
executed by OS -> instruction executed
by machine

As you can see, irrespective of the route you follow, the VM instruction has to be mapped to its native counterpart at some point in time. Whether that native code is stored for further re-use or thrown away if a different thing (optimization, remember?).

Hence to answer your question, whenever you write threading code, it is translated to native code and run by the OS. Whether that translation is done on the fly or looked up at that point in time is a completely different issue.

Concept of JVM thread and relation to OS thread

It is true that the system can run only 8 threads simultaneously, but the operating system schedules which of the eight are running at any given time and can both preempt and time slice processes to schedule other (waiting) processes for some portion of time. Java threads are isomorphic to native threads, so it's literally the operating system scheduling them (and if your computer worked like you thought, the network would stop working while your program ran).

Communication between Java thread and OS threads

Many mix up threads and processes here, the jvm is a process which may spawn more threads. Threads are lighter processes which share memory within their process. A process on the other hand lives in his own address space, which makes the context switch more expensive. You can communicate between different processes via the IPC mechanisms provided by your OS and you can communicate between different threads within the same process due to shared memory and other techniques. What you can't is communicate from ThreadA(ProcessA) to ThreadA(ProcessB) without going through plain old IPC: ThreadA(ProcessA) -> ProcessA -> IPC(OS) -> ProcessB -> ThreadA(ProcessB)).

You can use RMI to communicate between two java processes, if you want to "talk" to native OS processes, you have to go JNI to call the IPC mechanisms your OS of choice provides imo.

Feel free to correct me here :)

Sidenote:
You cant see the threads of your JVM with a process manager (as long as your JVM does not map threads to native processes, which would be stupid but possible), you need to use jps and jstack to do that.

software threads vs hardware threads

Software threads are threads of execution managed by the operating system.

Hardware threads are a feature of some processors that allow better utilisation of the processor under some circumstances. They may be exposed to/by the operating system as appearing to be additional cores ("hyperthreading").

In Java, the threads you create maintain the software thread abstraction, where the JVM is the "operating system". Whether the JVM then maps Java threads to OS threads is the JVM's business (but it almost certainly does). And then the OS will be using hardware threads if they are available.

What is the difference between a process and a thread?

Both processes and threads are independent sequences of execution. The typical difference is that threads (of the same process) run in a shared memory space, while processes run in separate memory spaces.

I'm not sure what "hardware" vs "software" threads you might be referring to. Threads are an operating environment feature, rather than a CPU feature (though the CPU typically has operations that make threads efficient).

Erlang uses the term "process" because it does not expose a shared-memory multiprogramming model. Calling them "threads" would imply that they have shared memory.

Distinguishing Between Java Threads and Os Threads