Does Linux Time Division Processes or Threads

Is the schedulable unit of a CPU time slice a process or a thread?

Clarification: my understanding of "a schedulable unit of CPU time slice" is "a unit that can be scheduled during a given CPU time slice" (since if the "schedulable unit" were itself a length of time, the question would not make much sense to me).

Based on this, to put it shortly, "a schedulable unit of CPU time slice" for a given logical core can be seen as a software thread (more specifically, its execution context, composed of registers and process information).


An operating system's scheduler operates on tasks. Tasks can be threads, processes, or other, more unusual structures (e.g. dataflows).

Modern mainstream operating systems mainly schedule threads on processing units (typically hardware threads, also called logical cores). You can get more information about how the Windows scheduler works in the Microsoft documentation, which explicitly states:

A thread is the entity within a process that can be scheduled for execution

On Linux, the default scheduler, CFS, operates on tasks (i.e. the task_struct data structure). A task can be a thread, a group of threads, or a process. It was done this way to make the scheduler more generic, and also because the scheduler's design dates back to a time when processors had only one core and people focused on processes rather than threads. The multi-core era has since pushed applications to use many threads in order to exploit the available cores. As a result, nowadays, it is generally threads that are actually scheduled AFAIK. This is explained in the well-known research paper The Linux Scheduler: a Decade of Wasted Cores (which also explains a bit of how CFS chooses the target processor).
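A minimal sketch of this (assuming Python 3.8+ for threading.get_native_id): each thread inside one process is a separate kernel task with its own task ID, which is what the scheduler actually dispatches. On Linux these IDs are also visible as directories under /proc/<pid>/task/.

```python
import os
import threading

pid = os.getpid()
tids = []
started = threading.Event()

def report():
    started.wait()  # keep every thread alive until all four are started
    # get_native_id() returns the kernel's ID for this thread, i.e. the
    # task the scheduler sees (the TID on Linux)
    tids.append(threading.get_native_id())

threads = [threading.Thread(target=report) for _ in range(4)]
for t in threads:
    t.start()
started.set()
for t in threads:
    t.join()

# One process...
print("PID:", pid)
# ...but four distinct schedulable tasks
print("TIDs:", sorted(tids))
```

Running this shows a single PID but four different task IDs, matching the idea that the scheduler's unit is the task, not the process.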

Note that the term "process" can sometimes refer to a thread, since threads are sometimes called "lightweight processes" and ordinary processes are sometimes called "heavy processes". "Process" can even be used as a generic term for both (i.e. threads and actual processes). This is confusing terminology and a misuse of language (like the term "processors" sometimes used for cores). In practice it is often not a problem, since in a given context threads and processes may be used interchangeably (although in such cases people should really use a generic term like "tasks").

As for "a schedulable unit of CPU time slice", this is a bit more complex. A simple and naive answer is: a thread (it is definitely not processes alone). That being said, a thread is a software-defined concept (like a process). It is basically a stack, a few registers, and a parent process (with possibly some meta-information and a TLS space). CPUs do not operate directly on such data structures. A CPU has no concept of a thread stack, for example (it is just a section of the virtual process memory like any other). It just needs an execution context, which is composed of registers and a process configuration (in protected mode). For the sake of simplicity, we can say that CPUs execute threads. Mainstream modern x86 processors are very complex, and each core is often able to run multiple threads at the same time. This is called simultaneous multithreading (aka. Hyper-Threading on Intel processors). An x86 physical core is typically composed of two hardware threads (i.e. logical cores) that can each execute a software thread.
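A quick way to see the logical cores (hardware threads) the scheduler can place software threads on, sketched below. os.sched_getaffinity is Linux-only, hence the fallback; the printed numbers are machine-dependent.

```python
import os

# Logical cores = hardware threads; with SMT/Hyper-Threading this is
# typically twice the number of physical cores
logical = os.cpu_count() or 1

try:
    # CPUs this process is currently allowed to run on (Linux only)
    usable = len(os.sched_getaffinity(0))
except AttributeError:
    usable = logical

print(f"{logical} logical cores, {usable} usable by this process")
```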

What are the differences and relationships between processes, threads, tasks and jobs in Linux?

The distinction between process and thread is fairly universal to all operating systems. A process usually represents an independent execution unit with its own memory area, system resources and scheduling slot.

A thread is typically a "division" within a process - threads usually share the same memory and operating system resources, and share the time allocated to that process. For example, when you open your browser and Microsoft Word, each is a different process, but things that happen in the background of each (like animations, refreshes or backups) can be threads.

A job is usually a long-running unit of work executed by a user. The job may be "handled" by one or more processes. It might not be interactive. For instance, instructing the machine to zip a large file or to run some processing script on a large input file would typically be a job. The naming is relatively historic - mainframes used to process jobs. In UNIX systems, many jobs are started automatically at prescheduled times using cron, hence the notion of 'cron jobs'.
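The memory-sharing distinction above can be demonstrated directly (a POSIX-only sketch, using os.fork): a thread's write is visible to the rest of the process, while a child process only modifies its own copy of memory.

```python
import os
import threading

shared = []

def appender():
    shared.append("from-thread")

t = threading.Thread(target=appender)
t.start()
t.join()
# Threads share the process's memory: the write is visible here
assert shared == ["from-thread"]

pid = os.fork()
if pid == 0:
    # Child process: this mutates the child's private copy only
    shared.append("from-child-process")
    os._exit(0)
os.waitpid(pid, 0)

# The child's write never reached the parent's address space
print(shared)
```

The final print still shows only the thread's entry: the fork gave the child its own address space, so its append was invisible to the parent.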

Is it faster to run one process that spawns N threads or to run N processes?

It is faster to launch a single process that spawns N threads, by far. Threads are made for that purpose, and they are lighter than processes. The de facto rule is: let the OS do the scheduling and memory management unless you need to do the dirty job yourself. This way your code will be much simpler and cleaner. The OS has a bunch of lower-level tools to handle processes and memory much more efficiently. Of course it will depend on the OS, but this is a general rule for modern OSes, and at least for the one I use (Linux).
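A rough sketch of the cost difference (assuming a Unix system; the fork start method is forced so no __main__ guard is needed). The absolute numbers are machine-dependent, so this only illustrates the comparison, not a rigorous benchmark.

```python
import time
import threading
import multiprocessing

def work():
    pass  # trivial task: we are measuring creation/teardown cost only

N = 25

# Spawn and join N threads
t0 = time.perf_counter()
threads = [threading.Thread(target=work) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
thread_time = time.perf_counter() - t0

# Spawn and join N processes (fork context, Unix-only)
ctx = multiprocessing.get_context("fork")
t0 = time.perf_counter()
procs = [ctx.Process(target=work) for _ in range(N)]
for p in procs:
    p.start()
for p in procs:
    p.join()
process_time = time.perf_counter() - t0

print(f"threads: {thread_time:.4f}s, processes: {process_time:.4f}s")
```

On a typical Linux box the thread figure comes out noticeably smaller, reflecting the "threads are lighter" point above.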

Why processes are deprived of CPU for TOO long while busy looping in Linux kernel?

The schedule() function simply invokes the scheduler - it doesn't take any special measures to arrange that the calling thread will be replaced by a different one. If the current thread is still the highest priority one on the run queue then it will be selected by the scheduler once again.

It sounds as if your kernel thread is doing very little work in its busy loop and it's calling schedule() every time round. Therefore, it's probably not using much CPU time itself and hence doesn't have its priority reduced much. Negative nice values carry heavier weight than positives, so the difference between a -5 and a 0 is quite pronounced. The combination of these two effects means I'm not too surprised that user space processes miss out.

As an experiment you could try calling the scheduler every Nth iteration of the loop (you'll have to experiment to find a good value of N for your platform) and see if the situation is better - calling schedule() too often will just waste lots of CPU time in the scheduler. Of course, this is just an experiment - as you have already pointed out, avoiding busy loops is the correct option in production code, and if you want to be sure your thread is replaced by another then set it to be TASK_INTERRUPTIBLE before calling schedule() to remove itself from the run queue (as has already been mentioned in comments).
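The question's code is kernel-side C, but the "yield only every Nth iteration" pattern can be sketched as a userspace analogue (N = 1000 is a made-up tuning value):

```python
import os

N = 1000            # hypothetical tuning value; measure on your platform
iterations = 100_000
yields = 0

for i in range(iterations):
    # ... a small unit of busy-loop work would go here ...
    if i % N == N - 1:
        os.sched_yield()  # give other runnable tasks a chance (POSIX)
        yields += 1

print(f"{iterations} iterations, {yields} voluntary yields")
```

The point is the ratio: the scheduler is entered once per N iterations instead of once per iteration, so far less CPU time is burned in scheduling overhead.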

Note that your kernel (2.6.18) is using the O(1) scheduler, which existed until the Completely Fair Scheduler was added in 2.6.23 (the O(1) scheduler having been added in 2.6 to replace the even older O(n) scheduler). CFS doesn't use the O(1) scheduler's priority-based run queues - it keeps runnable tasks in a red-black tree ordered by virtual runtime - and works in a different way, so you might well see different behaviour. I'm less familiar with it, however, so I wouldn't like to predict exactly what differences you'd see. I've seen enough of it to know that "completely fair" isn't the term I'd use on heavily loaded SMP systems with a large number of both cores and processes, but I also accept that writing a scheduler is a very tricky task and it's far from the worst I've seen, and I've never had a significant problem with it on a 4-8 core desktop machine.

Functionality implementation: Processes or Threads division?

There are several reasons why going down the separate-process route might be a good choice in an embedded system:

  • Decoupling of components: running components as separate processes is the ultimate decoupling. This is often useful when projects become very large.

  • Security and privilege management: it is quite likely in an embedded system that some components need elevated privileges in order to control devices, whereas others are potential security hazards (for instance, network-facing components) that you want to run with as little privilege as possible. Other likely scenarios are components that need real-time threading or need to be able to mmap() a lot of system memory. Over-allocation of either will lock your system up in a way it won't recover from.

  • Reliability: you can potentially respawn parts of the system if they fail, leaving the remainder running.

Building such an arrangement is actually easier than others here are suggesting - Qt has really good support for dbus, which nicely takes care of your IPC, and it is used extensively in the Linux desktop for system management functionality.

As for the scenario you describe, you might want to daemonise the 'core' of the application and then allow client connections over dbus from UI components.

Do multiple threads in one process share the same execution time as one process with one thread?

Thread switches happen on the same interval regardless of process ownership. So if the time slice is 100 µs, it is always 100 µs - unless, of course, the thread itself surrenders execution. When that thread will run again is where things get complicated.

Thread (not processes) scheduling based on priority

There's no problem, this is expected behavior.

First, if you have more than one core, then the priorities won't matter if there are fewer ready-to-run threads than cores -- each thread will get its own core.

Second, your high-priority thread sleeps, which gives the lower-priority thread time to run.

Third, your threads interact through the lock that protects standard output. The higher-priority thread can be waiting for that lock, allowing lower-priority threads to run.

Please don't try to use priorities this way. It adds massive complexity, hurts performance, and rarely accomplishes anything useful.
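The third point - a thread blocking on a lock held by another, regardless of any priority it was given - can be sketched deterministically (the sequencing Event below is just scaffolding to force the race one way):

```python
import threading

lock = threading.Lock()
events = []
holder_ready = threading.Event()

def holder():
    with lock:
        holder_ready.set()          # let the other thread start contending
        events.append("holder-working")
    # lock released here

def waiter():
    holder_ready.wait()             # ensure the lock is already taken
    with lock:                      # blocks until holder releases it
        events.append("waiter-acquired")

t1 = threading.Thread(target=holder)
t2 = threading.Thread(target=waiter)
t1.start(); t2.start()
t1.join(); t2.join()
print(events)
```

However "important" the waiter is, it cannot run its critical section until the holder releases the lock - which is exactly why priorities alone don't control interleaving.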

Does all single threaded software run on the main core/thread?

A thread always belongs to a process, so if you have 10 instances of a piece of software running in 10 processes, each will run on the main thread of its own process. The OS scheduler will then assign each thread to a specific core depending on the available resources.
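A small sketch of the first half of this: in a single-threaded program, the one schedulable entity is the process's main thread.

```python
import os
import threading

# In a single-threaded program, the running thread *is* the main thread
is_main = threading.current_thread() is threading.main_thread()
thread_count = threading.active_count()

print(f"pid={os.getpid()} main={is_main} threads={thread_count}")
```

Which core that main thread lands on is then the scheduler's decision, not the program's.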


