Task Management on X86

Edited to add your actual answer:

Protected Mode Software Architecture

Tom Shanley

Addison-Wesley Professional (March 16, 1996)

ISBN-10: 020155447X

ISBN-13: 978-0201554472

googlebook, amazon

My answer

Have you looked at "Understanding the Linux Kernel," 3rd Edition? It's available via Safari, and it's probably a good place to start for the OS side of things -- I don't think it gives you nitty-gritty details, but it's an excellent guide that would probably put the Linux kernel source and architecture-specific stuff into context. The following chapters give you the narrative you're asking for from the kernel side ("relationship between the hardware and the OS when an interrupt or context-switch occurs"):

Chapter 3: Processes
Chapter 4: Interrupts and Exceptions
Chapter 7: Process Scheduling

Understanding the Linux Kernel, 3rd Ed.

Daniel P. Bovet; Marco Cesati

Publisher: O'Reilly Media, Inc.

Pub. Date: November 17, 2005

Print ISBN-13: 978-0-596-00565-8

Print ISBN-10: 0-596-00565-2

Safari, Amazon

My recommendation: take a book like this, the Linux source code, the Intel manuals, and a full fridge of beer, and you'll be off and running.

A brief snippet from Chapter 3: Processes, to whet your appetite:

3.3.2. Task State Segment

The 80×86 architecture includes a specific segment type called the Task State Segment (TSS), to store hardware contexts. Although Linux doesn't use hardware context switches, it is nonetheless forced to set up a TSS for each distinct CPU in the system. This is done for two main reasons:

- When an 80×86 CPU switches from User Mode to Kernel Mode, it fetches the address of the Kernel Mode stack from the TSS (see the sections "Hardware Handling of Interrupts and Exceptions" in Chapter 4 and "Issuing a System Call via the sysenter Instruction" in Chapter 10).
- When a User Mode process attempts to access an I/O port by means of an in or out instruction, the CPU may need to access an I/O Permission Bitmap stored in the TSS to verify whether the process is allowed to address the port.

More precisely, when a process executes an in or out I/O instruction in User Mode, the control unit performs the following operations:

1. It checks the 2-bit IOPL field in the eflags register. If it is set to 3, the control unit executes the I/O instructions. Otherwise, it performs the next check.
2. It accesses the tr register to determine the current TSS, and thus the proper I/O Permission Bitmap.
3. It checks the bit of the I/O Permission Bitmap corresponding to the I/O port specified in the I/O instruction. If it is cleared, the instruction is executed; otherwise, the control unit raises a "General protection" exception.

The tss_struct structure describes the format of the TSS. As already mentioned in Chapter 2, the init_tss array stores one TSS for each CPU on the system. At each process switch, the kernel updates some fields of the TSS so that the corresponding CPU's control unit may safely retrieve the information it needs. Thus, the TSS reflects the privilege of the current process on the CPU, but there is no need to maintain TSSs for processes when they're not running.
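
To make the three checks in the excerpt concrete, here is a minimal C model of them. It is only a sketch, not kernel or hardware code: the function name user_io_allowed and its parameters are made up for illustration, and it assumes a single-byte port access (wider accesses cover several consecutive bits).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the check described above, not kernel code.
 * iopl:   the 2-bit IOPL field from eflags (0..3)
 * bitmap: the I/O Permission Bitmap of the current TSS, one bit per port
 * port:   the port named by the in/out instruction
 * Returns true if the instruction would run, false if the CPU would
 * raise a "General protection" exception. */
static bool user_io_allowed(unsigned iopl, const uint8_t *bitmap, uint16_t port)
{
    assert(iopl <= 3);
    if (iopl == 3)                 /* step 1: IOPL of 3 lets User Mode through */
        return true;
    /* steps 2 and 3: look up the port's bit; a cleared bit means allowed */
    return (bitmap[port / 8] & (1u << (port % 8))) == 0;
}
```

A full bitmap covers 65536 ports, so the caller would pass an 8192-byte array; setting a port's bit denies access to that port unless IOPL is 3.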

Another potential reference in the same vein is this one, which does have a lot more x86-specific stuff, and you might benefit a bit from the contrast with PowerPC.
Linux® Kernel Primer, The: A Top-Down Approach for x86 and PowerPC Architectures

Claudia Salzberg Rodriguez; Gordon Fischer; Steven Smolski

Publisher: Prentice Hall

Pub. Date: September 19, 2005

Print ISBN-10: 0-13-118163-7

Print ISBN-13: 978-0-13-118163-2

Safari, Amazon

Finally, Robert Love's Linux Kernel Development, 3rd Edition, has a pretty thorough description of context switching, though it may be redundant with the above. It's a pretty fantastic resource.

CPU Registers and Multitasking

how the CPU registers work with Multitasking.

The CPU itself does not need to know about multitasking; task switching can be (and usually is) implemented in software. Some CPUs (Intel x86) provide hardware support: a Task State Segment (TSS, https://en.wikipedia.org/wiki/Task_state_segment) and a task register (TR) that can change state from one task to another atomically. (The TSS may still be used to switch between protection rings, ring 0/ring 3, but not to switch tasks.)

So in a Multitasking System, CPU can pause the execution of a certain program at any time and run another program.

Almost.

Most CPUs capable of running an OS and user-space tasks have interrupts. On an external event (a signal from hardware, an interrupt request, IRQ), an interrupt pauses execution of the current code (task) and jumps to one of several special kernel functions called interrupt handlers (ISR, interrupt service routine).

So how are the register values preserved during this step ?

Most registers simply stay untouched on interrupt entry. A few are saved by the CPU as part of the interrupt entry process.

The interrupt mechanism is implemented inside the CPU. It saves some registers of the current task (yes, it can save them to the stack: an x86 CPU pushes *FLAGS, CS and IP) and then jumps to a routine registered in the Interrupt Vector Table (IVT) or Interrupt Descriptor Table (IDT). The IVT is stored at a special memory location; the IDT is pointed to by a special CPU register, IDTR. The table is indexed by interrupt vector number: vector 1 selects entry 1 (routine 1), vector 20 selects routine 20. Hardware IRQ lines are mapped to vector numbers by the interrupt controller, so IRQ n does not necessarily use entry n.
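
As a rough illustration of the table lookup, here is a toy C version: an array of handler pointers indexed by vector number. This is a simplified model, not the real IDT descriptor format; vector_table, register_isr, dispatch and timer_isr are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 256              /* x86 has 256 interrupt vectors */

typedef void (*isr_t)(void);

/* Hypothetical stand-in for the IDT: one handler slot per vector. */
static isr_t vector_table[NUM_VECTORS];
static int timer_ticks;              /* touched by the example handler */

static void timer_isr(void) { timer_ticks++; }

static void register_isr(unsigned vector, isr_t handler)
{
    assert(vector < NUM_VECTORS);
    vector_table[vector] = handler;
}

/* Models what the CPU does with the vector number: index the table, jump. */
static int dispatch(unsigned vector)
{
    if (vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return -1;                   /* no handler registered */
    vector_table[vector]();
    return 0;
}
```

For example, after register_isr(32, timer_isr), a call to dispatch(32) runs timer_isr once, while dispatch(33) reports that nothing is registered.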

Are the registers pushed to stack or any other way ?

Both. Some are pushed by the CPU on interrupt entry, and some are pushed by instructions in the interrupt handler. (And there is also code that changes EIP in kernel mode...)

The ISR knows what was saved by the CPU. If it wants to use more registers (it usually does), it pushes the other user registers to the stack (or saves them elsewhere), does its interrupt work, restores the manually saved registers, and then exits from the interrupt using a special instruction (iret in the x86 world, which reloads CS:IP and FLAGS from the stack). If the original IP on the stack was left untouched, the CPU continues executing the original user code (the original task) with all registers unchanged. If the author of the interrupt handler wants, the handler may change the IP on the stack before doing iret, so that it returns somewhere else (to kernel mode, to some other task, to code that reboots the PC, and so on).
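
The push/patch/iret sequence can be modeled with a toy CPU in C. This is a sketch under heavy simplification: struct toy_cpu and the three functions are hypothetical, the toy stack grows upward for brevity (the real x86 stack grows downward), and a real ring 3 to ring 0 transition also saves SS and ESP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy CPU: only the state that interrupt entry saves. */
struct toy_cpu {
    uint32_t ip, cs, flags;
    uint32_t stack[16];
    int sp;                          /* index of the next free stack slot */
};

/* Interrupt entry: the CPU pushes FLAGS, CS, IP and jumps to the handler. */
static void interrupt_entry(struct toy_cpu *c, uint32_t handler_ip)
{
    assert(c->sp + 3 <= 16);
    c->stack[c->sp++] = c->flags;
    c->stack[c->sp++] = c->cs;
    c->stack[c->sp++] = c->ip;
    c->ip = handler_ip;
}

/* iret: pop IP, CS, FLAGS in reverse order and resume there. */
static void iret(struct toy_cpu *c)
{
    assert(c->sp >= 3);
    c->ip    = c->stack[--c->sp];
    c->cs    = c->stack[--c->sp];
    c->flags = c->stack[--c->sp];
}

/* A handler that wants to resume somewhere else patches the saved IP. */
static void redirect_return(struct toy_cpu *c, uint32_t new_ip)
{
    assert(c->sp >= 3);
    c->stack[c->sp - 1] = new_ip;
}
```

Calling interrupt_entry followed by iret puts the toy CPU back exactly where it was; calling redirect_return in between makes iret resume at the new address instead, which is the trick a scheduler relies on.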

Changing the currently running task (a context switch) is one of the jobs that can be done using interrupts. There are basically two kinds of context switch: involuntary (the task just keeps running and does not want to leave the CPU) and voluntary (the task asks the OS to do the context switch, either because it cannot run any further or because it is polite and gives other tasks a chance to run: sched_yield in the Linux world).

An involuntary context switch is usually done with the help of a periodic timer interrupt. Several years ago this timer generated a specific IRQ (timer/RTC IRQ) every 10, 3 or 1 milliseconds. The timer interrupt handler calls the OS scheduler to decide whether the current task has run for too long (exceeded its time slice, the quantum of time-sharing) or whether some task with higher priority is ready to run.
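
A minimal sketch of the time-slice bookkeeping done on each tick might look like this in C. The struct task fields and timer_tick are hypothetical names; real kernels track much more and also compare priorities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task bookkeeping; real kernels track far more. */
struct task {
    int slice_left;                  /* timer ticks left in the time slice */
    bool need_resched;               /* set when the scheduler should run */
};

/* Called from the periodic timer interrupt handler for the running task. */
static void timer_tick(struct task *current)
{
    assert(current != NULL);
    if (current->slice_left > 0)
        current->slice_left--;
    if (current->slice_left == 0)
        current->need_resched = true;   /* quantum used up */
}
```

With a slice of 2 ticks, the first call leaves need_resched false and the second one sets it, at which point the handler would invoke the scheduler before returning to user code.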

A voluntary context switch is usually done with the help of a system call. When the OS uses privilege separation (user-space code runs in ring 3 and kernel-space code in ring 0, the concept of protection rings), a system call is made with a special CPU instruction that switches between privilege levels (it may be, and originally was, implemented with a software-generated interrupt: int 80h in older Linux/BSD). User-space code asks the kernel to do some work, for example to read from a file or socket or to write to a file. It may also ask the kernel to run the scheduler and switch to another task, if there is one, with the sched_yield syscall. If the syscall code decides that the request needs some time and the task can't run further before the request is done (the system call blocks: it stops the current task from running and switches its state away from TASK_RUNNING), it will also call the OS scheduler.
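
From user space, the polite variant is a single call. The sketch below uses the POSIX sched_yield function; the wrapper yield_n is a hypothetical helper added only to have something to call.

```c
#include <assert.h>
#include <sched.h>

/* Give up the CPU `times` times; returns how many calls succeeded. */
static int yield_n(int times)
{
    int ok = 0;
    assert(times >= 0);
    for (int i = 0; i < times; i++)
        if (sched_yield() == 0)      /* POSIX: returns 0 on success */
            ok++;
    return ok;
}
```

Each call enters the kernel, which may pick another runnable task; if there is none, the caller simply continues.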

The OS scheduler may keep the current task running (if it is in a runnable state) or may decide to switch to another task (actually to the kernel-mode side of the other task; a later return-from-syscall restores that task's CS:IP and FLAGS). It performs the switch with the switch_to asm macro (http://lxr.free-electrons.com/source/arch/x86/include/asm/switch_to.h?v=4.6#L27):

        /*                                                              \
         * Context-switching clobbers all registers, so we clobber      \
         * them explicitly, via unused output variables.                \
         * (EAX and EBP is not listed because EBP is saved/restored     \
         * explicitly for wchan access and EAX is the return value of   \
         * __switch_to())                                               \
         */                                                             \
        unsigned long ebx, ecx, edx, esi, edi;                          \
                                                                        \
        asm volatile("pushfl\n\t"               /* save    flags */     \
                     "pushl %%ebp\n\t"          /* save    EBP   */     \
                     "movl %%esp,%[prev_sp]\n\t"        /* save    ESP   */ \
                     "movl %[next_sp],%%esp\n\t"        /* restore ESP   */ \
                     "movl $1f,%[prev_ip]\n\t"  /* save    EIP   */     \
                     "pushl %[next_ip]\n\t"     /* restore EIP   */     \
                     __switch_canary                                    \
                     "jmp __switch_to\n"        /* regparm call  */     \
                     "1:\t"                                             \
                     "popl %%ebp\n\t"           /* restore EBP   */     \
                     "popfl\n"                  /* restore flags */     \
                                                                        \
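
The same save-stack-pointer, load-stack-pointer, jump idea can be tried from user space with the ucontext functions (getcontext, makecontext, swapcontext). This is only an analogue, not what the kernel does, and it assumes a libc that provides ucontext.h (glibc does); task_body, run_demo and the trace array are hypothetical names.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];   /* the second task's own stack */
static int trace[4], trace_len;      /* records the order of execution */

static void task_body(void)
{
    trace[trace_len++] = 2;
    swapcontext(&task_ctx, &main_ctx);   /* save our registers, resume main */
    trace[trace_len++] = 4;
}                                        /* returning resumes uc_link */

static void run_demo(void)
{
    int rc = getcontext(&task_ctx);
    assert(rc == 0);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;        /* where to go when task_body returns */
    makecontext(&task_ctx, task_body, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &task_ctx);   /* switch to the task */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &task_ctx);   /* switch back into the task */
}
```

Calling run_demo once interleaves the two flows of control: the trace records the main flow, then the task, then the main flow again, then the rest of the task, each resuming exactly where it left off.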

If the only non-sleeping user task goes to sleep, there is no visible task ready to run. In fact there is an invisible task with pid 0, sometimes called swapper or idle, which has the lowest priority and is always ready to run. It executes a special CPU instruction, HLT, in a loop to let the CPU cool down; it may also check for events or call the scheduler to find runnable tasks.

Some strange realtime OS (with a name starting with "V" and ending in version 5), which has no user-space/kernel-space separation and isolation (all code runs in ring 0), may implement voluntary context switches without syscalls or software interrupts, using an ordinary call to the scheduler.

Useful links:

https://en.wikibooks.org/wiki/X86_Assembly/Advanced_Interrupts
http://wiki.osdev.org/Interrupt
http://wiki.osdev.org/Context_Switching
http://www.informit.com/articles/article.aspx?p=364068 - How Multitasking Works at the Hardware Level, 2005, chapter of very good book "Unabridged Pentium 4, The: IA32 Processor Genealogy".
Task management on x86 mentions "Protected Mode Software Architecture" book and "Understanding the Linux Kernel, 3rd Ed.", Bovet, chapter 3 Processes, 3.3.2. Task State Segment.
http://wiki.osdev.org/Task_State_Segment - on hardware task switching

How are x86 processors 'aware' of multiple processes being run?

From the hardware's perspective, you are right that all a CPU does is "execute instructions", one at a time. In some (perhaps mildly simplified) sense, that's all that's going on.

If you had some specific computational task to perform, you could indeed write a suitable stream of instructions so that you can power on your hardware, it executes your instructions, and then halts, or shuts down, or whatever. That's how very early computers were in fact operated.

However, this mode of operating a computer is extremely unwieldy and doesn't scale at all, since it requires a single operator to take responsibility for everything and, in the process, reinvent all sorts of wheels. That's where the concept of an operating system comes in: The OS is a specific kind of instruction stream that's loaded at start-up which can in turn load up and execute other bits of instructions, dynamically. This compartmentalization allows for reuse of core functionality (think device drivers), and to adapt the functionality of the machine dynamically (i.e. while it's running, as opposed to reprogramming and resetting it). Moreover, it allows those parts of the instructions that are loaded dynamically to be authored by different people, so that we have a single platform that can execute "user-defined instructions", i.e. what we conventionally understand as a "program".

So now we have all the pieces: The code that the CPU executes when it is powered up is the operating system, and the operating system dynamically manages the execution of further code. Most of these units of execution are called processes. (But not all code is like this. For example, loadable kernel modules in Linux are dynamically loaded, but don't constitute a process.) That is, a process is an abstract concept in the operating system that delineates between the OS's own code and the "hosted" code that it runs on request.

Depending on the type of OS, execution of processes may have fancy features such as virtual memory (each process sees its own, separate memory) and protection (no process can interfere with the operation of the OS or of other processes). OSes implement such features by using CPU features: a memory management unit that provides address translation, and protection rings that limit the instructions available to a piece of code. Not all OSes do this, though; in DOS, for instance, every process has full access to the physical memory and thus to the OS's state. Regardless, an OS typically provides an API for processes (e.g. "system calls"), again using hardware features (interrupts or special system call instructions), and user code generally interacts with the environment through this API rather than talking to peripherals directly. For example, this means that hardware drivers are only implemented by the OS, and user code can make opaque "print output" calls without having to know the details of the available output devices.

Example: Maybe it's useful to illustrate what processes are on the popular Linux operating system, running on x86 hardware. A new process is started when an existing process (e.g. a shell, or init) calls the clone system call, by raising interrupt 128 (int 0x80, the classic 32-bit mechanism). The interrupt makes the CPU transfer control to the interrupt handler routine, which was set up by the OS during boot-up. When the interrupt handler is entered, the CPU switches to ring 0, privileged mode. The interrupt handler makes the kernel create the new process, and then transfers control back to the calling process (which implies switching to protection ring 3, unprivileged; processes only ever execute in ring 3). For the creation of the new process, the kernel creates the relevant internal bookkeeping structures, sets up new page tables in the MMU, and then transfers control to the entry point for the clone call, similar to the way the original call returns. (I'm glossing over issues of scheduling here; only one transfer of control happens at a time, and the others are "scheduled" to happen later.) The fact that a new process exists now is merely reflected in the kernel's internal bookkeeping data. The CPU doesn't know anything about it; all it sees is that interrupts get fired and page tables get changed regularly.
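
Seen from user code, all of that is hidden behind one library call. Here is a small sketch using the POSIX fork and waitpid functions (to my knowledge, glibc's fork on Linux is built on the clone system call); spawn_and_wait is a hypothetical helper name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates a child process that exits with `code`; returns the exit code
 * the parent observed, or -1 on failure. */
static int spawn_and_wait(int code)
{
    assert(code >= 0 && code <= 255);
    pid_t pid = fork();              /* the kernel creates the new process */
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: runs in its own address space */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both the parent and the child return from fork; the only way the code can tell them apart is the return value, which matches the point above that the new process exists only in the kernel's bookkeeping.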

How to execute powershell(x86) with schedule task?

Just call the x86 version of Powershell using
%SystemRoot%\syswow64\WindowsPowerShell\v1.0\powershell.exe

Can the TFS PowerShell on Target Machines task execute in x86 mode?

SysWow64 technology allows you to execute 32-bit apps in a 64-bit environment.
On your target machine, execute your PowerShell scripts using the executable below:

%SystemRoot%\syswow64\WindowsPowerShell\v1.0\powershell.exe

This would be the 32-bit version of PowerShell.exe and will let you execute your files.

In your batch file, you can make this configuration

Register usage tracking x86

Step 1 is easy in principle, just a lot of work to actually implement (it requires parsing x86 machine code); there are some libraries that may help with that.

Step 2 is not as easy as you make it sound, because of control flow. Liveness analysis, under the assumption that control flow is simple, is a well-known problem. There are some extra problems in this case though.

A procedure may be called whose input registers have not been determined yet. That's mostly due to unlucky ordering; processing procedures depth-first reduces the problem. But there is not necessarily any order for the procedures such that this problem does not arise, because of (mutual) recursion. You may do interprocedural liveness analysis to solve that.
An unknown procedure may be called (virtual methods or explicit calls through function pointers). I don't think you can do much except make a maximally pessimistic assumption.
Even intraprocedural control flow may be unpredictable, such as when a switch is compiled to a jump table. In fact, it's not guaranteed that variable jumps will be intraprocedural in the first place, though you can probably mostly assume that. I really don't know what to do about that.

I'll then analyze the each block assuming that all of the registers are in use at the start.

That's a huge waste, which can easily be avoided by doing proper liveness analysis.
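
For reference, a bare-bones intraprocedural liveness pass fits in a few lines of C. This is a sketch of the standard backward dataflow iteration, assuming the control flow graph is already built and has no indirect jumps; the struct block layout and the register-to-bit numbering are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUCC 2

/* One basic block. Registers are bits in a mask (the numbering, e.g.
 * bit 0 = eax, bit 1 = ebx, is an arbitrary choice for this sketch). */
struct block {
    uint32_t use;                    /* read before any write in the block */
    uint32_t def;                    /* written in the block */
    int succ[MAX_SUCC];              /* successor block indices, -1 = none */
    uint32_t live_in, live_out;      /* results, must start at 0 */
};

/* Backward dataflow, iterated until nothing changes:
 *   live_out(b) = union of live_in(s) over successors s
 *   live_in(b)  = use(b) | (live_out(b) & ~def(b)) */
static void liveness(struct block *b, int n)
{
    assert(n >= 0);
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = n - 1; i >= 0; i--) {
            uint32_t out = 0;
            for (int k = 0; k < MAX_SUCC; k++)
                if (b[i].succ[k] >= 0)
                    out |= b[b[i].succ[k]].live_in;
            uint32_t in = b[i].use | (out & ~b[i].def);
            if (in != b[i].live_in || out != b[i].live_out) {
                b[i].live_in = in;
                b[i].live_out = out;
                changed = true;
            }
        }
    }
}
```

After the pass, live_in of a procedure's entry block is the set of registers it reads before writing, which is a much tighter answer than assuming every register is in use.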
