What Do Linkers Do

What do linkers do?

To understand linkers, it helps to first understand what happens "under the hood" when you convert a source file (such as a C or C++ file) into an executable file (a file that can be run on your machine, or on someone else's machine with the same architecture and operating system).

Under the hood, when a program is compiled, the compiler converts the source file into object code. Object code is machine code for your target architecture, together with tables that name the symbols the file defines and the symbols it still needs from elsewhere. It is not runnable on its own yet. Traditionally, these files have an .OBJ extension on DOS and Windows and a .o extension on Unix-like systems.

After the object file is created, the linker comes into play. More often than not, a real program that does anything useful will need to reference other files. In C, for example, a simple program to print your name to the screen would consist of:

printf("Hello Kristina!\n");

When the compiler compiles your program into an OBJ file, it simply puts in a reference to the printf function. The linker resolves this reference. Most programming languages have a standard library of routines to cover the basic stuff expected from that language. The linker links your OBJ file with this standard library. The linker can also link your OBJ file with other OBJ files. You can create other OBJ files that have functions that can be called by another OBJ file. The linker works almost like a word processor's copy and paste. It "copies" out all the necessary functions that your program references and creates a single executable. Sometimes the library code that is copied out depends on yet other OBJ or library files, so a linker sometimes has to get pretty recursive to do its job.
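As a rough illustration of "resolving a reference", here is a minimal sketch of a symbol lookup. The struct, the function name resolve and the addresses used with it are all made up for this example; a real linker works from the symbol tables stored in the object files and does a lot more bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry in a (hypothetical) table of symbols defined by the inputs. */
struct symbol {
    const char *name;
    uint32_t    address;
};

/* Look up a referenced name among the defined symbols.
 * Returns 1 and stores the address on success, 0 if the name is
 * undefined (a real linker would report an "undefined reference"). */
int resolve(const struct symbol *defs, size_t ndefs,
            const char *name, uint32_t *address_out)
{
    assert(defs != NULL && name != NULL && address_out != NULL);
    for (size_t i = 0; i < ndefs; i++) {
        if (strcmp(defs[i].name, name) == 0) {
            *address_out = defs[i].address;
            return 1;
        }
    }
    return 0;
}
```

With a table containing an entry for printf, resolve() hands back its address so the call site can be patched. A name that is in no table is the case where the link fails.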

Note that not all operating systems put everything into a single executable. Windows, for example, uses DLLs that keep shared functions together in a separate file that is loaded at run time. This reduces the size of your executable, but makes your executable dependent on these specific DLLs. DOS used to use things called overlays (.OVL files). These had many purposes, but one was to keep commonly used functions together in one file (another purpose, in case you're wondering, was to fit large programs into memory. DOS had limited memory, and overlays could be "unloaded" from memory so that other overlays could be "loaded" on top of that memory, hence the name "overlays"). Linux has shared libraries, which are basically the same idea as DLLs (hard core Linux guys I know would tell me there are MANY BIG differences).

Hope this helps you understand!

How does a linker work exactly (microcontroller context)?

printf is insanely complicated, very bad for a microcontroller hello world example, blinking leds are better but that gets specific to the microcontroller. this will suffice for linking.

two.c

unsigned int glob;
unsigned int two ( unsigned int a, unsigned int b )
{
    glob=5;
    return(a+b+7);
}

one.c

extern unsigned int glob;
unsigned int two ( unsigned int, unsigned int );
unsigned int one ( void )
{
    return(two(5,6)+glob);
}

start.s

.globl _start
_start:
    bl one
    b .

build everything.

% arm-none-eabi-gcc -O2 -c one.c -o one.o
% arm-none-eabi-gcc -O2 -c two.c -o two.o
% touch start.s
% arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -c one.c -o one.o
% arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -c two.c -o two.o
% arm-none-eabi-as start.s -o start.o
% arm-none-eabi-ld -Ttext=0x10000000 start.o one.o two.o -o onetwo.elf

now lets look...

arm-none-eabi-objdump -D start.o
...
00000000 <_start>:
   0:   ebfffffe    bl  0 <one>
   4:   eafffffe    b   4 <_start+0x4>

it is not the compiler's/assembler's job to deal with external references, so the branch link to one is left incomplete. they chose to encode it as a bl of 0 but they could have simply left it totally unencoded; it is up to the authors of the toolchain how the compiler, assembler, and linker communicate via object files.

Same here

00000000 <one>:
   0:   e92d4008    push    {r3, lr}
   4:   e3a00005    mov r0, #5
   8:   e3a01006    mov r1, #6
   c:   ebfffffe    bl  0 <two>
  10:   e59f300c    ldr r3, [pc, #12]   ; 24 <one+0x24>
  14:   e5933000    ldr r3, [r3]
  18:   e0800003    add r0, r0, r3
  1c:   e8bd4008    pop {r3, lr}
  20:   e12fff1e    bx  lr
  24:   00000000    andeq   r0, r0, r0

both the function two and the address of the global variable glob are unknown. Note that for the unknown variable the compiler generates code that loads the explicit address of the global from a word placed after the function, so that the linker simply needs to fill in that word. also glob lives in a data section, not in .text.

00000000 <two>:
   0:   e59f3010    ldr r3, [pc, #16]   ; 18 <two+0x18>
   4:   e2811007    add r1, r1, #7
   8:   e3a02005    mov r2, #5
   c:   e0810000    add r0, r1, r0
  10:   e5832000    str r2, [r3]
  14:   e12fff1e    bx  lr
  18:   00000000    andeq   r0, r0, r0

here too the global lives in a data section, not here in .text, so the linker will have to place that section and the things in it and then fill in the addresses.

so here we have linked it all together. the gnu linker looks for an entry point label named _start (main is an extern address required by the standard bootstrap, which I am not using, so we dont get a main not found error). Because I am not using a linker script the gnu linker places items in the binary in the order they were given on the command line. as desired, start is first, which I need for a microcontroller since I am controlling the boot. I used a non-zero address here for demonstration purposes as well...

10000000 <_start>:
10000000:   eb000000    bl  10000008 <one>
10000004:   eafffffe    b   10000004 <_start+0x4>

10000008 <one>:
10000008:   e92d4008    push    {r3, lr}
1000000c:   e3a00005    mov r0, #5
10000010:   e3a01006    mov r1, #6
10000014:   eb000005    bl  10000030 <two>
10000018:   e59f300c    ldr r3, [pc, #12]   ; 1000002c <one+0x24>
1000001c:   e5933000    ldr r3, [r3]
10000020:   e0800003    add r0, r0, r3
10000024:   e8bd4008    pop {r3, lr}
10000028:   e12fff1e    bx  lr
1000002c:   1000804c    andne   r8, r0, ip, asr #32

10000030 <two>:
10000030:   e59f3010    ldr r3, [pc, #16]   ; 10000048 <two+0x18>
10000034:   e2811007    add r1, r1, #7
10000038:   e3a02005    mov r2, #5
1000003c:   e0810000    add r0, r1, r0
10000040:   e5832000    str r2, [r3]
10000044:   e12fff1e    bx  lr
10000048:   1000804c    andne   r8, r0, ip, asr #32

Disassembly of section .bss:

1000804c <__bss_start>:
1000804c:   00000000    andeq   r0, r0, r0

so the linker starts by placing the first item, start.o. it figures out how big that needs to be by just putting in what was there, those two instructions. they take 8 bytes, so the second item one.o goes next at 0x10000008. That means the encoding for the bl one in start.s can be completed with the correct relative offset (when the bl executes, the pc reads as the instruction address plus 8, which is _start + 8 and exactly where one is, so the offset is zero and pc+0 is the encoding)
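the offset arithmetic can be checked against the listings above with a few lines of C. this is only a sketch: encode_bl is a name I made up, it only handles the always-executed ARM (not thumb) bl, and it does not check that the target is in range.

```c
#include <assert.h>
#include <stdint.h>

/* Encode an ARM (A32) "bl" located at insn_addr that branches to target.
 * The pc reads as insn_addr + 8 when the instruction executes, and the
 * low 24 bits hold the signed word offset from that pc value.
 * Hypothetical helper for illustration; range is not checked here. */
uint32_t encode_bl(uint32_t insn_addr, uint32_t target)
{
    /* A32 instructions and branch targets are word aligned */
    assert((insn_addr & 3u) == 0 && (target & 3u) == 0);

    /* unsigned subtraction wraps modulo 2^32, which gives the right
     * two's complement bits for backward branches */
    uint32_t byte_offset = target - (insn_addr + 8u);
    uint32_t imm24 = (byte_offset >> 2) & 0x00FFFFFFu;

    return 0xEB000000u | imm24; /* 0xE = always, 0xB = branch with link */
}
```

feeding it 0x10000000 and 0x10000008 gives eb000000, and 0x10000014 with 0x10000030 gives eb000005, the two bl encodings in the linked listing. a bl whose target is its own address comes out as ebfffffe, the placeholder seen in the unlinked objects.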

the linker has roughly placed one.o into the binary it is building and it has to resolve the addresses of two and the global, so it has to place two.o and then figure out where the end of that is in order to place, in this case, .bss not .data since I didnt pre-init the variable.

the label for two is at 0x10000030 so it encodes the bl two in one() for that relative offset. it has also placed glob at 0x1000804c for some reason (I didnt completely define where ram was so the gnu linker will do things like this). Whatever the reason, that is where the linker defined the home for that global variable, and the linker fills in that address wherever the address of glob is needed; both one() and two() needed it filled in.
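filling in an absolute address is the simpler of the two fixups. a minimal sketch, assuming the section is held in memory as an array of 32 bit words and that whatever is already in the word is added to the symbol address (it is zero in the listings above). the struct and function names are invented for this example and are not the real object file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation record: "store the address of a symbol in the
 * 32-bit word at this byte offset within the section". */
struct abs32_reloc {
    size_t   offset;      /* byte offset of the word to patch */
    uint32_t symbol_addr; /* final address the linker chose for the symbol */
};

/* Patch one word of a section image held as an array of 32-bit words. */
void apply_abs32(uint32_t *section, size_t section_words, struct abs32_reloc r)
{
    assert(section != NULL);
    assert(r.offset % 4 == 0 && r.offset / 4 < section_words);

    /* the existing word acts as the addend (0 in the example above) */
    section[r.offset / 4] += r.symbol_addr;
}
```

for one() the word to patch is at offset 0x24 and for two() it is at offset 0x18; with glob at 0x1000804c both words end up holding 1000804c, as in the linked listing.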

So the compiler (assembler) and linker have to, in the end, result in a usable binary. the compiler (assembler) tends to worry about making relocatable machine code (code that does not yet depend on its final address) and leaving enough information for the linker, so that the linker has the machine code and a list of unresolved externs that it has to fill in. compilers have improved over time. a simple model would be to have an address location like they did above for the global variable's address, where the linker computes the absolute address and just fills it in. clearly above they did not encode the function calls to one and two in a way that uses an absolute address; instead they use pc relative addressing. This means that the linker has to know the machine code encoding of the bl instruction. the current generation of gnu linker knows quite a bit more and can do some cool things resolving arm to thumb and back, stuff it didnt use to know (you dont need to compile for thumb interwork anymore, the linker takes care of it).

So the linker takes binary blobs including data and...links them together into one binary. It first needs to know the actual addresses for the various things in the binary. How you tell the linker this is linker specific and not a global thing for all C/C++ toolchains. Gnu linker scripts are a programming language in and of themselves. These are not necessarily physical nor virtual addresses, they are simply the address space of the code in whatever mode it runs in (virtual or physical). Once the linker knows the addresses, it starts placing these various binary blobs into those address spaces based on linker rules (again linker specific). then it goes through and resolves the external/global addresses.

It was not above, but this can be an iterative process. If for example the function two() was at an address that cannot be reached with a single pc relative instruction (say we put one near zero and two near 0xF0000000) then those that wrote the linker have two choices. the simple choice is to state that it cannot encode/implement that far of a branch and bail out, and gnu linker did or still does do that. the other solution is that the linker fixes the problem. the linker could add a few words within range of the pc relative branch link, and those few words are a trampoline: for example an absolute address that is loaded into a register followed by a register based branch, or perhaps, if clever, a pc relative branch if the trampoline is within range (in the case of 0x10000000 to 0xF0000000 that wouldnt work).

If the linker has to add these few words then that may mean that some of the binary blobs have to move to make room, and now all of the addresses in those binary blobs have to change as well. So you have to make another pass across all the binary blobs, resolving all of the new addresses, filling in the answers, and for pc relative references determining if you can still reach everything. Adding those few words might have made something that was reachable with a pc-relative instruction now unreachable, and that then requires a solution (error or patch).
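the "can I still reach it" question is just a range check on the offset. a minimal sketch for the ARM (not thumb) b/bl, whose 24 bit signed word offset covers roughly plus or minus 32 megabytes. branch_in_range is a name made up for this example; what a real linker does when the check fails (error or trampoline) is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if an A32 b/bl at insn_addr can reach target directly.
 * The signed 24-bit word offset covers -0x2000000 .. +0x1FFFFFC bytes
 * relative to the pc value (insn_addr + 8). */
bool branch_in_range(uint32_t insn_addr, uint32_t target)
{
    assert((insn_addr & 3u) == 0 && (target & 3u) == 0);

    /* widen to 64 bits so the subtraction cannot wrap */
    int64_t offset = (int64_t)target - ((int64_t)insn_addr + 8);

    return offset >= -0x2000000LL && offset <= 0x1FFFFFCLL;
}
```

the call from 0x10000014 to 0x10000030 passes, while a call from 0x10000000 to 0xF0000000 fails, which is the case where the linker has to either give up or insert a trampoline and then re-run this check for every branch that moved.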

The assembler itself, for a single source file, has to go through even more of these gyrations, esp for a variable length instruction set like x86 where the addressing is a bit vague. I recommend trying for yourself to make a simple assembler that only supports a few instructions, some of them branches, then parse and encode the instructions and compare the result to an existing debugged assembler like gnu assembler.

test.s

   ldr r1,locdat
   nop
   nop
   nop
   nop
   nop
   b over
locdat: .word 0x12345678
top:
    nop
    nop
    nop
    nop
    nop
    nop
over:
    b top

the right answer is

00000000 <locdat-0x1c>:
   0:   e59f1014    ldr r1, [pc, #20]   ; 1c <locdat>
   4:   e1a00000    nop         ; (mov r0, r0)
   8:   e1a00000    nop         ; (mov r0, r0)
   c:   e1a00000    nop         ; (mov r0, r0)
  10:   e1a00000    nop         ; (mov r0, r0)
  14:   e1a00000    nop         ; (mov r0, r0)
  18:   ea000006    b   38 <over>

0000001c <locdat>:
  1c:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

00000020 <top>:
  20:   e1a00000    nop         ; (mov r0, r0)
  24:   e1a00000    nop         ; (mov r0, r0)
  28:   e1a00000    nop         ; (mov r0, r0)
  2c:   e1a00000    nop         ; (mov r0, r0)
  30:   e1a00000    nop         ; (mov r0, r0)
  34:   e1a00000    nop         ; (mov r0, r0)

00000038 <over>:
  38:   eafffff8    b   20 <top>

there are parallels between that activity and the job of a linker. also you could fashion a simple linker based on the above files or something similar: extract the binary blobs and other info and start placing them in whatever address space you want.

Either one is a fairly simple programming task, yet fairly educational. Having an existing toolchain that can produce the right answer lets you figure out where you are going wrong and how to get to the right answer.
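as a starting point for the toy assembler, here is a sketch of the two encodings test.s needs once the label addresses are known (so this is the second pass; finding the addresses is the first). the function names are made up, only the always-executed ARM forms are handled, and the ldr helper only supports a label that sits after the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* A32 unconditional "b": signed word offset from pc (insn_addr + 8). */
uint32_t encode_b(uint32_t insn_addr, uint32_t target)
{
    assert((insn_addr & 3u) == 0 && (target & 3u) == 0);
    /* unsigned wraparound yields two's complement for backward branches */
    uint32_t byte_offset = target - (insn_addr + 8u);
    return 0xEA000000u | ((byte_offset >> 2) & 0x00FFFFFFu);
}

/* A32 "ldr rd, label" for a label located after the instruction:
 * becomes ldr rd, [pc, #imm12] where pc reads as insn_addr + 8. */
uint32_t encode_ldr_literal(unsigned rd, uint32_t insn_addr, uint32_t label)
{
    assert(rd < 16u);
    assert(label >= insn_addr + 8u);   /* forward references only */
    uint32_t imm12 = label - (insn_addr + 8u);
    assert(imm12 < 4096u);             /* must fit in 12 bits */
    return 0xE59F0000u | ((uint32_t)rd << 12) | imm12;
}
```

with locdat at 0x1c, top at 0x20 and over at 0x38, these produce e59f1014, ea000006 and eafffff8, the same words gnu assembler produced in the listing above.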

Does linker link an object file with itself?

It depends. Both options are possible, and so are options that you didn't mention, like either the compiler or the linker rearranging the code so that none of the functions exist any more. It's fine to think of compilers emitting references to functions and linkers resolving those references as a way of understanding C++, but bear in mind that all the compiler and linker have to do is produce a working program, and there are many different ways to do that.

One thing the compiler and linker must do, however, is make sure that any calls to standard library functions happen (like printf, as you mentioned), and happen in the order that the C++ source specifies. Apart from that (and some other similar concerns) they can more or less do as they wish.
