Why Does The Linker Modify a --defsym "Absolute Address"?

Linkers, absolute references, and a process's address space

The linker deals in virtual addresses. The absolute address is the absolute virtual address.

Each instance of the process gets its own virtual address space, laid out the same way, so the same virtual address is valid in every instance.

Does the loader modify relocation information on program startup?

A program is normally composed of several modules created by a linker. There is the executable and usually a number of shared libraries. On some systems one executable can load another executable and call its starting routine as a function.

If all these compiled modules had fixed addresses, conflicts upon loading would be likely. If two linked modules used the same address, the application could not load.

For decades, relocatable code has been the solution to that problem. A module can be loaded anywhere. Some systems take this a step further and place modules at random locations in memory for security.

There are some situations where code cannot be purely relocatable.

If you have something like this:

static int b, *a = &b ;

the initialization depends upon where the module is placed in memory (and where "b" is located). Linkers usually generate information for such constructs so that the loader can fix them up.
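A minimal sketch of that construct, written so it can be compiled and checked. The function names pointer_was_fixed_up and write_through_pointer are hypothetical, chosen only for illustration. When the module is built as position-independent code, the stored value of "a" cannot be known at link time, so the linker typically records a relocation for it (on Linux, readelf -r on the binary should list it) and the loader fills in the final address before the code runs.

```c
#include <assert.h>

static int b;
static int *a = &b;   /* the initializer is the run-time address of b */

/* Hypothetical helper: returns nonzero when the pointer stored in the
   data segment matches the address b actually has in this process. */
int pointer_was_fixed_up(void)
{
    return a == &b;
}

/* Hypothetical helper: writes through the pointer and reads the object
   directly, which only works if the loader fixed the pointer up. */
int write_through_pointer(int value)
{
    assert(a == &b);
    *a = value;
    return b;
}
```

Calling pointer_was_fixed_up() from main should return nonzero wherever the module ends up being loaded, because the fix-up happens before any user code executes.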

Thus, this is not correct:

I have always believed that resolving absolute addresses is completely the linker's job.

Does the dynamic linker modify the reference after the executable is copied to memory?

Fairly close. The dynamic linker will modify something in the data segment, not specifically the .data section - segments are a coarser-grained thing corresponding to how the file is mapped into memory rather than the original semantic breakdown. The actual section is usually called .got or .got.plt but may vary by platform. The modification is not "relocating it to the instruction address" but resolving a relocation reference to the function name to get the address it was loaded at, and filling that address in.

How does the `--defsym` linker flag work to pass values to source code?

Linkers see globals as addresses (pointers to the "actual" global rather than the actual global -- even if the actual global doesn't exist at that address). -Xlinker --defsym -Xlinker __BUILD_DATE=$(date +%Y%m%d) sets the address of __BUILD_DATE not the value. When the __BUILD_DATE linker entity has an address but not a value, you can get the address by declaring the entity as anything and then taking the address of that.

In:

#include <stdio.h>

//extern long __BUILD_DATE[128];
//extern int __BUILD_DATE;
extern char __BUILD_DATE;

int main(void)
{
    /* The symbol's address, not its contents, carries the value. */
    printf("Build date: %lu\n", (unsigned long)&__BUILD_DATE);
    return 0;
}

Any of the three declarations should work. Just don't try to use the value of that (pseudo) global. That would be like dereferencing an invalid pointer.

That should answer 2 and 3. To answer 1, -Xlinker --defsym -Xlinker __BUILD_DATE=$(date +%Y%m%d) stores the number printed (on stdout) by $(date +%Y%m%d) as the address of __BUILD_DATE. It doesn't store a string.
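Since the value arrives as a plain number of the form YYYYMMDD, it can be split into its parts with integer arithmetic. The sketch below assumes the address has already been converted to an integer, for example with (uintptr_t)&__BUILD_DATE. The helper name decode_build_date is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: splits a YYYYMMDD stamp, such as the integer
   obtained from (uintptr_t)&__BUILD_DATE, into year, month and day. */
void decode_build_date(uintptr_t stamp,
                       unsigned *year, unsigned *month, unsigned *day)
{
    assert(year && month && day);

    *year  = (unsigned)(stamp / 10000u);          /* leading digits  */
    *month = (unsigned)((stamp / 100u) % 100u);   /* middle two      */
    *day   = (unsigned)(stamp % 100u);            /* last two digits */
}
```

In a program linked with the --defsym option shown above, the call would look like decode_build_date((uintptr_t)&__BUILD_DATE, &y, &m, &d). The helper itself needs no special linker flags, so it can be tested on its own with a literal such as 20240131.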

How to specify base addresses for sections when linking or alternatively how to rebase a section after linking?

Judging by the question you reference and the tag of Linux, I am going to assume that you are using GNU ld.

The short answer for GNU ld is yes, sections can be placed at specific addresses.

The longer answer is that you will need to create a custom linker script to do that, which can be specified with the -T option for ld. If you are using gcc as a wrapper around ld, you will need to pass the option through to the linker via the gcc -Wl, option.

The linker script will have to include something like the following:

SECTIONS {
    .text 0x08049000 :
    {
        foo.o (.text)
        bar.o (.text)
    }
}

Something to watch out for, though, is that the -T option replaces the default linker script that ld uses. You may want to modify the default linker script to do what you want. The default linker script can be dumped by passing the --verbose option to ld without any other options.

More info about linker scripts is available in the LD Manual.

GNU LD: How to override a symbol value (an address) defined by the linker script specified using -T

While waiting for someone to respond, I resolved the issue. There are a few issues with the problem here, and I thought I would explain my findings for anyone who might make the same mistake.

First of all, any options to be passed to the linker must be specified with -Xlinker or with -Wl. Hence both 2 and 3 won't work in the above case. The corrected 2 and 3 would be as follows:

  1. Is correct already

  2. gcc $(OBJS) -l$(Lib1) -l$(Lib2) -nostdlib -lgcc -L$(library_path) -g -msmall-mode -mconst-switch-tables -mas-mode -mno-initc -Wl,--start-group,--end-group,-T,$(PATH_TO_Linker.ld),--gc-sections -Xlinker --defsym=SYMBOL_RAM_START=$(VALUE_TO_OVERRIDE) -o$(OUTPUT).elf

  3. gcc $(OBJS) -l$(Lib1) -l$(Lib2) -Xlinker --defsym=SYMBOL_RAM_START=$(VALUE_TO_OVERRIDE) -nostdlib -lgcc -L$(library_path) -g -msmall-mode -mconst-switch-tables -mas-mode -mno-initc -Wl,--start-group,--end-group,-T,$(PATH_TO_Linker.ld),--gc-sections -o$(OUTPUT).elf

Now for the case of options 1 & 2 above, --defsym comes after the linker script, and SYMBOL_RAM_START was already defined by the linker script. It does override it. But the overridden value will not be used, because the sections have already been laid out by the time the linker script has been processed.

For the case of option 3 above, SYMBOL_RAM_START was defined before the linker script was read by the linker. Hence, when the linker script is parsed, the value specified in the script overrides it.

Solution:

In order for this to work, the linker script will need to conditionally initialize the symbol SYMBOL_RAM_START, something like below:

SYMBOL_RAM_START = DEFINED( SYMBOL_RAM_START ) ? SYMBOL_RAM_START : DEFAULT_VALUE ;

Given the above in the linker script, when SYMBOL_RAM_START was defined before the linker script is included (as shown in option 3 above), it did work. But in the end I had to patch the linker script.

This solution doesn't really override the symbol, but provides a way in which a symbol can be defined so that it can be overridden.


