How __libc_start_main@plt Works

How does __libc_start_main@plt work?

80482e0:       ff 25 10 a0 04 08       jmp    *0x804a010

This means "retrieve the 4-byte address stored at 0x804a010 and jump to it."

804a010:       e6 82                   out    %al,$0x82
804a012:       04 08                   add    $0x8,%al

Those 4 bytes will be treated as an address, 0x80482e6, not as instructions.

80482e0:       ff 25 10 a0 04 08       jmp    *0x804a010
80482e6:       68 08 00 00 00          push   $0x8
80482eb:       e9 d0 ff ff ff          jmp    80482c0 <_init+0x2c>

So we've just executed an instruction that has moved us exactly one instruction forward. At this point, you're probably wondering if there's a good reason for this.

There is. This is a typical PLT/GOT implementation. Much more detail, including a diagram, is at Position Independent Code in shared libraries: The Procedure Linkage Table.

The real code for __libc_start_main is in a shared library, glibc. The compiler and compile-time linker don't know where the code will be at run-time, so they place in your compiled program a short __libc_start_main function which contains just three instructions:

jump to a location specified by the 4th (or 5th, depending on whether you like to count from 0 or 1) entry in the GOT
push $8 onto the stack
jump to a resolver routine

The first time you call __libc_start_main, the resolver code will run. It will find the actual location of __libc_start_main in a shared library and will patch the 4th entry of the GOT to be that address. If your program calls __libc_start_main again, the jmp *0x804a010 instruction will take the program directly to the code in the shared library.

Can anyone recommend good material on this?

The x86 Assembly book at Wikibooks might be one place to start.

What's going on in __libc_start_main?

The first block, ending in "@plt", is the procedure linkage table (https://stackoverflow.com/a/5469334/994153). The jmp *0x8049658 is an indirect branch instruction, so it jumps to __libc_start_main wherever that function ends up being loaded in memory at runtime.

The real address of __libc_start_main ends up in the slot at 0x8049658. The DYNAMIC RELOCATION RECORDS table (as printed by objdump -R) tells the dynamic loader which slots to fill in for which symbols when the program is loaded.

Understanding assembly language _start label in a C program

Here is the well commented assembly source of the code you posted.

Summarized, it does the following things:

Establish a sentinel stack frame with ebp = 0 so code that walks the stack can find its end easily
Pop the number of command line arguments into esi so we can pass it to __libc_start_main
Align the stack pointer to a multiple of 16 bytes in order to comply with the ABI. This is not guaranteed to be the case in some versions of Linux so it has to be done manually just in case.
The addresses of __libc_csu_fini, __libc_csu_init, the argument vector, the number of arguments and the address of main are pushed as arguments to __libc_start_main
__libc_start_main is called. This function (source code here) sets up some glibc-internal variables and eventually calls main. It never returns.
If for any reason __libc_start_main should return, a hlt instruction is placed afterwards. This instruction is not allowed in user code and should cause the program to crash (hopefully).
The final series of nop instructions is padding inserted by the assembler so the next function starts at a multiple of 16 bytes for better performance. It is never reached in normal execution.

How to override the entrypoint like __libc_start_main with g++

I was able to use __attribute__((constructor)) to add logic before the main function, instead of overriding __libc_start_main.

This could be a difference in implementation between g++ and gcc, and it may not be possible to override __libc_start_main with the current implementation of g++. Please comment and let me know if I'm wrong.

ELF-Binary compiled by gcc: What happens from entry point to main?

I found a nice blog article about the topic: https://web.archive.org/web/20130325140610/http://bharathi.posterous.com/bash-prompt-to-main-call

Short answer: __libc_start_main() is a libc function that calls the main function (and does a lot of other things). Its address is resolved at startup (see BlackBear's link), which is why you cannot follow the steps from the program entry point to main by static analysis alone.

But you can figure out the address of the main function through the push before __libc_start_main is called.

0x8048417 <_start+23>: push 0x80484b4

@BlackBear: Thank you for the link!

How to disassemble the main function of a stripped application?

OK, here is a big edit of my previous answer. I think I have found a way now.

You (still :) have this specific problem:

(gdb) disas main
No symbol table is loaded.  Use the "file" command.

Now, if you compile the code (I added a return 0 at the end), you will get with gcc -S:

    pushq   %rbp
    movq    %rsp, %rbp
    movl    $.LC0, %edi
    call    puts
    movl    $0, %eax
    leave
    ret

Now, you can see that your binary gives you some info:

Stripped:

(gdb) info files
Symbols from "/home/beco/Documents/fontes/cpp/teste/stackoverflow/distrip".
Local exec file:
    `/home/beco/Documents/fontes/cpp/teste/stackoverflow/distrip', file type elf64-x86-64.
    Entry point: 0x400440
    0x0000000000400238 - 0x0000000000400254 is .interp
    ...
    0x00000000004003a8 - 0x00000000004003c0 is .rela.dyn
    0x00000000004003c0 - 0x00000000004003f0 is .rela.plt
    0x00000000004003f0 - 0x0000000000400408 is .init
    0x0000000000400408 - 0x0000000000400438 is .plt
    0x0000000000400440 - 0x0000000000400618 is .text
    ...
    0x0000000000601010 - 0x0000000000601020 is .data
    0x0000000000601020 - 0x0000000000601030 is .bss

The most important entry here is .text. It is the common name for the section where the code starts, and from the listing of main below and from the section's size, you can see that it includes main. If you disassemble it, you will see a call to __libc_start_main. Most importantly, you are disassembling from a good entry point that is real code (so you are not misled into interpreting DATA as CODE).

disas 0x0000000000400440,0x0000000000400618
Dump of assembler code from 0x400440 to 0x400618:
   0x0000000000400440:  xor    %ebp,%ebp
   0x0000000000400442:  mov    %rdx,%r9
   0x0000000000400445:  pop    %rsi
   0x0000000000400446:  mov    %rsp,%rdx
   0x0000000000400449:  and    $0xfffffffffffffff0,%rsp
   0x000000000040044d:  push   %rax
   0x000000000040044e:  push   %rsp
   0x000000000040044f:  mov    $0x400540,%r8
   0x0000000000400456:  mov    $0x400550,%rcx
   0x000000000040045d:  mov    $0x400524,%rdi
   0x0000000000400464:  callq  0x400428 <__libc_start_main@plt>
   0x0000000000400469:  hlt
   ...

   0x000000000040046c:  sub    $0x8,%rsp
   ...
   0x0000000000400482:  retq   
   0x0000000000400483:  nop
   ...
   0x0000000000400490:  push   %rbp
   ..
   0x00000000004004f2:  leaveq 
   0x00000000004004f3:  retq   
   0x00000000004004f4:  data32 data32 nopw %cs:0x0(%rax,%rax,1)
   ...
   0x000000000040051d:  leaveq 
   0x000000000040051e:  jmpq   *%rax
   ...
   0x0000000000400520:  leaveq 
   0x0000000000400521:  retq   
   0x0000000000400522:  nop
   0x0000000000400523:  nop
   0x0000000000400524:  push   %rbp
   0x0000000000400525:  mov    %rsp,%rbp
   0x0000000000400528:  mov    $0x40062c,%edi
   0x000000000040052d:  callq  0x400418 <puts@plt>
   0x0000000000400532:  mov    $0x0,%eax
   0x0000000000400537:  leaveq 
   0x0000000000400538:  retq

__libc_start_main takes a pointer to main() as its first argument. On x86-64 the first argument is passed in %rdi rather than on the stack, so the value loaded into %rdi just before the call is the address of main().

   0x000000000040045d:  mov    $0x400524,%rdi
   0x0000000000400464:  callq  0x400428 <__libc_start_main@plt>

Here it is 0x400524 (as we already know). Now set a breakpoint and try this:

(gdb) break *0x400524
Breakpoint 1 at 0x400524
(gdb) run
Starting program: /home/beco/Documents/fontes/cpp/teste/stackoverflow/disassembly/d2 

Breakpoint 1, 0x0000000000400524 in main ()
(gdb) n
Single stepping until exit from function main, 
which has no line number information.
hello 1
__libc_start_main (main=<value optimized out>, argc=<value optimized out>, ubp_av=<value optimized out>, 
    init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, 
    stack_end=0x7fffffffdc38) at libc-start.c:258
258 libc-start.c: No such file or directory.
    in libc-start.c
(gdb) n

Program exited normally.
(gdb)

Now you can disassemble it using:

(gdb) disas 0x0000000000400524,0x0000000000400600
Dump of assembler code from 0x400524 to 0x400600:
   0x0000000000400524:  push   %rbp
   0x0000000000400525:  mov    %rsp,%rbp
   0x0000000000400528:  sub    $0x10,%rsp
   0x000000000040052c:  movl   $0x1,-0x4(%rbp)
   0x0000000000400533:  mov    $0x40064c,%eax
   0x0000000000400538:  mov    -0x4(%rbp),%edx
   0x000000000040053b:  mov    %edx,%esi
   0x000000000040053d:  mov    %rax,%rdi
   0x0000000000400540:  mov    $0x0,%eax
   0x0000000000400545:  callq  0x400418 <printf@plt>
   0x000000000040054a:  mov    $0x0,%eax
   0x000000000040054f:  leaveq 
   0x0000000000400550:  retq   
   0x0000000000400551:  nop
   0x0000000000400552:  nop
   0x0000000000400553:  nop
   0x0000000000400554:  nop
   0x0000000000400555:  nop
   ...

This is primarily the solution.

BTW, this is different code, used to check that the method works. That is why the assembly above is a bit different. It comes from this C file:

#include <stdio.h>

int main(void)
{
    int i=1;
    printf("hello %d\n", i);
    return 0;
}

But!

If this does not work, you still have some hints:

From now on, you should be looking to set breakpoints at the beginning of each function. A function usually starts right after the ret of the previous one (possibly after some nop padding). The first entry point is the start of .text itself. This is the assembly start, but not main.

The problem is that a breakpoint does not always let you step through the program. Like this one at the very start of .text:

(gdb) break *0x0000000000400440
Breakpoint 2 at 0x400440
(gdb) run
Starting program: /home/beco/Documents/fontes/cpp/teste/stackoverflow/disassembly/d2 

Breakpoint 2, 0x0000000000400440 in _start ()
(gdb) n
Single stepping until exit from function _start, 
which has no line number information.
0x0000000000400428 in __libc_start_main@plt ()
(gdb) n
Single stepping until exit from function __libc_start_main@plt, 
which has no line number information.
0x0000000000400408 in ?? ()
(gdb) n
Cannot find bounds of current function

So you need to keep trying until you find your way, setting breakpoints at:

From the other answer, we should keep this info:

In the non-stripped version of the file, we see:

(gdb) disas main
Dump of assembler code for function main:
   0x0000000000400524 <+0>: push   %rbp
   0x0000000000400525 <+1>: mov    %rsp,%rbp
   0x0000000000400528 <+4>: mov    $0x40062c,%edi
   0x000000000040052d <+9>: callq  0x400418 <puts@plt>
   0x0000000000400532 <+14>:    mov    $0x0,%eax
   0x0000000000400537 <+19>:    leaveq 
   0x0000000000400538 <+20>:    retq   
End of assembler dump.

Now we know that main spans 0x0000000000400524 to 0x0000000000400539. If we use the same range to look at the stripped binary, we get the same result:

(gdb) disas 0x0000000000400524,0x0000000000400539
Dump of assembler code from 0x400524 to 0x400539:
   0x0000000000400524:  push   %rbp
   0x0000000000400525:  mov    %rsp,%rbp
   0x0000000000400528:  mov    $0x40062c,%edi
   0x000000000040052d:  callq  0x400418 <puts@plt>
   0x0000000000400532:  mov    $0x0,%eax
   0x0000000000400537:  leaveq 
   0x0000000000400538:  retq   
End of assembler dump.

So, unless you can get a hint about where main starts (for example, from another build of the code that has symbols), another option is to learn what its first assembly instructions look like, so that you can disassemble at specific places and check whether they match. If you have no access to the code at all, you can still read the ELF specification to understand which sections should appear in the binary and try a calculated address. Still, you need information about the sections!

That is hard work, my friend! Good luck!

Beco

how to get the actual address of `func` from `callq func@PLT`

You can only find out about that at runtime, after the dynamic linker resolves the actual load address.

Warning: What follows is slightly deeper magic ...

To illustrate what's happening use a debugger:

#include <stdio.h>

int main(int argc, char **argv) { printf("Hello, World!\n"); return 0; }

Compile it (gcc -O8 ...). objdump -d on the binary shows (the optimization of printf() being substituted with puts() for a plain string notwithstanding ...):

Disassembly of section .init:
[ ... ]
Disassembly of section .plt:

0000000000400408 <__libc_start_main@plt-0x10>:
  400408:  ff 35 a2 04 10 00       pushq  1049762(%rip)        # 5008b0 <_GLOBAL_OFFSET_TABLE_+0x8>
  40040e:  ff 25 a4 04 10 00       jmpq   *1049764(%rip)        # 5008b8 <_GLOBAL_OFFSET_TABLE_+0x10>
[ ... ]
0000000000400428 <puts@plt>:
  400428:  ff 25 9a 04 10 00       jmpq   *1049754(%rip)   # 5008c8 <_GLOBAL_OFFSET_TABLE_+0x20>
  40042e:  68 01 00 00 00          pushq  $0x1
  400433:  e9 d0 ff ff ff          jmpq   400408 <_init+0x18>
[ ... ]
0000000000400500 <main>:
  400500:  48 83 ec 08             sub    $0x8,%rsp
  400504:  bf 0c 06 40 00          mov    $0x40060c,%edi
  400509:  e8 1a ff ff ff          callq  400428 <puts@plt>
  40050e:  31 c0                   xor    %eax,%eax
  400510:  48 83 c4 08             add    $0x8,%rsp
  400514:  c3                      retq

Now load it into gdb. Then:


$ gdb ./tcc
GNU gdb Red Hat Linux (6.3.0.0-0.30.1rh)
[ ... ]
(gdb) x/3i 0x400428
0x400428:       jmpq   *1049754(%rip)        # 0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>
0x40042e:       pushq  $0x1
0x400433:       jmpq   0x400408
(gdb) x/gx 0x5008c8
0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>:    0x000000000040042e

Notice this value points back to the instruction directly following the first jmpq; this means the puts@plt slot, on first invocation, will simply "fall through" to:


(gdb) x/3i 0x400408
0x400408:       pushq  1049762(%rip)        # 0x5008b0 <_GLOBAL_OFFSET_TABLE_+8>
0x40040e:       jmpq   *1049764(%rip)        # 0x5008b8 <_GLOBAL_OFFSET_TABLE_+16>
0x400414:       nop
(gdb) x/gx 0x5008b0
0x5008b0 <_GLOBAL_OFFSET_TABLE_+8>:     0x0000000000000000
(gdb) x/gx 0x5008b8
0x5008b8 <_GLOBAL_OFFSET_TABLE_+16>:    0x0000000000000000

The function address and argument aren't initialized yet.

This is the state just after program load, but before executing. Now start executing it:

(gdb) break main
Breakpoint 1 at 0x400500
(gdb) run
Starting program: tcc
(no debugging symbols found)
(no debugging symbols found)

Breakpoint 1, 0x0000000000400500 in main ()
(gdb)  x/i 0x400428
0x400428:  jmpq   *1049754(%rip)        # 0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>
(gdb) x/gx 0x5008c8
0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>:    0x000000000040042e

So this hasn't changed yet - but the targets (the GOT contents for the libc initialization) are different now:


(gdb) x/gx 0x5008b0
0x5008b0 <_GLOBAL_OFFSET_TABLE_+8>:     0x0000002a9566b9a8
(gdb) x/gx 0x5008b8
0x5008b8 <_GLOBAL_OFFSET_TABLE_+16>:    0x0000002a955609f0
(gdb) disas 0x0000002a955609f0
Dump of assembler code for function _dl_runtime_resolve:
0x0000002a955609f0 <_dl_runtime_resolve+0>:     sub    $0x38,%rsp
[ ... ]

I.e. at program load time, the dynamic linker will resolve the "init" parts first. It substitutes the GOT references with pointers that redirect into the dynamic linking code.

Therefore, when first calling an external-to-the-binary function through the .plt reference, it'll jump into the linker again. Let it do that, then inspect the program after that - the state has changed again:


(gdb) break *0x0000000000400514
Breakpoint 2 at 0x400514
(gdb) continue
Continuing.
Hello, World!

Breakpoint 2, 0x0000000000400514 in main ()
(gdb) x/i 0x400428
0x400428:  jmpq   *1049754(%rip)        # 0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>
(gdb) x/gx 0x5008c8
0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>:    0x0000002a956c8870
(gdb) disas 0x0000002a956c8870
Dump of assembler code for function puts:
0x0000002a956c8870 <puts+0>:    mov    %rbx,0xffffffffffffffe0(%rsp)
[ ... ]

So there's your redirect right into libc now - the PLT reference to puts() finally got resolved.

The instructions telling the linker where to insert the actual function load addresses (as we've seen it do for _dl_runtime_resolve) come from special sections in the ELF binary:

$ readelf -a tcc
[ ... ]
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
[ ... ]
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
[ ... ]
Dynamic section at offset 0x700 contains 21 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
[ ... ]
Relocation section '.rela.plt' at offset 0x3c0 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000005008c0  000100000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
0000005008c8  000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0

There's more to ELF than just the above, but these three pieces tell the kernel's binary format handler that this ELF binary has an interpreter (the dynamic linker) that needs to be loaded and initialized first, that it requires libc.so.6, and that the slots at 0x5008c0 and 0x5008c8 in the program's writable data section must be filled with the load addresses of __libc_start_main and puts, respectively, when dynamic linking is actually performed.

How exactly that happens, from ELF's point of view, is up to the details of the interpreter (aka, the dynamic linker implementation).
