"Call 0X80482F0 <Puts@Plt>"? Just Need Clarification of One Line of Code in a 'Hello World' Program in X86 Assembly

call 0x80482f0 puts@plt? Just need clarification of one line of code in a 'hello world' program in x86 assembly

0x80482f0 is the address of the puts function. To be more precise, it points to the entry for puts() in the Procedure Linkage Table (PLT), which is basically just a bunch of JMP <some routine in a so-library>s (it's a little more complex than that, but that's not important for this discussion). The puts function looks for its argument on the stack, i.e., at [esp] at the moment the call instruction executes.

You may be wondering where that puts() call came from. The compiler here was smart enough to see that you didn't actually use any format specifiers in your call to printf(), and replaced that call with a call to the (somewhat faster) puts(). If you look closely, you'll see that it also removed the newline from your string, because puts() appends a newline after printing the string it is given.

What exactly does puts@plt mean?

PLT means Procedure Linkage Table. It is a special technique used in ELF files to localize fixing up at load time on machines where relative addressing is available.

The function you're calling is located in another module (typically, libc.so.x), therefore the actual address of the function must be provided when the program is loaded for execution.

The PLT is essentially an area in your executable file (or .so file) where all outstanding references are collected together. They have the form of the target machine's jump instruction with the actual address remaining unfilled. It is up to the loader to fill in the addresses. The process is called fixing up.

Because the remaining part of your module makes function calls through the PLT using relative addressing, and the offset to the PLT is known at the time of linking, nothing has to be fixed up there. This means that most of your module can remain mapped directly onto the module file instead of being backed by the swap file.

Note also that the PLT has a counterpart, the GOT (Global Offset Table). While the PLT is used for function calls, the GOT is used for data.

Why is the assembly code generated by hello world in C not having a .code segment nor model tiny like x86 assembly does?

The code does not match your command line. That is neither C (file name) nor C++ code (command line). That is assembly language.

Assembly language varies by tool (masm, tasm, nasm, gas, etc.), and is not expected to be compatible or standard in any way. This is not just a matter of Intel vs. AT&T syntax; it covers all of the code, and it applies to all targets, not just x86, as is easily seen with ARM and others.

You should use the assembler directly rather than a C or C++ compiler, because going through the compiler creates yet another assembly language. gcc, for example, will pass the assembly language on to gas, but it can first pre-process it through the C preprocessor, creating yet another programming language that is incompatible with the GNU assembler it is fed to.

x86 is the last assembly language/instruction set you want to learn, if ever. If you are going to learn it, then starting with the 8086/88 is IMO the preferred way; it is much more understandable despite the nuances. Since this appears to be for a class, you are stuck with this ISA and cannot choose a better first instruction set. (first, second, third...)

Very much within the x86 world, but also for any other target, expect that the language is incompatible between tools; if it happens to work or mostly work, that is a bonus. Likewise, there is no reason to assume that any tool will have a "masm compatible" or other mode. Intel vs. AT&T is only a fraction of the language problem, and choosing one is in no way expected to make the code port between tools.

The bottom line: rewrite the code in the assembly language of the assembler you are actually using.

Difference between rip and eip registers in x86 Assembly

The book is written for the 32-bit x86 architecture, which had 32-bit registers named eax, ebp, eip, etc. Your computer, like most present-day x86 machines, is using the 64-bit amd64 (aka x86-64) architecture, which is designed to be similar to 32-bit x86, but among many other differences has 64-bit registers named rax, rbp, rip, etc.

Although the architectures are similar at a conceptual level, exploitation relies on very specific details. Issues like differences in calling conventions are going to mean that most of this book will not be applicable to 64-bit systems, and is thus obsolete.

If you want, you can test the book's examples on programs compiled for 32-bit mode (gcc -m32).

Description of assembly hello world

I'd recommend that you read Intel's Software Developer's Manual (especially volume 2), and/or some x86 assembly tutorial (like The Art of Assembly).

Breakdown of the code:

1) jmp 115

Jumps to the mov ah,09 instruction, so that the CPU doesn't try to execute the 'Hello world' string as if it was code (the CPU can't tell the difference between code and data).

2) db 'Hello world!$'

Declares a string. The dollar-sign is used as a string terminator by some DOS interrupt functions.

3) -a 115

Tells debug to assemble subsequent code starting at address 115.

4) mov ah, 09

Puts the value 9 in register ah.

5) mov dx, 102

Puts the address of the 'Hello world' string (offset 102, right after the two-byte jmp at 100) in register dx.

6) int 21

Performs interrupt 21h / function 9 (write string). The function number is expected in register ah and the string offset in register dx, which was taken care of by the previous two instructions.

7) int 20

Performs interrupt 20h (terminate program).

What parts of this HelloWorld assembly code are essential if I were to write the program in assembly?

The absolute bare minimum that will work on the platform that this appears to be, is

        .globl main
main:
        pushl   $.LC0
        call    puts
        addl    $4, %esp
        xorl    %eax, %eax
        ret
.LC0:
        .string "Hello world"

But this breaks a number of ABI requirements. The minimum for an ABI-compliant program is

        .globl  main
        .type   main, @function
main:
        subl    $24, %esp
        pushl   $.LC0
        call    puts
        xorl    %eax, %eax
        addl    $28, %esp
        ret
        .size main, .-main
        .section .rodata
.LC0:
        .string "Hello world"

Everything else in your object file is either the compiler not optimizing the code down as tightly as possible, or optional annotations to be written to the object file.

The .cfi_* directives, in particular, are optional annotations. They are necessary if and only if the function might be on the call stack when a C++ exception is thrown, but they are useful in any program from which you might want to extract a stack trace. If you are going to write nontrivial code by hand in assembly language, it will probably be worth learning how to write them. Unfortunately, they are very poorly documented; I am not currently finding anything that I think is worth linking to.

The line

.section    .note.GNU-stack,"",@progbits

is also important to know about if you are writing assembly language by hand; it is another optional annotation, but a valuable one, because what it means is "nothing in this object file requires the stack to be executable." If all the object files in a program have this annotation, the kernel won't make the stack executable, which improves security a little bit.

(To indicate that you do need the stack to be executable, you put "x" instead of "". GCC may do this if you use its "nested function" extension. (Don't do that.))

It is probably worth mentioning that in the "AT&T" assembly syntax used (by default) by GCC and GNU binutils, there are three kinds of lines: A line with a single token on it, ending in a colon, is a label. (I don't remember the rules for what characters can appear in labels.) A line whose first token begins with a dot, and does not end in a colon, is some kind of directive to the assembler. Anything else is an assembly instruction.