Labels in GCC Inline Assembly


A declaration of a local label is indeed a number followed by a colon. But a reference to a local label needs a suffix of f or b, depending on whether you want to look forwards or backwards - i.e. 1f refers to the next 1: label in the forwards direction.

So declaring the label as 1: is correct; but to reference it, you need to say jmp 1f (because you are jumping forwards in this case).
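As a minimal sketch of both directions, assuming an x86 target (32- or 64-bit) and GCC or Clang, the hypothetical function sum_to below adds n + (n-1) + ... + 1. It uses a backward reference (1b) for the loop and a forward reference (2f) to skip the loop when n is zero:

```c
/* Hypothetical example, x86 only: numeric local labels with f/b suffixes. */
unsigned sum_to(unsigned n)
{
    unsigned total = 0;

    __asm__ (
        "test %[n], %[n]\n\t"
        "jz 2f\n"                    /* forward: n == 0, skip the loop */
        "1:\n\t"
        "add %[n], %[total]\n\t"     /* total += n */
        "dec %[n]\n\t"               /* n -= 1, sets ZF when n reaches 0 */
        "jnz 1b\n"                   /* backward: repeat while n != 0 */
        "2:"
        : [total] "+r" (total), [n] "+r" (n)
        :
        : "cc");

    return total;
}
```

With this definition, sum_to(4) evaluates to 10 and sum_to(0) to 0.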

How Do I Use Labels In GCC Inline Assembly?

There are plenty of tutorials - including this one (probably the best I know of), and some info on operand size modifiers.

Here's the first implementation - swap_2 :

void swap_2 (int *a, int *b)
{
    int tmp0, tmp1;

    __asm__ volatile (
        "movl (%0), %k2\n\t" /* %2 (tmp0) = (*a) */
        "movl (%1), %k3\n\t" /* %3 (tmp1) = (*b) */
        "cmpl %k3, %k2\n\t"
        "jle %=f\n\t"        /* if (%2 <= %3) (at&t!) */
        "movl %k3, (%0)\n\t"
        "movl %k2, (%1)\n\t"
        "%=:\n\t"

        : "+r" (a), "+r" (b), "=r" (tmp0), "=r" (tmp1) :
        : "memory" /* "cc" */ );
}

A few notes:

  • volatile (or __volatile__) is required, as the compiler only 'sees' (a) and (b) (and doesn't 'know' you're potentially exchanging their contents), and would otherwise be free to optimize the whole asm statement away - tmp0 and tmp1 would otherwise be considered unused variables too.

  • "+r" means that this is both an input and output that may be modified; only it isn't in this case, and they could strictly be input only - more on that in a bit...

  • The 'l' suffix on 'movl' isn't really necessary; neither is the 'k' (32-bit) length modifier for the registers. Since you're using the Linux (ELF) ABI, an int is 32 bits for both IA32 and x86-64 ABIs.

  • The %= token generates a unique label for us. BTW, the jump syntax <label>f means a forward jump, and <label>b means back.

  • For correctness, we need "memory" as the compiler has no way of knowing if values from dereferenced pointers have been changed. This may be an issue in more complex inline asm surrounded by C code, as it invalidates all currently held values in memory - and is often a sledgehammer approach. Appearing at the end of a function in this fashion, it's not going to be an issue - but you can read more on it here (see: Clobbers)

  • The "cc" flags register clobber is detailed in the same section. On x86, it does nothing. Some writers include it for clarity, but since practically all non-trivial asm statements affect the flags register, it's just assumed to be clobbered by default.
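When a named label is preferable to a bare number, a common idiom is to append %= to a name, so that every expansion of the asm statement gets its own label and no f or b suffix is needed. This is a minimal sketch, assuming x86 and an ELF target (where the .L prefix marks assembler-local labels). The names abs_value and abs_sum are hypothetical:

```c
/* Hypothetical example, x86 + ELF: a named label made unique with %=. */
static inline int abs_value(int v)
{
    __asm__ (
        "test %[v], %[v]\n\t"
        "jns .Lnonneg%=\n\t"     /* v >= 0: nothing to do */
        "neg %[v]\n"             /* v = -v */
        ".Lnonneg%=:"
        : [v] "+r" (v)
        :
        : "cc");

    return v;
}

/* If the compiler inlines abs_value twice here, each copy of the asm
   statement gets a different number substituted for %=. */
int abs_sum(int a, int b)
{
    return abs_value(a) + abs_value(b);
}
```

Writing the label as plain .Lnonneg instead would risk a duplicate-symbol error from the assembler as soon as the statement is emitted more than once in the same file.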

Here's the C implementation - swap_1 :

void swap_1 (int *a, int *b)
{
    if (*a > *b)
    {
        int t = *a; *a = *b; *b = t;
    }
}

Compiling both with gcc -O2 for x86-64 ELF, I get identical code. It's just a bit of luck that the compiler picked the same free registers for tmp0 and tmp1 as it did for its own temporaries. Cutting out the noise (the .cfi directives, etc.) gives:

swap_2:
        movl (%rdi), %eax
        movl (%rsi), %edx
        cmpl %edx, %eax
        jle 21f
        movl %edx, (%rdi)
        movl %eax, (%rsi)
21:
        ret

As stated, the swap_1 code was identical, except that the compiler chose .L1 for its jump label. Compiling with -m32, the two functions again matched each other (apart from using the tmp registers in a different order). There's more overhead in the 32-bit code, as the IA32 ELF ABI passes parameters on the stack, whereas the x86-64 ABI passes the first two parameters in %rdi and %rsi respectively.


Treating (a) and (b) as input only - swap_3 :

void swap_3 (int *a, int *b)
{
    int tmp0, tmp1;

    __asm__ volatile (
        "mov (%[a]), %[x]\n\t" /* x = (*a) */
        "mov (%[b]), %[y]\n\t" /* y = (*b) */
        "cmp %[y], %[x]\n\t"
        "jle %=f\n\t"          /* if (x <= y) (at&t!) */
        "mov %[y], (%[a])\n\t"
        "mov %[x], (%[b])\n\t"
        "%=:\n\t"

        : [x] "=&r" (tmp0), [y] "=&r" (tmp1)
        : [a] "r" (a), [b] "r" (b) : "memory" /* "cc" */ );
}

I've done away with the 'l' suffix and 'k' modifiers here, because they're not needed. I've also used the 'symbolic name' syntax for operands, as it often helps to make the code more readable.

(a) and (b) are now indeed input-only registers. So what's the "=&r" syntax mean? The & denotes an early clobber operand. In this case, the value may be written to before we finish using the input operands, and therefore the compiler must choose registers different from those selected for the input operands.

Once again, the compiler generates identical code as it did for swap_1 and swap_2.
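As an alternative to the "memory" clobber, the two ints can be declared as read-write memory operands, which tells the compiler exactly which memory the statement touches. The following is a sketch under the same x86 assumptions. The name swap_4 is hypothetical and not part of the original answer:

```c
/* Hypothetical variant: "+m" operands instead of a "memory" clobber. */
void swap_4(int *a, int *b)
{
    int tmp0, tmp1;

    __asm__ (
        "mov %[pa], %[x]\n\t"   /* x = *a */
        "mov %[pb], %[y]\n\t"   /* y = *b */
        "cmp %[y], %[x]\n\t"
        "jle %=f\n\t"           /* if (x <= y) skip the stores */
        "mov %[y], %[pa]\n\t"   /* *a = y */
        "mov %[x], %[pb]\n"     /* *b = x */
        "%=:"
        : [x] "=&r" (tmp0), [y] "=&r" (tmp1),
          [pa] "+m" (*a), [pb] "+m" (*b)
        :
        : "cc");
}
```

Because *a and *b are now listed as outputs, the compiler can see what the statement modifies, so neither volatile nor "memory" should be needed in this form. The early-clobber marks on x and y still matter, since the temporaries are written while the registers that address *a and *b are still in use.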


I wrote way more than I planned on this answer, but as you can see, it's very difficult to maintain awareness of all the information the compiler must be made aware of, as well as the idiosyncrasies of each instruction set (ISA) and ABI.

Jump to a label from inline assembly to C

This is what asm goto is for; see "GCC Inline Assembly: Jump to label outside block".

Note that defining a label inside another asm statement will sometimes work (e.g. with optimization disabled) but IS NOT SAFE.

    asm("end:");   // BROKEN; NEVER USE 
// except for toy experiments to look at compiler output

GNU C does not define the behaviour of jumping from one asm statement to another without asm goto. The compiler is allowed to assume that execution comes out the end of an asm statement and e.g. put a store after it.
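For comparison, here is a minimal sketch of the supported way to leave an asm statement for a C label. It assumes x86 and a compiler with asm goto (GCC 4.5 or later, or a recent Clang). The function name is_negative and the label neg are hypothetical:

```c
/* Hypothetical example, x86 only: jumping from inline asm to a C label. */
int is_negative(int x)
{
    __asm__ goto (
        "test %[x], %[x]\n\t"
        "js %l[neg]"            /* sign flag set: jump to the C label */
        :                       /* no outputs */
        : [x] "r" (x)
        : "cc"
        : neg);                 /* labels this statement may jump to */

    return 0;                   /* fall-through path */

neg:
    return 1;
}
```

The fourth section of the statement lists every C label the asm may branch to, and %l[neg] expands to the assembler name the compiler picked for it, so the template never has to guess a name such as .L123.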


The C end: label within a given function won't just have the asm symbol name of end or _end: - that wouldn't make sense because separate C functions are each allowed to have their own end: label. It could be something like main.end but it turns out GCC and clang just use their usual autonumbered labels like .L123.

Then how does this code work: https://github.com/IAIK/transientfail/blob/master/pocs/spectre/PHT/sa_oop/main.c

It doesn't; the end label that asm volatile("je end"); references is in the .data section and happens to be defined by the compiler or linker to mark the end of that section.

asm volatile("je end") has no connection to the C label in that function.

I commented out some of the code in other functions to get it to compile without the "cacheutils.h" header but that didn't affect that part of the oop() function; see https://godbolt.org/z/jabYu3 for disassembly of the linked executable with JE_4k changed to JE_16 so it's not huge. It's disassembly of a linked executable so you can see the numeric address of je 6010f0 <_end> while the oop function itself starts at 4006e0 and ends at 400750. (So it doesn't contain the branch target).

If this happens to work for Spectre exploits, that's because apparently the branch is never actually taken.

GCC Inline Assembly: Jump to label outside block

The code in this answer happens to work, but is undefined behaviour and will in general break things with optimization enabled. It's only safe to jmp out of an inline asm statement with asm goto, or under limited circumstances when the asm statement is followed by __builtin_unreachable();

(That's not usable here: it's never safe to jump into the middle of an inline asm statement and then fall out into compiler-generated code again inside a function.)


What if you define the label with the assembler?

asm("external_label:");

Update: this code seems to work (though, as noted above, it only happens to work and is not safe in general):

#include <stdio.h>

int
main(void)
{
    asm("jmp label");
    puts("You should not see this.");
    asm("label:");

    return 0;
}

What are asm labels in the C language?



They do not exist in the C programming language itself.

An asm label is a GCC extension to the C language that basically replaces the symbol name the compiler emits for a function with another name upon compilation.

This program:

void bar(void);
void func() { bar(); }

Compiles to:

func:
jmp bar

But this program:

void bar(void) asm("somename");
void func() { bar(); }

Compiles to:

func:
jmp somename

I believe the idea of the commit is to keep the compiler from optimizing the glibc code that tests sqrt, so that the test exercises the generic implementation rather than the built-in implementation the compiler would otherwise substitute.
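The renaming can also be observed from C alone. This is a sketch that assumes a Linux/ELF target, where a C identifier maps to an assembler symbol of the same spelling. All names in it (add_impl, demo_renamed_add, call_via_symbol) are hypothetical:

```c
/* Hypothetical example, ELF only: the definition of add_impl is emitted
   under the symbol name demo_renamed_add. */
int add_impl(int a, int b) __asm__("demo_renamed_add");

int add_impl(int a, int b)
{
    return a + b;
}

/* A separate C declaration whose default symbol name is demo_renamed_add,
   so calls to it resolve to the definition above. */
int demo_renamed_add(int a, int b);

int call_via_symbol(void)
{
    return demo_renamed_add(2, 3);
}
```

On targets that prefix C symbols with an underscore, the plain declaration would refer to a different symbol and the link would be expected to fail, which is why the ELF assumption matters here.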

Define a unique and global assembly label/symbol inside C functions

One problem is that you may get multiple copies of the same label due to inlining. Add the following attribute to functions containing these labels:

__attribute__((noinline))

Also note that you need to mark the symbol as global. Let's extract this into a macro so we can format nicely without changing the value of __LINE__:

#define MAKE_LABEL \
__asm__( \
"GLOBAL_LABEL_" UNIQUE(MYFILE_ID, __LINE__) ":" \
"\n\t.global GLOBAL_LABEL_" UNIQUE(MYFILE_ID, __LINE__) \
)

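But the macro expansion needs care. Arguments that are operands of # or ## are not macro-expanded before substitution, so UNIQUE has to pass its arguments through an extra macro level first. Here is the correct macro definition: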

#define UN(X) #X
#define UNIQUE2(X,Y) UN(X##Y)
#define UNIQUE(X,Y) UNIQUE2(X,Y)

Otherwise you will get __LINE__ instead of, say, 23.
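The effect of the extra level can be checked with ordinary string literals. In this sketch the UN, UNIQUE2 and UNIQUE macros are the ones above, while the value of MYFILE_ID, the LINE_NO stand-in for __LINE__, and the function names are assumptions made for illustration:

```c
#include <string.h>

#define UN(X) #X
#define UNIQUE2(X,Y) UN(X##Y)
#define UNIQUE(X,Y) UNIQUE2(X,Y)

#define MYFILE_ID demo_   /* hypothetical file identifier */
#define LINE_NO 23        /* fixed stand-in for __LINE__ */

/* One level only: ## pastes the unexpanded argument names together. */
const char *one_level(void)  { return UNIQUE2(MYFILE_ID, LINE_NO); }

/* Two levels: the arguments are expanded before they reach ##. */
const char *two_levels(void) { return UNIQUE(MYFILE_ID, LINE_NO); }

/* Returns 1 when the two-level form produced the expanded name. */
int expands_fully(void)
{
    return strcmp(two_levels(), "demo_23") == 0;
}
```

Here two_levels() returns "demo_23", whereas one_level() returns "MYFILE_IDLINE_NO", the same failure mode as getting __LINE__ in the label instead of the line number.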

gcc inline asm jump to a label with crossing throwing an exception

You must use the asm goto form to specify labels:

#include <iostream>
#include <stdexcept>

int main()
{
    int a, b;
    std::cin >> a >> b;
    asm goto ("movl %1, %%eax\n\t"
              "addl %2, %%eax\n\t"
              "jo %l3\n\t"          /* overflow: leave via the C++ label */
              "movl %%eax, (%0)"    /* no overflow: store the sum into a */
              :                     /* asm goto takes no outputs before GCC 11 */
              : "r"(&a), "m"(a), "m"(b)
              : "eax", "cc", "memory"
              : L_OVERFLOW);

    std::cout << a << std::endl;
    return 0;

L_OVERFLOW:
    throw std::overflow_error("overflow");
}

The idea is to tell the compiler that your inline assembly may jump to a label, so that the control flow involving that label is stated explicitly.

UPD: Also note that you need a GCC new enough to support this feature: asm goto was added in GCC 4.5. I tested on 4.8.1.

how to set labels in inline assembly?

In MSVC-style inline assembly (the __asm { } block syntax, which is not GCC's), use offset variableName to get the address of a variable. See reference here.

Example:

#include <stdio.h>

char format[] = "%s %s\n";
char hello[] = "Hello";
char world[] = "world";

int main( void )
{
    __asm
    {
        mov eax, offset world
        push eax
        mov eax, offset hello
        push eax
        mov eax, offset format
        push eax
        call printf
        //clean up the stack so that main can exit cleanly
        //use the unused register ebx to do the cleanup
        pop ebx
        pop ebx
        pop ebx
    }
}
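GCC's extended asm has no offset keyword. The usual approach there is to hand the address to the statement as an input operand. This is a minimal sketch, assuming x86 and GCC or Clang. The function name address_of_world is hypothetical:

```c
char world[] = "world";

/* Hypothetical example, x86 only: the compiler places the address of
   world in a register, and the asm copies it into the output register. */
const char *address_of_world(void)
{
    const char *p;

    __asm__ ("mov %[src], %[dst]"
             : [dst] "=r" (p)
             : [src] "r" (world));

    return p;
}
```

The returned pointer compares equal to world. The statement never dereferences the address, so no memory operand or "memory" clobber is involved in this sketch.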

ARM Assembly Local Labels

The important difference is that numbered local labels can be reused without worry, which is why you need to specify the direction too. You can jump to the nearest preceding or following one, but not to any beyond it.

1: foo
...
1: bar
...
jmp 1b # jumps to bar
...
jmp 1f # jumps to baz
...
1: baz
...
1: qux
...
jmp 1b # jumps to qux

As long as you use them only within a single block, you can be sure they will work as intended and not conflict with anything else.
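The same reuse works inside GCC inline assembly. In this sketch (x86 assumed, with the hypothetical function name clamp), label 1 is defined twice in one asm statement, and each 1f resolves to the nearest following definition:

```c
/* Hypothetical example, x86 only: the numeric label 1 is reused. */
int clamp(int v, int lo, int hi)
{
    __asm__ (
        "cmp %[lo], %[v]\n\t"
        "jge 1f\n\t"             /* v >= lo: skip to the first 1: */
        "mov %[lo], %[v]\n"      /* v = lo */
        "1:\n\t"
        "cmp %[hi], %[v]\n\t"
        "jle 1f\n\t"             /* v <= hi: skip to the second 1: */
        "mov %[hi], %[v]\n"      /* v = hi */
        "1:"
        : [v] "+r" (v)
        : [lo] "r" (lo), [hi] "r" (hi)
        : "cc");

    return v;
}
```

With these definitions, clamp(5, 0, 10) gives 5, clamp(-3, 0, 10) gives 0, and clamp(42, 0, 10) gives 10.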


