What Is The Equivalent of _Emit on Linux

What is the equivalent of _emit on Linux?

To emit byte 0x12 (for example), do:

asm __volatile__ (".byte 0x12");

Be aware, though, that you might get surprising results with optimizations enabled, because the compiler may still move or duplicate the code surrounding the statement.
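As a minimal sketch of how such a statement sits inside ordinary C code, the function below emits the single byte 0x90 (the x86 NOP opcode, chosen here only because it is harmless to execute) ahead of a normal computation. The name emit_nop_and_add is made up for illustration, and the preprocessor guard is an assumption so that the raw byte is only emitted when targeting x86 with a GNU-compatible compiler.

```c
/* Hypothetical helper: emits one raw opcode byte, then does ordinary work. */
int emit_nop_and_add(int a, int b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    /* 0x90 is NOP on x86; it changes no registers or flags. */
    __asm__ __volatile__ (".byte 0x90");
#endif
    return a + b;
}
```

Disassembling the object file (for example with objdump -d) should show a nop inside the function, although where it lands relative to the addition is up to the compiler.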

What is the equivalent of _emit in MASM?

You can just use db, as in:

db 10h

This is most often done in a data segment, but unless things have changed in the 64-bit version of MASM, it should work in the code segment as well.

How to use _emit in clang?

In compilers that support GNU extensions, there's no need for a separate emit keyword; just use GNU C inline assembly:

asm(".byte 0x90");   // implicitly   asm volatile

Or use .long to emit a 32-bit constant.

GNU C Basic inline asm is not parsed by the compiler to detect clobbers or anything else, so you could just write asm("nop");

If you want to use instructions that modify registers, you normally need to tell the compiler about it with GNU C Extended inline assembly (output/input/clobbers). See https://stackoverflow.com/tags/inline-assembly/info.
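To illustrate that last point, here is a small sketch that emits the two bytes 0x31 0xc0 (the x86 encoding of xor eax, eax) and uses Extended asm to tell the compiler that EAX holds an output and that the flags are modified. The function name zero_via_emitted_bytes and the target guard are assumptions made for this example, not something from the question.

```c
/* Hypothetical helper: returns 0 by executing hand-emitted machine code. */
unsigned zero_via_emitted_bytes(void)
{
    unsigned result = 0;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        ".byte 0x31, 0xc0"   /* xor %eax, %eax */
        : "=a" (result)      /* output: the result is left in EAX */
        :                    /* no inputs */
        : "cc"               /* XOR modifies the flags */
    );
#endif
    return result;
}
```

Without the "=a" output constraint the compiler would have no idea that EAX was overwritten, which is the kind of silent breakage the operand lists exist to prevent.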

emit-llvm in Linux

Something is horribly broken in Ubuntu's packaging of llvm-gcc. llvm-gcc's version is 4.2.1, but here we're seeing 4.5. Please report an Ubuntu bug.

Equivalent of __declspec( naked ) in gcc/g++

I believe there was no such equivalent in GCC for x86 under Linux when this was written; GCC 8 and later do accept __attribute__((naked)) on x86. In general, though, the compiler emits prologues and epilogues when appropriate, and you should leave that decision to it. It may be quite good at making prologues or epilogues very small, or sometimes even non-existent.

You could code your function in assembly. Or you can put asm statements inside your function.
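As a rough sketch of the "code your function in assembly" option without leaving the C file, a top-level asm statement can define the whole function, so the compiler never generates a prologue or epilogue for it. The function name add_one_asm is invented for this example, and the sketch assumes x86-64, the System V calling convention, and an ELF target; on anything else it falls back to plain C.

```c
int add_one_asm(int x);

#if defined(__x86_64__) && defined(__ELF__) && \
    (defined(__GNUC__) || defined(__clang__))
/* The whole body is written by hand: no compiler-generated prologue or epilogue. */
__asm__ (
    ".pushsection .text\n"
    ".globl add_one_asm\n"
    ".type add_one_asm, @function\n"
    "add_one_asm:\n"
    "    leal 1(%rdi), %eax\n"   /* System V x86-64: first int argument is in EDI */
    "    ret\n"
    ".size add_one_asm, .-add_one_asm\n"
    ".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
int add_one_asm(int x) { return x + 1; }
#endif
```

The price is that the calling convention becomes your responsibility, which is exactly the work the compiler normally does for you.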

You also did not say why you want to do that. What is your goal, and why precisely are you asking?

C preprocessor and _asm _emit directive

You don't need to mess around with 64-bit code in a 32-bit process to access the 64-bit Process Environment Block. You can obtain its address using 32-bit code, and it's located within the 32-bit address space. You'd only need to use 64-bit code if you need to access memory allocated outside the 32-bit address space, and I don't think Windows will ever do that in a 32-bit process.

If you really did need to have a 64-bit function in a 32-bit executable, there's a better way of doing it than using _asm _emit. The first thing to do is to write your entire 64-bit function in plain assembly and assemble it with a normal external assembler. For example, here's a function that reads from a 64-bit pointer in MASM syntax:

_TEXT   SEGMENT
__read64ptr:
    mov rax, [rsp + 8]
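    mov edx, [rax + 4]  ; load the high half first, while RAX still holds the pointer
    mov eax, [rax]      ; this overwrites RAX, so it has to come last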
    retf
_TEXT   ENDS
    END

This simple function takes a 64-bit pointer as an argument on the stack. The 64-bit value located at the address it points to is put into EAX (low half) and EDX (high half). This function is meant to be called with a 32-bit far call instruction.

Note that the return address takes up two 32-bit stack slots, one for the 32-bit offset of the return address and another for the selector. Despite the fact that the RETF instruction is executed in 64-bit mode, it uses a 32-bit operand size by default (unlike the 64-bit near RET instruction) and will work correctly with the 32-bit far return address saved on the stack.

Unfortunately, we can't use this assembly file directly with the tools provided with Visual Studio. The 64-bit version of MASM only creates 64-bit object files, and the linker won't let us mix 32-bit and 64-bit object files. It should be possible to assemble 64-bit code into a 32-bit object using NASM and link it with Microsoft's linker, but the code can also be used indirectly with only Microsoft's tools.

To do that, assemble the file and copy the machine code manually into a C array that lives in the .text section:

#pragma code_seg(push, ".text")
#pragma code_seg(pop)
char const __declspec(allocate(".text")) _read64ptr[] = {
    0x48, 0x8b, 0x44, 0x24, 0x08,   /* mov rax, [rsp + 8] */
    0x8b, 0x50, 0x04,               /* mov edx, [rax + 4] (before RAX is overwritten) */
    0x8b, 0x00,                     /* mov eax, [rax] */
    0xcb                            /* retf */
};

To call it you just need to use code like this:

struct {
    void const *offset;
    unsigned short selector;
} const _read64ptr_ind = { _read64ptr, 0x33 };

unsigned long long
read64ptr(unsigned long long address) {
    unsigned long long value;
    _asm {
        push    DWORD PTR [address + 4]
        push    DWORD PTR [address]
        call    FWORD PTR [_read64ptr_ind]
        add     esp, 8
        mov     DWORD PTR [value], eax
        mov     DWORD PTR [value + 4], edx
    }
    return value;
}

The indirection with _read64ptr_ind is necessary because there's no way to write call 33h:_read64ptr in Microsoft inline assembly. Also note that the 64-bit code selector 0x33 is hard-coded in this example; hopefully it won't change.

Here's an example that uses the above code to read the address of the 64-bit PEB from the 64-bit TEB (even though both are located in the 32-bit address space):

unsigned long long
readgsqword(unsigned long off) {
    unsigned long long value;
    _asm {
        mov edx, [off]
        mov eax, gs:[edx]
        mov edx, gs:[edx + 4]
        mov DWORD PTR [value], eax
        mov DWORD PTR [value + 4], edx
    }
    return value;
}

int
main() {
    printf("32-bit TEB address %08lx\n",
           __readfsdword(offsetof(NT_TIB, Self)));
    printf("32-bit PEB address %08lx\n", __readfsdword(0x30));
    unsigned long long teb64 = readgsqword(offsetof(NT_TIB64, Self));
    printf("64-bit TEB address %016llx\n", teb64);
    printf("64-bit PEB address %016llx\n", readgsqword(0x60));
    printf("64-bit PEB address %016llx\n", read64ptr(teb64 + 0x60));
}

Running it on my computer generates the following output:

32-bit TEB address 7efdd000
32-bit PEB address 7efde000
64-bit TEB address 000000007efdb000
64-bit PEB address 000000007efdf000
64-bit PEB address 000000007efdf000

As you can see, all the structures can be accessed using 32-bit pointers and without any 64-bit code. In particular, the example shows how to obtain a 32-bit pointer to the 64-bit PEB using only 32-bit code.

A final note: there's no guarantee that Windows will handle 64-bit code running in a 32-bit process correctly. If an interrupt happens to occur at any time while executing the 64-bit code, the process may end up crashing.

For each line of input, emit that line alongside output from passing it to a command

Don't use xargs for this task at all. It's inefficient with -n 1 (running a new and separate shell process for each line of input), and opens you up to security vulnerabilities (consider if your 100 were $(rm -rf $HOME) with some escaping to make it act as a single word).

The following can run within a single shell instance, assuming that your input file has one value to each line:

while IFS= read -r line; do
  printf '%s %s\n' "$line" "$(abc "$line")"
done <jobs.txt

Is there a (Linux) g++ equivalent to the /fp:precise and /fp:fast flags used in Visual Studio?

Excess register precision is an issue only with x87 FPU registers, which compilers (with the right enabling switches) tend to avoid anyway. When floating point computations are carried out in SSE registers, the register precision equals the memory one.

In my experience most of the /fp:fast impact (and potential discrepancy) comes from the compiler taking the liberty to perform algebraic transforms. This can be as simple as changing summands order:

( a + b ) + c --> a + ( b + c)

It can also mean distributing multiplications like a*(b+c) at will, and it can extend to some rather complex transforms, all intended to reuse previous calculations.
In infinite precision such transforms are benign, of course, but in finite precision they actually change the result. As a toy example, try the summand-order example in single precision with a=b=2^(-24), c = 1: (a + b) + c gives 1 + 2^(-23), while a + (b + c) gives exactly 1. MS's Eric Fleegal describes it in much more detail.
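The summand-order effect can be reproduced with a few lines of C. This is only a sketch: it assumes IEEE 754 single precision with the default round-to-nearest mode, uses a = b = 2^-24 (half of FLT_EPSILON) and c = 1, and the names sum_left_first, sum_right_first and TINY are made up here. The volatile temporaries are there to force each intermediate result to be rounded to float, so excess register precision doesn't hide the difference.

```c
#include <float.h>

/* 2^-24: half the gap between 1.0f and the next representable float. */
#define TINY (FLT_EPSILON / 2)

/* (a + b) + c, with every intermediate result rounded to float. */
float sum_left_first(float a, float b, float c)
{
    volatile float t = a + b;   /* volatile forces a store, dropping excess precision */
    volatile float r = t + c;
    return r;
}

/* a + (b + c), the reassociated form a fast-math compiler might choose. */
float sum_right_first(float a, float b, float c)
{
    volatile float t = b + c;   /* 1 + 2^-24 is a tie and rounds back to 1.0f */
    volatile float r = a + t;   /* the same tie again, so the small terms vanish */
    return r;
}
```

Calling sum_left_first(TINY, TINY, 1.0f) yields 1.0f + FLT_EPSILON, whereas sum_right_first(TINY, TINY, 1.0f) yields exactly 1.0f: the two small terms survive only when they are added to each other first.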

In this respect, the gcc switch nearest to /fp:precise is -fno-unsafe-math-optimizations. I think it's on by default; perhaps you can try setting it explicitly and see if it makes a difference. Similarly, you can try explicitly turning off all -ffast-math optimizations: -fno-finite-math-only, -fmath-errno, -ftrapping-math, -frounding-math and -fsignaling-nans (the last two options are not enabled by default!).

How to tell GCC to emit a specific debugging symbol named by me?

Though I might misunderstand the question,
how about making a dummy function and calling that in SETSTATE,
then setting a breakpoint in that function?

For example:

void dummy_breakpoint() {}

#define SETSTATE(st) dummy_breakpoint(); ...usual process...

Putting break dummy_breakpoint in .gdbinit might save some labor.
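A slightly fuller sketch of this idea follows. One thing to watch with optimizations enabled is that an empty function can be inlined away, leaving no symbol to break on, so this version marks it noinline and puts an empty asm statement in its body. The variable current_state and the function set_and_get_state are placeholders invented for the example; the "usual process" of the real SETSTATE is unknown and is represented by a plain assignment.

```c
static int current_state;   /* stand-in for whatever SETSTATE really updates */

/* noinline plus an empty asm statement keep the call (and the symbol) alive. */
__attribute__((noinline)) void dummy_breakpoint(void)
{
    __asm__ volatile ("");
}

/* do { ... } while (0) makes the macro behave like a single statement. */
#define SETSTATE(st) do { dummy_breakpoint(); current_state = (st); } while (0)

/* Hypothetical caller, just to show the macro in use. */
int set_and_get_state(int st)
{
    SETSTATE(st);
    return current_state;
}
```

With this in place, break dummy_breakpoint followed by up in gdb lands on the SETSTATE call site.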

EDIT:
How about setting a watch-point in SETSTATE like the following, and
setting watch dummy_variable in .gdbinit?

char dummy_variable; /* global variable */

#define SETSTATE(st) ++ dummy_variable; ...usual process...

However, this might make the program run more slowly if your
environment doesn't provide hardware watchpoints.
