X86 Assembly, Little Endianness Not Being Followed(Or Is It) (Linux)

Is Little-Endianness a byte order or a bit order in the x86 architecture?

1.3.1 Bit and Byte Order
x86 is little-endian.
In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. IA-32 processors are “little endian” machines; this means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions.


The terms endian and endianness refer to the convention used to interpret the bytes making up a data word when those bytes are stored in computer memory. In computing, memory commonly stores binary data by organizing it into 8-bit units called bytes. When reading or writing a data word consisting of multiple such units, the order of the bytes stored in memory determines the interpretation of the data word.

Each byte of data in memory has its own address. Big-endian systems store the most significant byte of a word at the smallest address and the least significant byte at the largest address. Little-endian systems, in contrast, store the least significant byte at the smallest address.

As an example, take the data word "0A 0B 0C 0D" (a set of 4 bytes written out using left-to-right positional, hexadecimal notation) and the four memory locations with addresses a, a+1, a+2 and a+3. In big-endian systems, byte 0A is stored in a, 0B in a+1, 0C in a+2 and 0D in a+3. In little-endian systems, the order is reversed, with 0D stored in memory address a, 0C in a+1, 0B in a+2, and 0A in a+3.
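
The two layouts described above can be reproduced with a short Python sketch. It is only an illustration: the name `layout` is a helper made up for this example, and `a` stands for an arbitrary base address.

```python
word = 0x0A0B0C0D

# Bytes listed from the lowest address (a) to the highest (a+3).
big = word.to_bytes(4, byteorder="big")
little = word.to_bytes(4, byteorder="little")


def layout(data):
    """Format a byte string as 'a+0=.. a+1=..' in address order."""
    return " ".join(f"a+{i}={b:02X}" for i, b in enumerate(data))


print("big-endian:   ", layout(big))
print("little-endian:", layout(little))
```

Each byte keeps its own value in both layouts; only the order of the bytes differs.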


So, as you can see, endianness is always about byte order, not bit order.

asm little-endian register/immediate/memory order

Talking about endianness on registers makes no sense, as registers do not have memory addresses.

From your Wikipedia source: "The terms endian and endianness refer to the convention used to interpret the bytes making up a data word when those bytes are stored in computer memory"

gdb: examine stack and little endian clarification

The char type in C on x86 is one 8-bit byte long, with an alignment of 1 byte, so there is no endianness to take into account. Arrays in C on x86 go from lower addresses to higher addresses. The w in x/xw means "print 4-byte words", which is fine for showing 32-bit integers and (on 32-bit systems) pointers, but not so great for chars or char arrays. Use x/9xb b or x/9cb b to see the first 9 elements of the char array b, displayed as bytes.
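
To see why the word view looks scrambled for a char array, here is a small Python sketch that mimics the two views. The array contents "ABCDEFGH" are a made-up example, and the code only imitates how the bytes get grouped; it does not reproduce gdb's exact output format.

```python
import struct

# Hypothetical contents of a C array: char b[] = "ABCDEFGH";
memory = b"ABCDEFGH"

# Word view (like x/2xw on x86): each 4-byte group is read as one
# little-endian 32-bit value, so the characters appear reversed.
words = [f"0x{w:08x}" for w in struct.unpack("<2I", memory)]

# Byte view (like x/8xb): one byte per address, in address order.
byte_view = [f"0x{b:02x}" for b in memory]

print("words:", words)
print("bytes:", byte_view)
```

The memory is identical in both cases; only the grouping used for display changes.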

Little endian and push in nasm

The stack grows downward:

Before the push:

****
****
**** <--- ESP

After push DWORD 0x0a656c4f:

****
****
****            -+
0x0A             |
0x65             ^
0x6C             |
0x4F <--- ESP   -+- write(2) four bytes from here
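
The byte order in the diagram can be checked with a short Python sketch. It only models the little-endian store that the push performs; the `ESP+n` labels are offsets from the new stack pointer, used here for illustration.

```python
import struct

value = 0x0A656C4F

# A 32-bit push on x86 stores the dword little-endian, and ESP ends up
# pointing at the lowest-addressed (least significant) byte.
stored = struct.pack("<I", value)

for offset, byte in enumerate(stored):
    print(f"ESP+{offset}: 0x{byte:02X} {chr(byte)!r}")
```

Read from ESP upward, the four bytes spell "Ole" followed by a newline, which is what a write of four bytes starting at ESP would output.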

Endianness inside CPU registers

Endianness makes sense only for memory, where each byte has a numeric address. When the MSByte of a value is put at a higher memory address than the LSByte, it's called little endian, and this is the endianness of any x86 processor.

While for integers the distinction between LSByte and MSByte is clear:

    0x12345678
MSB---^^    ^^---LSB

It's not defined for string literals! It's not obvious which part of 'WXYZ' should be considered the LSB or the MSB:

1) The most obvious way,

'WXYZ' ->  0x5758595A

would lead to memory order ZYXW.

2) The not so obvious way, when the memory order should match the order of the literal:

'WXYZ' ->  0x5A595857

The assembler has to choose one of them, and apparently it chooses the second.
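
The two interpretations can be compared in a small Python sketch. The names `option_1` and `option_2` simply mirror the numbering above; the code models the arithmetic, not any particular assembler.

```python
literal = b"WXYZ"

# Option 1: the first character becomes the most significant byte.
option_1 = int.from_bytes(literal, "big")
# Option 2: the first character becomes the least significant byte.
option_2 = int.from_bytes(literal, "little")

print(f"option 1: 0x{option_1:08X}, stored as {option_1.to_bytes(4, 'little')}")
print(f"option 2: 0x{option_2:08X}, stored as {option_2.to_bytes(4, 'little')}")
```

Only the second option leaves the characters in source order once the value is stored little-endian.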

Is x86-64 machine language big endian?

No, Intel CPUs are little endian: http://en.wikipedia.org/wiki/Endianness

Operations and endianness

This answer may seem a little strange to experienced readers, but what the questioner is looking for is called MOVBE. It moves data between memory and a register and swaps the byte order on the way, so a value stored big-endian in memory arrives in the register in the order the questioner expects. It is not available on all x86 processors, but it is still the best solution to this specific problem. So the answer to

I was wondering: is there any operation which is performed directly on little endian values without being reversed first?

is yes: MOVBE does copy the bytes in the required order.
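
As a rough illustration of the effect, the following Python sketch compares an ordinary little-endian 32-bit load with a byte-swapped load of the kind MOVBE performs. It emulates the result only; the variable names are invented for this example and no real instruction is executed.

```python
import struct

# Four bytes in memory, lowest address first: the big-endian
# encoding of 0x12345678.
memory = bytes([0x12, 0x34, 0x56, 0x78])

# What an ordinary 32-bit load yields on x86 (little-endian read).
plain_load = struct.unpack("<I", memory)[0]

# What a byte-swapping load yields (big-endian read of the same bytes).
swapped_load = struct.unpack(">I", memory)[0]

print(f"plain load:   0x{plain_load:08X}")
print(f"swapped load: 0x{swapped_load:08X}")
```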


