Why do I need to use [ ] (square brackets) when moving data from register to memory, but not when other way around?
Using brackets and not using brackets are basically two different things:
A bracket means that the value in the memory at the given address is meant.
An expression without a bracket means that the address (or value) itself is meant.
Examples:
mov ecx, 1234
Means: Write the value 1234 to the register ecx
mov ecx, [1234]
Means: Write the value that is stored in memory at address 1234 to the register ecx
mov [1234], ecx
Means: Write the value stored in ecx to the memory at address 1234
mov 1234, ecx
... makes no sense (in this syntax) because 1234 is a constant number which cannot be changed.
Linux "write" syscall (INT 80h, EAX=4) requires the address of the value to be written, not the value itself!
This is why you do not use brackets at this position!
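As a minimal sketch of that point, assuming 32-bit Linux and NASM (the names msg and len and the build command are illustrative assumptions, not taken from the question), the address of the data goes into ecx without brackets:

```nasm
; Assumed build: nasm -f elf32 hello.asm && ld -m elf_i386 hello.o -o hello
section .data
msg:    db "hello", 10        ; the bytes to print, followed by a newline
len     equ $ - msg           ; length of msg, computed by the assembler

section .text
global _start
_start:
    mov eax, 4                ; syscall number 4 = write
    mov ebx, 1                ; file descriptor 1 = stdout
    mov ecx, msg              ; no brackets: ecx receives the ADDRESS of msg
    mov edx, len              ; number of bytes to write
    int 80h

    mov eax, 1                ; syscall number 1 = exit
    xor ebx, ebx              ; exit status 0
    int 80h
```

Writing mov ecx, [msg] here would instead load the first four bytes of the text into ecx, and the kernel would treat those bytes as a (most likely invalid) address.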
What do square brackets mean in x86 assembly?
Let's make a very simple example and imagine we have a CPU with only two registers, EAX and EBX.
mov ebx, eax
Simply copies the value in eax to the ebx register.
| EAX : 01234567 | ----> | EAX : 01234567 |
| EBX : 00000000 | ====> | EBX : 01234567 |
Now let's add some memory space
ADDRESS VALUE
00000000 6A43210D
00000004 51C9A847
00000008 169B87F1
0000000C C981A517
00000010 9A16D875
00000014 54C9815F
mov [ebx], eax
Moves the value in eax to the memory address contained in ebx.
| EAX : 01234567 | --no--> | EAX : 01234567 |
| EBX : 00000008 | --change--> | EBX : 00000008 |
ADDRESS VALUE
00000000 6A43210D -> 6A43210D
00000004 51C9A847 -> 51C9A847
00000008 169B87F1 =====> 01234567
0000000C C981A517 -> C981A517
00000010 9A16D875 -> 9A16D875
00000014 54C9815F -> 54C9815F
mov ebx, [eax]
Moves the value from the memory address contained in eax to ebx.
| EAX : 00000008 | -> | EAX : 00000008 |
| EBX : 01234567 | ====> | EBX : 169B87F1 |
[No change to memory]
ADDRESS VALUE
00000000 6A43210D
00000004 51C9A847
00000008 169B87F1
0000000C C981A517
00000010 9A16D875
00000014 54C9815F
mov [ebx], [eax]
Finally, you might expect this to move the value from the memory address contained in eax to the memory address contained in ebx.
| EAX : 00000010 | --no--> | EAX : 00000010 |
| EBX : 00000008 | --change--> | EBX : 00000008 |
ADDRESS VALUE
00000000 6A43210D -> 6A43210D
00000004 51C9A847 -> 51C9A847
00000008 169B87F1 =====> 9A16D875
0000000C C981A517 -> C981A517
00000010 *9A16D875 -> 9A16D875
00000014 54C9815F -> 54C9815F
But this combination is disallowed by the x86 architecture: a single mov cannot take two memory operands, so you cannot move directly from memory to memory.
The use of brackets is therefore equivalent to a dereferencing operation.
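The usual workaround is to go through a register. A minimal sketch, assuming 32-bit Linux and NASM (the labels src and dst and the choice of ecx as the scratch register are assumptions for illustration):

```nasm
; Assumed build: nasm -f elf32 copy.asm && ld -m elf_i386 copy.o -o copy
section .data
src:    dd 9A16D875h          ; value to copy
dst:    dd 169B87F1h          ; destination, overwritten below

section .text
global _start
_start:
    mov eax, src              ; eax = address of src
    mov ebx, dst              ; ebx = address of dst
    mov ecx, [eax]            ; step 1: memory -> register
    mov [ebx], ecx            ; step 2: register -> memory

    mov eax, 1                ; syscall number 1 = exit
    xor ebx, ebx              ; exit status 0
    int 80h
```

The two mov instructions together do what mov [ebx], [eax] was meant to do, at the cost of overwriting ecx.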
What does adding two registers in square brackets mean?
This adds the values of the two registers and uses the sum as a memory address, either to retrieve the value stored at that address:
MOV EDX, [EBX+EAX]
or store a value to that location:
MOV [EBX+EDX], ECX
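A typical use is base-plus-offset access into a table. The following is only a sketch, assuming 32-bit Linux and NASM; the label table and its contents are made up for illustration:

```nasm
; Assumed build: nasm -f elf32 index.asm && ld -m elf_i386 index.o -o index
section .data
table:  db 10, 20, 30, 40     ; four one-byte entries

section .text
global _start
_start:
    mov ebx, table            ; ebx = base address of the table
    mov eax, 2                ; eax = offset of the third entry
    movzx edx, byte [ebx+eax] ; load the byte at address ebx+eax, zero-extended
    mov byte [ebx+eax], 99    ; store a new byte at the same address

    mov ebx, edx              ; use the loaded value as the exit status
    mov eax, 1                ; syscall number 1 = exit
    int 80h
```

The byte size is stated explicitly in both accesses because the table entries are one byte each; the registers themselves are never modified by the bracketed expression.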
Basic use of immediates vs. square brackets in YASM/NASM x86 assembly
Indeed, your thought is correct. That is, bl will contain 5 and cl the memory address of buffer (in fact, the label buffer is a memory address itself).
Now, let me explain the differences between the operations you mentioned:
Moving an immediate into a register can be done using mov reg, imm. What may be confusing is that labels, e.g. buffer, are themselves immediate values that hold an address. You cannot move a register into an immediate, since immediate values are constants, like 2 or 0FF1Ah. What you can do is move a register to the location the constant points to, with mov [const], reg. You can also use indirect addressing, like mov reg2, [reg1], provided reg1 points to a valid location; this transfers the value pointed to by reg1 into reg2.
So, mov cl, buffer will move the address of buffer to cl (which generally will not be the full address, since cl is only one byte long), whereas mov cl, [buffer] will get the actual value stored there.
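To make the difference concrete, here is a small sketch, assuming 32-bit Linux and NASM, and assuming that buffer holds the single byte 5 (the original question's data definition is not shown, so that value is an assumption). A full 32-bit register is used for the address so that nothing is truncated:

```nasm
; Assumed build: nasm -f elf32 buf.asm && ld -m elf_i386 buf.o -o buf
section .data
buffer: db 5                  ; assumed contents of buffer

section .text
global _start
_start:
    mov ecx, buffer           ; ecx = address of buffer (an immediate)
    mov bl, [buffer]          ; bl  = byte stored at buffer
    mov al, [ecx]             ; al  = the same byte, read indirectly through ecx

    movzx ebx, bl             ; widen bl to use it as the exit status
    mov eax, 1                ; syscall number 1 = exit
    int 80h
```

Both bracketed forms read the same memory location; the only difference is whether the address comes from a label or from a register.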
Summary
- When you use [a], you refer to the value stored at the location a points to. For example, if a is F5B1, then [a] refers to the contents of address F5B1 in RAM.
- Labels are addresses, i.e. values like F5B1.
- Values stored in registers do not have to be referenced as [reg], because registers do not have addresses. In fact, registers can be thought of as immediate values.
What do the brackets mean in NASM syntax for x86 asm?
[L1] means the memory contents at address L1. After running mov al, [L1] here, the al register will receive the byte at address L1 (the letter 'w').
8086- why can't we move an immediate data into segment register?
Remember that the syntax of assembly language (any assembly) is just a human-readable way to write machine code. The rules of what you can do in machine code depend on how the processor's electronics were designed, not on what the assembler syntax could easily support.
So, just because it looks like you could write mov DS, 5000h, and conceptually there seems to be no reason why you shouldn't be able to, the real question is: "is there a mechanism by which the processor can load a segment register directly from an immediate value?"
In the case of the 8086, I figure the reason is simply that the engineers did not build a path that feeds an immediate operand from the instruction stream to the lines that write the segment registers. A segment register can still be loaded from a general-purpose register or from a memory operand.
Why? I have several theories, but no authoritative knowledge.
The most likely reason is simply one of simplifying the design: it takes extra wiring and gates to do that, and it's an uncommon enough operation (this is the 70's) that it's not worth the real estate in the chip. This is not surprising; the 8086 already went overboard allowing any of the normal registers to be connected to the ALU (arithmetic logic unit) which allows any register to be used as an accumulator. I'm sure that wasn't cheap to do. Most processors at the time only allowed one register (the accumulator) to be used for that purpose.
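In practice, the segment register is loaded through a general-purpose register. A minimal sketch, assuming a 16-bit DOS .COM program assembled with NASM (the file name and the DOS exit call are assumptions for illustration):

```nasm
; Assumed build: nasm -f bin loadds.asm -o loadds.com
bits 16
org 100h

start:
    mov ax, 5000h             ; immediate -> general-purpose register (allowed)
    mov ds, ax                ; general-purpose register -> segment register (allowed)
    ; mov ds, 5000h           ; rejected by the assembler: no such encoding exists

    mov ax, 4C00h             ; DOS function 4Ch = terminate, exit code 0
    int 21h
```

The extra instruction costs a register, which is the price of the missing encoding.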
As far as the brackets, you are correct. Let's say memory position 5000h contains the number 4321h. mov ax, 5000h puts the value 5000h into ax, while mov ax, [5000h] loads 4321h from memory into ax. Essentially, the brackets act like the * pointer dereference operator in C.
Just to highlight the fact that assembly is an idealized abstraction of what machine code can do, you should note that the two variations are not the same instruction with different parameters, but completely different opcodes. They could have used, say, MOV for the first and MVD (MoVe Direct addressed memory) for the second opcode, but they must have decided that the bracket syntax was easier for programmers to remember.