Segfault when storing reg to var in section .DATA
What kind of system/object format are you using? I'm guessing you're using ELF on Linux or Unix, as that would explain your problem:
Section names in ELF are case sensitive. On most ELF-based OSes the special sections .text and .data are understood, but your sections .TEXT and .DATA have no meaning. As a result, they just get stuck into the executable after the other sections and get the same access permissions as the sections they follow. If you're just linking the above code, that will be after the .fini section, so it will be executable and read-only. So when you try to write to the variable, you get a segfault.
Change your code to use .data and .text as section names and it should work.
segmentation fault with .text .data and main (main in .data section)
It seems the label main is in the .data section.
That leads to a segmentation fault on systems that don't allow executing code in the .data section. (Most modern systems map .data with read + write but not exec permission.)
Program code should be in the .text section, which is mapped read + exec.
Surprisingly, on GNU/Linux systems, hand-written asm often results in an executable .data unless you're careful to avoid that, so this is often not the real problem. See: Why data and stack segments are executable? But putting code in .text where it belongs can make some debugging tools work better.
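If you want to check which permissions a given address actually ended up with, one option on Linux is to read /proc/self/maps from inside the process. The following is only a rough C sketch under that assumption (Linux with procfs mounted); mapping_perms, sample_data and sample_code are names made up for this example, not part of the original question.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int sample_data = 42;                 /* initialized global: normally placed in .data */

int sample_code(void)                 /* function body: normally placed in .text */
{
    return sample_data;
}

/* Look up the mapping that contains addr in /proc/self/maps (Linux-specific)
 * and copy its permission string, e.g. "r-xp" or "rw-p", into perms.
 * Returns 0 on success, -1 if the file can't be read or no mapping matches. */
int mapping_perms(uintptr_t addr, char perms[5])
{
    assert(perms != NULL);

    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[512];
    int result = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        uintptr_t start, end;
        char p[5];
        /* Each line starts with "start-end perms", with addresses in hex. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &start, &end, p) == 3
            && addr >= start && addr < end) {
            memcpy(perms, p, sizeof p);
            result = 0;
            break;
        }
    }
    fclose(maps);
    return result;
}
```

Calling mapping_perms((uintptr_t)&sample_data, perms) and mapping_perms((uintptr_t)&sample_code, perms) lets you compare the two mappings. For a binary built from hand-written asm, the same information is visible from outside the process in /proc/<pid>/maps.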
Also, you need to ret from main or call exit (or make an _exit system call) so execution doesn't fall off the end of main into whatever bytes come next. See: What happens if there is no exit system call in an assembly program?
Why would I have a segmentation fault when declaring a variable in a C program?
It's probably just that your variable is too big for the available stack size on your platform. The code is technically fine for C (so, not a compile-time error), but in practice the operating system does not reserve enough stack space to make this possible.
After all, your langues_parles field takes, on its own, 250 * 500 bytes of space; i.e. 125kB. You've got three fields like that, and then some other fields, so each instance of the structure takes around 380kB.
Now, you haven't shown the value of NB_MAX_PARTICIPANTS, but my guess is that 380kB * NB_MAX_PARTICIPANTS is just too big. For example, on Windows, the default stack size is only 1MB, so if NB_MAX_PARTICIPANTS is bigger than 2, then the variable is too big (and that's assuming there is nothing else on the stack).
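The arithmetic above can be checked with a small C sketch. Only langues_parles and its 250 x 500 size come from the question; the other field names, the 64-byte nom field, and the helper max_participants_on_stack are stand-ins invented for illustration, so treat the exact numbers as approximate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: three 250 x 500 char arrays plus a small extra field. */
typedef struct {
    char langues_parles[250][500];   /* 125000 bytes, as in the question */
    char champ_2[250][500];          /* stand-in for the second large field */
    char champ_3[250][500];          /* stand-in for the third large field */
    char nom[64];                    /* stand-in for "some other fields" */
} Participant;

/* The three large arrays alone already account for 375000 bytes. */
static_assert(sizeof(Participant) >= 3 * 250 * 500,
              "Participant is at least 375000 bytes");

/* Number of whole Participant objects that fit in stack_bytes,
 * optimistically assuming nothing else lives on the stack. */
size_t max_participants_on_stack(size_t stack_bytes)
{
    return stack_bytes / sizeof(Participant);
}
```

With a 1 MB stack, max_participants_on_stack(1024 * 1024) comes out as 2 for this layout, which matches the estimate above.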
You will have to allocate your structure on the heap using malloc() or a similar function:
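Ens_participants* les_participants = malloc(sizeof(Ens_participants));
if (les_participants == NULL) { perror("malloc"); exit(EXIT_FAILURE); } /* needs <stdio.h> and <stdlib.h> */
/* ... */
free(les_participants);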
Segmentation fault when using DB (define byte) inside a function
If you tell the assembler to assemble arbitrary bytes somewhere, it will. db is a pseudo-instruction that emits bytes, so mov eax, 60 and db 0xb8, 0x3c, 0, 0, 0 are exactly equivalent as far as NASM is concerned. Either one will emit those 5 bytes into the output at the current position.
If you don't want your data decoded as (part of) instructions, don't put it where it will be reached by execution.
Since you're using NASM (see footnote 1), it optimizes mov rax,60 into mov eax,60, so the instruction doesn't have the REX prefix you'd expect from the source.
Your manually-encoded REX prefix for mov changes it into a mov to R8D instead of EAX:
41 b8 3c 00 00 00    mov r8d,0x3c
(I checked with objdump -drwC -Mintel instead of looking up which bit is which in the REX prefix. I only remember that REX.W is 0x48. But 0x41 is a REX.B prefix in x86-64.)
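To see why the stray 0x41 byte changes the destination register, here is a toy decoder in C. It is only a sketch of the encoding rule, not how NASM or objdump work internally, and the names MovImm and decode_mov_imm32 are made up for this example. It assumes 64-bit mode (where bytes 0x40..0x4F are REX prefixes) and handles only the B8+rd form with a 32-bit immediate, rejecting REX.W.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of "mov r32, imm32" (opcode B8+rd), optionally REX-prefixed. */
typedef struct {
    int reg;        /* register number 0..15: 0 = eax, 8 = r8d */
    uint32_t imm;   /* 32-bit immediate operand */
    size_t length;  /* total instruction length in bytes */
} MovImm;

/* Returns 0 on success, -1 if the bytes are not a B8+rd mov with imm32. */
int decode_mov_imm32(const uint8_t *bytes, size_t len, MovImm *out)
{
    assert(bytes != NULL && out != NULL);

    size_t i = 0;
    int rex_b = 0;

    /* A REX prefix has the bit pattern 0100WRXB. */
    if (i < len && (bytes[i] & 0xF0) == 0x40) {
        if (bytes[i] & 0x08)
            return -1;          /* REX.W selects a 64-bit immediate: not handled here */
        rex_b = bytes[i] & 0x01;
        i++;
    }

    /* Opcode B8..BF: the low 3 bits select the register. */
    if (i >= len || (bytes[i] & 0xF8) != 0xB8)
        return -1;
    if (len - i < 5)
        return -1;              /* need the opcode plus 4 immediate bytes */

    /* REX.B supplies the 4th bit of the register number. */
    out->reg = (bytes[i] & 0x07) | (rex_b << 3);
    out->imm = (uint32_t)bytes[i + 1]
             | ((uint32_t)bytes[i + 2] << 8)
             | ((uint32_t)bytes[i + 3] << 16)
             | ((uint32_t)bytes[i + 4] << 24);
    out->length = i + 5;
    return 0;
}
```

Feeding it b8 3c 00 00 00 gives register 0 (EAX) with immediate 60, while 41 b8 3c 00 00 00 gives register 8 (R8D) with the same immediate, so EAX is never written.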
So instead of making a sys_exit system call, your code runs syscall with EAX=0, which is __NR_read. (The Linux kernel zeros all the registers other than RSP before process startup, and in a statically-linked executable, _start is the true entry point with no dynamic linker code running first. So RAX is still zero.)
$ strace ./rex
execve("./rex", ["./rex"], 0x7fffbbadad60 /* 54 vars */) = 0
read(0, NULL, 0) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
And then execution falls through into whatever is after syscall, which in this case is 00 00 bytes that decode as add [rax], al. RAX is 0 at that point (the return value of read), so the instruction accesses address 0 and the process segfaults. You would have seen this if you'd run your code inside GDB.
Footnote 1: If you'd used YASM, which doesn't optimize to 32-bit operand size:
Intel's manuals say that it's illegal to have 2 REX prefixes on one instruction. I expected an illegal-instruction fault (#UD machine exception -> kernel delivers SIGILL), but my Skylake CPU ignores the first REX prefix and decodes it as mov rax, sign_extended_imm32.
When single-stepping, it's treated as one long instruction, so I guess Skylake chooses to handle it like other cases of multiple prefixes, where only the last one of a type has an effect. (But remember this is not future-proof: other x86 CPUs could handle it differently.)
Related / same bug in other situations:
- Assembly (x86): <label> db 'string',0 does not get executed unless there's a jump instruction in a BIOS MBR boot sector
- Unknown opcode skipped: 66, not 8086 instruction - not supported yet in EMU8086