Code Sequences for TLS on ARM

What are the real ELF TLS ABI requirements for each CPU arch?

The best I can gather so far is:

For either TLS variant, __tls_get_addr or other arch-specific functions must exist and have the correct semantics for looking up any TLS object, and the relative offset between any two TLS segments must be a runtime constant (same offset for each thread).

For TLS variant II (i386, etc.), the "thread pointer register" (which may not actually be a register, but perhaps some mechanism like %gs:0 or even a trap into kernelspace; for simplicity though let's just call it a register) points just past the end of the TLS segment for the main executable, where "just past the end" includes rounding up to the next multiple of the TLS segment's alignment.

For TLS variant I, the "thread pointer register" points to some fixed offset from the beginning of the TLS segment for the main executable. This offset varies by arch. (It has been chosen on some ugly RISC archs to maximize the amount of TLS accessible via signed 16-bit offsets, which strikes me as extremely useless since the compiler has no way of knowing whether the relocated offset will fit in 16 bits and thus must always generate the slower, larger 32-bit-offset code using load-upper/add instructions).
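The two thread-pointer conventions above can be sketched as address arithmetic. This is only an illustration under stated assumptions: the function names are made up, and tp_offset stands for the arch-specific constant (thread pointer minus segment start), whose real value has to be looked up for each arch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Variant II: the thread pointer sits just past the end of the main
 * executable's TLS segment, after rounding the size up to the segment's
 * alignment, so the segment starts below the thread pointer. */
static uintptr_t variant2_tls_start(uintptr_t tp, size_t seg_size, size_t seg_align)
{
    return tp - round_up(seg_size, seg_align);
}

/* Variant I: the thread pointer is at a fixed distance from the start of
 * the segment. tp_offset is a hypothetical parameter here, defined as
 * (thread pointer - segment start); it may be negative or positive. */
static uintptr_t variant1_tls_start(uintptr_t tp, ptrdiff_t tp_offset)
{
    return tp - (uintptr_t)tp_offset;
}
```

For example, with made-up numbers, a 20-byte segment aligned to 8 under variant II starts 24 bytes below the thread pointer: variant2_tls_start(0x1000, 20, 8) gives 0x1000 - 24.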

As far as I can tell, nothing about TCBs, DTVs, etc. is part of the ABI, in the sense that applications are not permitted to access these structures, nor is the location of any TLS segment other than the main executable's part of the ABI. In both variants I and II, it makes sense to store implementation-internal information for the thread at a fixed offset from the "thread pointer register", in whichever way safely avoids overlapping the TLS segment.

How to choose an AES encryption mode (CBC ECB CTR OCB CFB)?

ECB should not be used if encrypting more than one block of data with the same key.
CBC, OFB and CFB are similar; however, OFB and CFB are better because they need only the cipher's encryption operation and not decryption, which can save code space.
CTR is used if you want good parallelization (i.e. speed), instead of CBC/OFB/CFB.
XTS mode is the most common if you are encrypting randomly accessible data (like a hard disk or RAM).
OCB is by far the best mode, as it allows encryption and authentication in a single pass. However, there are patents on it in the USA.

The only thing you really have to know is that ECB is not to be used unless you are only encrypting 1 block. XTS should be used if you are encrypting randomly accessed data and not a stream.
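To see why ECB is a problem beyond one block, here is a minimal sketch that contrasts it with CTR. It assumes a toy 64-bit mixing function in place of AES (it is NOT AES and NOT secure), and it forms the counter block as nonce + i for brevity where a real CTR mode would concatenate nonce and counter. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit block "cipher": NOT AES and NOT secure. It is only an
 * invertible keyed mixing function standing in for a real block cipher. */
static uint64_t toy_encrypt_block(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    x *= 0x9E3779B97F4A7C15ULL;   /* odd multiplier: invertible mod 2^64 */
    x ^= x >> 29;                 /* xor-shift: also invertible */
    return x;
}

/* ECB: every block is encrypted independently with the same key, so
 * equal plaintext blocks give equal ciphertext blocks. */
static void ecb_encrypt(uint64_t key, const uint64_t *in, uint64_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = toy_encrypt_block(key, in[i]);
}

/* CTR: encrypt a per-block counter value and XOR the result into the data.
 * The same function encrypts and decrypts, only the cipher's encryption
 * direction is needed, and each block can be computed independently. */
static void ctr_crypt(uint64_t key, uint64_t nonce, const uint64_t *in, uint64_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] ^ toy_encrypt_block(key, nonce + i);
}
```

Feeding two identical plaintext blocks through ecb_encrypt yields two identical ciphertext blocks, which is the leak. ctr_crypt with the same key yields two different blocks, and calling it again on its own output recovers the plaintext.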

You should ALWAYS use unique IVs every time you encrypt, and they should be random. If you cannot guarantee they are random, use OCB, as it only requires a nonce, not an IV, and there is a distinct difference: a nonce does not lose security if people can guess the next one, whereas a predictable IV can.

Segfault occurs due to one line of code in C file and entire program does not run

EDIT: Read on for the gory details, but the quick answer is that your FTP client is corrupting your program. This is an intentional feature of FTP, which can be turned off by typing "binary" at the FTP prompt before "get whatever" or "put whatever". If you're using a graphical FTP client, it should have a checkbox somewhere with the same effect. Or switch to scp, which does not have this inconvenient feature.

First off, there is no difference in the generated assembly code
between (one of the) working object files and the broken object file.

$ objdump -dr dc-good.o > dc-good.s
$ objdump -dr dc-bad.o > dc-bad.s
$ diff -u dc-good.s dc-bad.s
--- dc-good.s   2012-01-21 08:20:05.318518596 -0800
+++ dc-bad.s    2012-01-21 08:20:10.954566852 -0800
@@ -1,5 +1,5 @@

-dc-good.o:     file format elf32-littlearm
+dc-bad.o:     file format elf32-littlearm

 Disassembly of section .text:

In fact, there are only two places that differ between the good and
bad object files. (You misunderstood what I was asking for with
"test\r\n" versus "testX\n": I wanted the two strings to be the
same length, so that everything would have the same offset in the
object files. Fortunately, your compiler padded the shorter string to
the same length as the longer string, so everything has the same
offset anyway.)

$ hd dc-good.o > dc-good.x
$ hd dc-bad.o > dc-bad.x
$ diff -u1 dc-good.x dc-bad.x
--- dc-good.x   2012-01-21 08:17:28.713174977 -0800
+++ dc-bad.x    2012-01-21 08:17:39.129264489 -0800
@@ -154,3 +154,3 @@
 00000990  53 74 61 72 74 69 6e 67  20 70 72 6f 67 72 61 6d  |Starting program|
-000009a0  00 00 00 00 74 65 73 74  58 0a 00 00 2f 64 65 76  |....testX.../dev|
+000009a0  00 00 00 00 74 65 73 74  58 0d 0a 00 2f 64 65 76  |....testX.../dev|
 000009b0  2f 74 74 79 53 30 00 00  66 64 20 3d 20 25 75 0a  |/ttyS0..fd = %u.|
@@ -223,3 +223,3 @@
 00000de0  61 72 69 65 73 2f 64 61  74 61 63 6f 6c 6c 65 63  |aries/datacollec|
-00000df0  74 6f 72 2d 62 61 64 2d  62 69 6e 61 72 79 2d 32  |tor-bad-binary-2|
+00000df0  74 6f 72 2d 62 61 64 2d  62 69 6e 61 72 79 2d 31  |tor-bad-binary-1|
 00000e00  00 46 49 4c 45 00 5f 5f  73 74 61 74 65 00 5f 5f  |.FILE.__state.__|

The first difference is the difference that should be there:
74 65 73 74 58 0a 00 00 is the correct encoding of "testX\n" (with
one byte of padding after the terminating NUL), 74 65 73 74 58 0d 0a 00
is the correct encoding of "testX\r\n". The other difference appears to be debugging
information: the name of the directory in which you compiled the
programs. This is harmless.
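The byte counts in the hex dump can be checked with sizeof. This is a small sketch, assuming string literals are padded to a 4-byte boundary, which is read off the dump and not a documented property of the compiler. The helper name is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Space a string literal occupies once padded up to a multiple of align.
 * The 4-byte alignment used below is an assumption read off the hex dump. */
static size_t padded_size(size_t size, size_t align)
{
    assert(align != 0);
    return (size + align - 1) / align * align;
}

/* sizeof on a string literal counts the terminating NUL byte. */
static const size_t good_len = sizeof "testX\n";    /* 74 65 73 74 58 0a 00    */
static const size_t bad_len  = sizeof "testX\r\n";  /* 74 65 73 74 58 0d 0a 00 */
```

padded_size(good_len, 4) and padded_size(bad_len, 4) come out equal, which is why everything after the string sits at the same offset in both object files.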

The object files are as they should be, so at this point we can rule
out a bug in the compiler or the assembler. Now let's look at the
executables.

$ hd dc-good > dc-good.xe
$ hd dc-bad > dc-bad.xe
$ diff -u1 dc-good.xe dc-bad.xe
--- dc-good.xe  2012-01-21 08:31:33.456437417 -0800
+++ dc-bad.xe   2012-01-21 08:31:38.388480238 -0800
@@ -120,3 +120,3 @@
 00000770  f0 af 1b e9 53 74 61 72  74 69 6e 67 20 70 72 6f  |....Starting pro|
-00000780  67 72 61 6d 00 00 00 00  74 65 73 74 58 0a 00 00  |gram....testX...|
+00000780  67 72 61 6d 00 00 00 00  74 65 73 74 58 0d 0a 00  |gram....testX...|
 00000790  2f 64 65 76 2f 74 74 79  53 30 00 00 66 64 20 3d  |/dev/ttyS0..fd =|
@@ -373,3 +373,3 @@
 00001750  63 6f 6c 6c 65 63 74 6f  72 2d 62 61 64 2d 62 69  |collector-bad-bi|
-00001760  6e 61 72 79 2d 32 00 46  49 4c 45 00 5f 5f 73 74  |nary-2.FILE.__st|
+00001760  6e 61 72 79 2d 31 00 46  49 4c 45 00 5f 5f 73 74  |nary-1.FILE.__st|
 00001770  61 74 65 00 5f 5f 67 63  73 00 73 74 64 6f 75 74  |ate.__gcs.stdout|

Same two differences, different offsets within the executable. This
is also as it should be. We can rule out a bug in the linker as well
(if it was screwing up the address of the string, it would have to be
screwing it up the same way in both executables and they both ought to
crash).

At this point I think we are looking at a bug in your C library or
kernel. To pin it down further, I would like you to try this test
script. Run it as sh testz.sh on the ARM board, and send us the
complete output.

#! /bin/sh

set -e
cat >testz.c <<\EOF
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define W(f, s) write(f, s, sizeof s - 1)

int
main(int ac, char **av)
{
  int f;
  if (ac != 2) return 2;
  f = open(av[1], O_RDWR|O_NOCTTY|O_NONBLOCK);
  if (f == -1) return 1;

  W(f, "1\n");
  W(f, "12\n");
  W(f, "123\n");
  W(f, "1234\n");
  W(f, "12345\n");

  W(f, "1\r\n");
  W(f, "12\r\n");
  W(f, "123\r\n");
  W(f, "1234\r\n");

  close(f);
  return 0;
}
EOF

arm-linux-gcc -Wall -g testz.c -o testz
set +e
strace ./testz /dev/null
echo ----
strace ./testz /dev/ttyS0
echo ----
exit 0

I've looked at the damaged binary you provided and now I know what's wrong.

$ ls -l testz*
-rwxr-x--- 1 zack zack 7528 Dec 31  1979 testz-bad
-rwxr-x--- 1 zack zack 7532 Jan 21 16:35 testz-good

Ignore the odd datestamp; see how the -bad version is four bytes smaller than the -good version? There were exactly four \r characters in the source code. Let's have a look at the differences in the hex dumps. I've pulled out the interesting bit of the diff and shuffled it around a little to make it easier to see what's going on.

 00000620  00 00 00 00 31 32 33 34  0a 00 00 00 31 32 33 34  |....1234....1234|

-00000630  35 0a 00 00 31 0d 0a 00  31 32 0d 0a 00 00 00 00  |5...1...12......|
+00000630  35 0a 00 00 31 0a 00 31  32 0a 00 00 00 00 31 32  |5...1..12.....12|

-00000640  31 32 33 0d 0a 00 00 00  31 32 33 34 0d 0a 00 00  |123.....1234....|
+00000640  33 0a 00 00 00 31 32 33  34 0a 00 00 00 00 00 00  |3....1234.......|

-00000650  00 00 00 00 68 84 00 00  1c 84 00 00 00 00 00 00  |....h...........|
+00000650  68 84 00 00 1c 84 00 00  00 00 00 00 01 00 00 00  |h...............|

The file transfer is replacing 0d 0a (that is, \r\n) sequences with 0a (just \n). This causes everything after this point in the file to be displaced four bytes from where it's supposed to be. The code is before this point, and so are all the ELF headers that the kernel looks at, which is why you don't get

execve("./testz-bad", ["./testz-bad", "/dev/null"], [/* 36 vars */]) = -1 ENOEXEC (Exec format error)

from the test script; instead, you get a segfault inside the dynamic loader, because the DYNAMIC segment (which tells the dynamic loader what to do) is after the displacement starts.

$ readelf -d testz-bad 2> /dev/null

Dynamic section at offset 0x660 contains 13 entries:
  Tag        Type                         Name/Value
 0x00000035 (<unknown>: 35)              0xc
 0x0000832c (<unknown>: 832c)            0xd
 0x00008604 (<unknown>: 8604)            0x19
 0x00010654 (<unknown>: 10654)           0x1b
 0x00000004 (HASH)                       0x1a
 0x00010658 (<unknown>: 10658)           0x1c
 0x00000004 (HASH)                       0x4
 0x00008108 (<unknown>: 8108)            0x5
 0x0000825c (<unknown>: 825c)            0x6
 0x0000815c (<unknown>: 815c)            0xa
 0x00000098 (<unknown>: 98)              0xb
 0x00000010 (SYMBOLIC)                   0x15
 0x00000000 (NULL)                       0x3

Contrast:

$ readelf -d testz-good

Dynamic section at offset 0x660 contains 18 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.0]
 0x0000000c (INIT)                       0x832c
 0x0000000d (FINI)                       0x8604
 0x00000019 (INIT_ARRAY)                 0x10654
 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
 0x0000001a (FINI_ARRAY)                 0x10658
 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
 0x00000004 (HASH)                       0x8108
 0x00000005 (STRTAB)                     0x825c
 0x00000006 (SYMTAB)                     0x815c
 0x0000000a (STRSZ)                      152 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000015 (DEBUG)                      0x0
 0x00000003 (PLTGOT)                     0x10718
 0x00000002 (PLTRELSZ)                   56 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x82f4
 0x00000000 (NULL)                       0x0

The debugging information is also after the displacement, which is why gdb didn't like the program.

So why this very particular corruption? It's not a bug in anything; it's an intentional feature of your FTP client, which defaults to transferring files in "text mode" (ASCII mode, in FTP's own terms), which means (among other things) that it converts DOS-style line endings (\r\n) to Unix-style (\n). That would be what you wanted if this were 1991 and you were transferring text files off your IBM PC to your institutional file server. It is basically never what is wanted nowadays, even if you are moving text files around. Fortunately, you can turn it off: just type "binary" at the FTP prompt before the file transfer commands. Unfortunately, as far as I know there is no way to make that stick; you have to do that every time. I recommend switching to scp, which always transfers files verbatim and is also easier to operate from build automation.
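The rewrite that the text-mode transfer performs can be reproduced with a short sketch. The function below is a hypothetical illustration of the conversion, not code taken from any FTP client: it drops every 0d that is immediately followed by 0a.

```c
#include <assert.h>
#include <stddef.h>

/* Copy in[0..n) to out, dropping every CR (0x0d) that is immediately
 * followed by LF (0x0a). Returns the number of bytes written.
 * out must have room for n bytes. */
static size_t crlf_to_lf(const unsigned char *in, size_t n, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0x0d && i + 1 < n && in[i + 1] == 0x0a)
            continue;               /* skip the CR, keep the LF */
        out[w++] = in[i];
    }
    return w;
}
```

Run over the 28 bytes at offsets 0x634 through 0x64f of testz-good, this writes 24 bytes. Those four missing bytes are the size difference seen in the ls output, and everything after that point in the file moves down by the same amount.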
