How to Call memcpy() and memmove() with "Number of Bytes" Set to Zero

Can I call memcpy() and memmove() with number of bytes set to zero?

From the C99 standard (7.21.1/2):

Where an argument declared as size_t n specifies the length of the array for a
function, n can have the value zero on a call to that function. Unless explicitly stated
otherwise in the description of a particular function in this subclause, pointer arguments
on such a call shall still have valid values, as described in 7.1.4. On such a call, a
function that locates a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies characters copies zero
characters.

So yes, you can pass zero, and no separate check for a zero length is necessary. The pointer arguments must still be valid on such a call, though.

Is this allowed: memcpy(dest, src, 0)

As one might expect from a sane interface, zero is a valid size, and results in nothing happening. It's specifically allowed by the specification of the various string handling functions (including memcpy) in C99 7.21.1/2:

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. [...] On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.
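As a minimal sketch of what this means in practice, the helper below calls memcpy unconditionally, including when n is zero. The name append_bytes and its parameters are hypothetical and chosen only for illustration. The assert reflects the requirement in the quoted paragraph that pointer arguments must still be valid on a zero-length call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies n bytes from src to dst + used and
   returns the new number of bytes in use. The caller must make sure
   that dst has room for used + n bytes. */
size_t append_bytes(unsigned char *dst, size_t used,
                    const unsigned char *src, size_t n)
{
    /* Valid pointers are required even when n == 0. */
    assert(dst != NULL && src != NULL);

    /* No "if (n != 0)" guard: a zero-length copy copies nothing. */
    memcpy(dst + used, src, n);
    return used + n;
}
```

Calling append_bytes(buf, used, src, 0) leaves buf untouched and returns used unchanged.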

Is memmove copying 0 bytes but referencing out of bounds safe

The memmove function will copy n bytes. If n is zero, it will do nothing.

The only possible issue is with this, where index is already at the maximum value for array elements:

&arr[index + 1]

However, you are permitted to refer to array elements (in terms of having a pointer point to them) within the array or the hypothetical element just beyond the end of the array.

You may not dereference the latter but you're not doing that here. In other words, while arr[index + 1] on its own would attempt a dereference and therefore be invalid, evaluating the address of it is fine.

This is covered, albeit tangentially, in C++20 [expr.add]:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined.

Note the if 0 ≤ i + j ≤ n clause, particularly the final ≤ n. For an array int x[10], the expression &(x[10]) is valid.

It's also covered in [basic.compound]:

A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively.
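To make the situation concrete, here is a sketch of the kind of code that usually produces this question: removing one element from an array by shifting the tail down. The function remove_at and its parameters are hypothetical, not taken from the original question. When index is the last valid index, the source pointer is the one-past-the-end pointer and the byte count is zero, so the pointer is formed but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes arr[index] from an array that currently
   holds count ints and returns the new element count. */
size_t remove_at(int *arr, size_t count, size_t index)
{
    assert(arr != NULL && index < count);

    /* For index == count - 1, arr + index + 1 points one past the end
       and the length is 0, so memmove does nothing. */
    memmove(arr + index, arr + index + 1,
            (count - index - 1) * sizeof arr[0]);
    return count - 1;
}
```

The assert matters here: if index were equal to count, arr + index + 1 would point two past the end, and merely computing that pointer is undefined.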

How to set number of bytes with memcpy?

If you take a look at the memcpy man page, the third argument is the number of bytes copied from src to dst. So it doesn't matter whether you use the size of src or the size of dst, as long as both the source and destination buffers are at least as large as the number of bytes copied. Otherwise the call reads or writes past the end of a buffer, which is a buffer overflow.
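One way to satisfy both constraints at once is to copy the smaller of the two sizes. The following is only a sketch; copy_bounded and its parameter names are hypothetical, and whether silently truncating is acceptable depends on the application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size bytes from src to dst
   and returns the number of bytes actually copied. */
size_t copy_bounded(void *dst, size_t dst_size,
                    const void *src, size_t src_size)
{
    assert(dst != NULL && src != NULL);

    /* Never read more than src holds or write more than dst holds. */
    size_t n = src_size < dst_size ? src_size : dst_size;
    memcpy(dst, src, n);
    return n;
}
```

With arrays (not pointers), the sizes can be passed as sizeof dst_array and sizeof src_array; the return value tells the caller whether the data was truncated.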

Memcpy: wrong number of bytes

memcpy(backUp, mainMat, columns * lines * sizeof(int));

You are calling memcpy as though mainMat points at or into an array of arrays. But this is not possible, since mainMat is a pointer to pointer. It probably points into an array of pointers into arrays, instead. Since an array is NOT a pointer, these two types are not compatible.

In fact, your memcpy isn't necessarily copying the int objects at all: it is copying the bytes of some int* pointers into and out of the allocated memory.

If you need to copy all the int objects out of and into the storage associated with mainMat, you will need to loop over the elements of mainMat and copy out of and into each pointer it contains. This will copy one row or one column at a time (depending on your matrix orientation convention).
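A sketch of that loop follows. It assumes, since the original question's allocation code is not shown, that mainMat and backUp each point to lines row pointers and that every row holds columns ints and is already allocated. The function name copy_matrix is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the ints of every row of src into the
   matching, already allocated row of dst.
   Assumption: both matrices have `lines` rows of `columns` ints. */
void copy_matrix(int **dst, int **src, size_t lines, size_t columns)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < lines; i++) {
        /* Copy the int objects of row i, not the row pointer itself. */
        memcpy(dst[i], src[i], columns * sizeof dst[i][0]);
    }
}
```

A call such as copy_matrix(backUp, mainMat, lines, columns) leaves the row pointers in backUp unchanged and overwrites only the ints they point to.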

Does memcpy copy bytes in reverse order?

No, memcpy did not reverse the bytes as it copied them. That would be a strange and wrong thing for memcpy to do.

The reason the bytes seem to be in the "wrong" order in the program you wrote is that that's the order they're actually in! There's probably a canonical answer on this somewhere, but here's what you need to understand about byte order, or "endianness".

When you declare a string, it's laid out in memory just about exactly as you expect. Suppose I write this little code fragment:

#include <stdio.h>

char string[] = "Hello";
printf("address of string: %p\n", (void *)&string);
printf("address of 1st char: %p\n", (void *)&string[0]);
printf("address of 5th char: %p\n", (void *)&string[4]);

If I compile and run it, I get something like this:

address of string:   0xe90a49c2
address of 1st char: 0xe90a49c2
address of 5th char: 0xe90a49c6

This tells me that the bytes of the string are laid out in memory like this:

0xe90a49c2 H
0xe90a49c3 e
0xe90a49c4 l
0xe90a49c5 l
0xe90a49c6 o
0xe90a49c7 \0

Here I've shown the string vertically, but if we laid it out horizontally, with addresses increasing from left to right, we would see the characters of the string "Hello" laid out from left to right also, just as we would expect.

But that's for strings, which are arrays of char. But integers of various sizes are not really built out of characters, and it turns out that the individual bytes of an integer are not necessarily laid out in memory in "left-to-right" order as we might expect. In fact, on the vast majority of machines today, the bytes within an integer are laid out in the opposite order. Let's take a closer look at how that works.

Suppose I write this code:

int16_t i2 = 0x1234;
printf("address of short: %p\n", (void *)&i2);
unsigned char *p = (unsigned char *)&i2;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);

This initializes a 16-bit (or "short") integer to the hex value 0x1234, and then uses a pointer to print the two bytes of the integer in "left-to-right" order, that is, with the lower-addressed byte first, followed by the higher-addressed byte.

On my machine, the result is something like:

address of short:    0xe68c99c8
0xe68c99c8: 34
0xe68c99c9: 12

You can clearly see that the byte that's stored at the "front" of the two-byte region in memory is 34, followed by 12. The least-significant byte is stored first. This is referred to as "little endian" byte order, because the "little end" of the integer — its least-significant byte, or LSB — comes first.

Larger integers work the same way:

int32_t i4 = 0x5678abcd;
printf("address of long: %p\n", (void *)&i4);
p = (unsigned char *)&i4;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);

This prints:

address of long:     0xe68c99bc
0xe68c99bc: cd
0xe68c99bd: ab
0xe68c99be: 78
0xe68c99bf: 56

There are machines that lay the bytes out in the other order, with the most-significant byte (MSB) first. Those are called "big endian" machines, but for reasons I won't go into they're not as popular.
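If you are curious which kind of machine you are on, a small runtime probe can tell you. This is only a sketch: the function name host_is_little_endian is made up for illustration, and it assumes 8-bit bytes and a host that is either plainly little endian or plainly big endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical probe: returns 1 on a little-endian host, 0 on a
   big-endian one. Assumes 8-bit bytes. */
int host_is_little_endian(void)
{
    const uint16_t probe = 0x1234;
    unsigned char bytes[sizeof probe];

    /* Look at the object representation of the integer. */
    memcpy(bytes, &probe, sizeof probe);
    assert(bytes[0] == 0x34 || bytes[0] == 0x12);

    /* Little endian stores the LSB (0x34) at the lowest address. */
    return bytes[0] == 0x34;
}
```

Code that builds integers with shifts rarely needs such a probe; it is mostly useful for diagnostics.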

How do you construct an integer value out of individual bytes if you don't know your machine's byte order? The best way is to do it "mathematically", based on the properties of the numbers. For example, let's go back to your original array of bytes:

uint8_t message[2] = {0xfd, 0x58};

Now, you know, because you wrote it, that 0xfd is supposed to be the MSB and 0x58 is supposed to be the LSB. So one good way of combining them together into an integer is like this:

int16_t roll = message[0] << 8; /* MSB */
roll |= message[1]; /* LSB */

The nice thing about this code is that it works correctly on machines of either endianness. I called this technique "mathematical" because it's equivalent to doing it this other way:

int16_t roll = message[0] * 256; /* MSB */
roll += message[1]; /* LSB */

And, in fact, this suggestion of mine involving roll = message[0] << 8 is very close to something you already tried, but had commented out in the code you posted. The difference is that you don't want to think about it in terms of two bytes next to each other in memory; you want to think about it in terms of the most- and least-significant byte. When you say << 8, you're obviously thinking about the most-significant byte, so that should be message[0].

Why memcpy/memmove reverse data when copying int to bytes buffer?

It's because the processor architecture you use is little endian. Multibyte numbers (anything bigger than a uint8_t) are stored with the least significant byte at the lowest address.

What you do about it really depends on what the buffer is for. If you are only going to be using the buffer internally, forget about byte swapping: you'd have to do it in both directions, and it's a waste of time.

If it is for some external entity, e.g. a file or a network protocol, the specification of the file or network protocol will say what the endianness is. For example, network byte order for all the Internet protocols is effectively big endian. The networking library provides a family of functions to convert values for use in sending and receiving Internet protocol messages. See, for instance:

https://linux.die.net/man/3/htonl

If you want to roll your own, the portable way is to use bit shifts e.g.

void writeUInt32ToBufferBigEndian(uint32_t number, uint8_t* buffer)
{
    buffer[0] = (uint8_t) ((number >> 24) & 0xff);
    buffer[1] = (uint8_t) ((number >> 16) & 0xff);
    buffer[2] = (uint8_t) ((number >> 8) & 0xff);
    buffer[3] = (uint8_t) ((number >> 0) & 0xff);
}

