Will Memcpy or Memmove Cause Problems Copying Classes

Will memcpy or memmove cause problems copying classes?

In the general case, yes, there will be problems. Both memcpy and memmove are bitwise operations with no further semantics. That might not be sufficient to move the object*, and it is clearly not enough to copy.

In the case of the copy it will break as multiple objects will be referring to the same dynamically allocated memory, and more than one destructor will try to release it. Note that solutions like shared_ptr will not help here, as sharing ownership is part of the further semantics that memcpy/memmove don't offer.

For moving, and depending on the type you might get away with it in some cases. But it won't work if the objects hold pointers/references to the elements being moved (including self-references) as the pointers will be bitwise copied (again, no further semantics of copying/moving) and will refer to the old locations.

The general answer is still the same: don't.


* Don't take move here in the exact C++11 sense. I have seen an implementation of the standard library containers that used special tags to enable moving objects while growing buffers through the use of memcpy, but it required explicit annotations in the stored types that marked the objects as safely movable through memcpy, after the objects were placed in the new buffer the old buffer was discarded without calling any destructors (C++11 move requires leaving the object in a destructible state, which cannot be achieved through this hack)

memcpy() vs memmove()

I'm not entirely surprised that your example exhibits no strange behaviour. Try copying str1 to str1+2 instead and see what happens then. (May not actually make a difference, depends on compiler/libraries.)

In general, memcpy is implemented in a simple (but fast) manner. Simplistically, it just loops over the data (in order), copying from one location to the other. This can result in the source being overwritten while it's being read.

Memmove does more work to ensure it handles the overlap correctly.

EDIT:

(Unfortunately, I can't find decent examples, but these will do). Contrast the memcpy and memmove implementations shown here. memcpy just loops, while memmove performs a test to determine which direction to loop in to avoid corrupting the data. These implementations are rather simple. Most high-performance implementations are more complicated (involving copying word-size blocks at a time rather than bytes).

Can I use memcpy in C++ to copy classes that have no pointers or virtual functions

According to the Standard, if no copy constructor is provided by the programmer for a class, the compiler will synthesize a constructor which exhibits default memberwise initialization. (12.8.8) However, in 12.8.1, the Standard also says,

A class object can be copied in two
ways, by initialization (12.1, 8.5),
including for function argument
passing (5.2.2) and for function value
return (6.6.3), and by assignment
(5.17). Conceptually, these two
operations are implemented by a copy
constructor (12.1) and copy assignment
operator (13.5.3).

The operative word here is "conceptually," which, according to Lippman gives compiler designers an 'out' to actually doing memberwise initialization in "trivial" (12.8.6) implicitly defined copy constructors.

In practice, then, compilers have to synthesize copy constructors for these classes that exhibit behavior as if they were doing memberwise initialization. But if the class exhibits "Bitwise Copy Semantics" (Lippman, p. 43) then the compiler does not have to synthesize a copy constructor (which would result in a function call, possibly inlined) and do bitwise copy instead. This claim is apparently backed up in the ARM, but I haven't looked this up yet.

Using a compiler to validate that something is Standard-compliant is always a bad idea, but compiling your code and viewing the resulting assembly seems to verify that the compiler is not doing memberwise initialization in a synthesized copy constructor, but doing a memcpy instead:

#include <cstdlib>

class MyClass
{
public:
MyClass(){};
int a,b,c;
double x,y,z;
};

int main()
{
MyClass c;
MyClass d = c;

return 0;
}

The assembly generated for MyClass d = c; is:

000000013F441048  lea         rdi,[d] 
000000013F44104D lea rsi,[c]
000000013F441052 mov ecx,28h
000000013F441057 rep movs byte ptr [rdi],byte ptr [rsi]

...where 28h is the sizeof(MyClass).

This was compiled under MSVC9 in Debug mode.

EDIT:

The long and the short of this post is that:

1) So long as doing a bitwise copy will exhibit the same side effects as memberwise copy would, the Standard allows trivial implicit copy constructors to do a memcpy instead of memberwise copies.

2) Some compilers actually do memcpys instead of synthesizing a trivial copy constructor which does memberwise copies.

memmove vs copying backwards

Assuming a good implementation, the only "extra cost" of memmove is the initial check (an add and a compare-and-branch) to decide whether to copy front-to-back or back-to-front. This cost is so completely negligible (the add and compare will be hidden by ILP and the branch is perfectly predictable under normal circumstances) that on some platforms, memcpy is just an alias of memmove.

In anticipation of your next question ("if memcpy isn't significantly faster than memmove, why does it exist?"), there are a few good reasons to keep memcpy around. The best one, to my mind, is that some CPUs essentially implement memcpy as a single instruction (rep/movs on x86, for example). These HW implementations often have a preferred (fast) direction of operation (or they may only support copying in one direction). A compiler may freely replace memcpy with the fastest instruction sequence without worrying about these details; it cannot do the same for memmove.

Is memmove copying 0 bytes but referencing out of bounds safe

The memmove function will copy n bytes. If n is zero, it will do nothing.

The only possible issue is with this, where index is already at the maximum value for array elements:

&arr[index + 1]

However, you are permitted to refer to array elements (in terms of having a pointer point to them) within the array or the hypothetical element just beyond the end of the array.

You may not dereference the latter but you're not doing that here. In other words, while arr[index + 1] on its own would attempt a dereference and therefore be invalid, evaluating the address of it is fine.

This is covered, albeit tangentially, in C++20 [expr.add]:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined.

Note the if 0 ≤ i + j ≤ n clause, particularly the final . For an array int x[10], the expression &(x[10]) is valid.

It's also covered in [basic.compound] (my emphasis):

A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively.

When should I use memcpy and when should I use memmove?

You should use memmove when there's a chance that the source and destination buffers overlap - it is specified to work in that case, whereas memcpy isn't.

In theory memcpy can be faster, if only because it doesn't check for overlapping memory buffers.



Related Topics



Leave a reply



Submit