Strdup or _Strdup

strdup or _strdup?

strdup is not a standard C++ function, but it is a POSIX function, and in any case it is a well-known function that has been around since K&R C. So if you absolutely must use it, do not fret about possible name collisions, and just write strdup for maximum portability.

Why do Windows and Linux have different strdup implementations: strdup() and _strdup()?

There are several functions that are part of the POSIX specification, i.e. Linux and most other UNIX variants, that are not part of standard C. These include strdup, write, read, and others.

The reasoning for the leading underscore is as follows, taken from the MSDN docs:

The Universal C Run-Time Library (UCRT) supports most of the C
standard library required for C++ conformance. It implements the C99
(ISO/IEC 9899:1999) library, with certain exceptions: The type-generic
macros defined in <tgmath.h>, and strict type compatibility in
<complex.h>. The UCRT also implements a large subset of the POSIX.1
(ISO/IEC 9945-1:1996, the POSIX System Application Program Interface)
C library. However, it's not fully conformant to any specific POSIX
standard.
The UCRT also implements several Microsoft-specific
functions and macros that aren't part of a standard.

Functions specific to the Microsoft implementation of Visual C++ are
found in the vcruntime library. Many of these functions are for
internal use and can't be called by user code. Some are documented for
use in debugging and implementation compatibility.

The C++ standard reserves names that begin with an underscore in the
global namespace to the implementation. Both the POSIX functions and
Microsoft-specific runtime library functions are in the global
namespace, but aren't part of the standard C runtime library. That's
why the preferred Microsoft implementations of these functions have a
leading underscore.
For portability, the UCRT also supports the
default names, but the Microsoft C++ compiler issues a deprecation
warning when code that uses them is compiled. Only the default names
are deprecated, not the functions themselves. To suppress the warning,
define _CRT_NONSTDC_NO_WARNINGS before including any headers in code
that uses the original POSIX names.

I've handled that with an #ifdef that checks whether the program is being compiled for Windows and, if so, uses a #define to map the POSIX name to the Windows-specific name. There are a few macros you can check, although probably the most reliable is _MSC_VER, which is defined if MSVC is the compiler.

#ifdef _MSC_VER
#define strdup(p) _strdup(p)
#endif

Replace strdup by _strdup

I had unintentionally modified a Microsoft file. That is why I got this weird error.

strdup() - what does it do in C?

Exactly what it sounds like, assuming you're used to the abbreviated way in which C and UNIX name things: it duplicates strings :-)

Keeping in mind it's actually not part of the ISO C standard up to and including C17(a) (it's a POSIX thing), it's effectively doing the same as the following code:

#include <stdlib.h>   // malloc
#include <string.h>   // strlen, strcpy

char *strdup(const char *src) {
    char *dst = malloc(strlen(src) + 1);  // Space for length plus nul
    if (dst == NULL) return NULL;         // No memory
    strcpy(dst, src);                     // Copy the characters
    return dst;                           // Return the new string
}

In other words:

  1. It tries to allocate enough memory to hold the old string (plus a '\0' character to mark the end of the string).

  2. If the allocation failed, it sets errno to ENOMEM and returns NULL immediately. Setting of errno to ENOMEM is something malloc does in POSIX so we don't need to explicitly do it in our strdup. If you're not POSIX compliant, ISO C doesn't actually mandate the existence of ENOMEM so I haven't included that here(b).

  3. Otherwise the allocation worked so we copy the old string to the new string(c) and return the new address (which the caller is responsible for freeing at some point).

Keep in mind that's the conceptual definition. Any library writer worth their salary may have provided heavily optimised code targeting the particular processor being used.

One other thing to keep in mind: it was slated to be in the C2x iteration of the standard, along with strndup, as per draft N2912 of the document, and both functions are part of the published C23 standard.
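Since strndup comes up here, it can be described in the same conceptual style as the strdup code above. The following is only a sketch of the behaviour as POSIX describes it (copy at most n characters and always terminate the result), not any library's actual code. The name my_strndup is hypothetical, chosen to stay clear of the reserved str prefix.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: duplicates at most n characters of src. */
char *my_strndup(const char *src, size_t n) {
    size_t len = 0;
    while (len < n && src[len] != '\0')  /* Stop at n or at the terminator */
        len++;
    char *dst = malloc(len + 1);         /* Space for the copy plus '\0' */
    if (dst == NULL) return NULL;        /* No memory */
    memcpy(dst, src, len);               /* Copy the characters */
    dst[len] = '\0';                     /* Always terminate the copy */
    return dst;                          /* Caller must free() this */
}
```

As with strdup, the caller owns the returned block and has to free it.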


(a) However, functions starting with str and a lower case letter are reserved by the standard for future directions. From C11 7.1.3 Reserved identifiers:

Each header declares or defines all identifiers listed in its associated sub-clause, and optionally declares or defines identifiers listed in its associated future library directions sub-clause.

The future directions for string.h can be found in C11 7.31.13 String handling <string.h>:

Function names that begin with str, mem, or wcs and a lowercase letter may be added to the declarations in the <string.h> header.

So you should probably call it something else if you want to be safe.


(b) The change would basically be replacing if (dst == NULL) return NULL; with:

if (dst == NULL) {
    errno = ENOMEM;   // needs <errno.h>
    return NULL;
}

(c) Note that I use strcpy for that since that clearly shows the intent. In some implementations, it may be faster (since you already know the length) to use memcpy, as they may allow for transferring the data in larger chunks, or in parallel. Or it may not :-) Optimisation mantra #1: "measure, don't guess".

In any case, should you decide to go that route, you would do something like:

char *strdup(const char *src) {
    size_t len = strlen(src) + 1;   // String plus '\0'
    char *dst = malloc(len);        // Allocate space
    if (dst == NULL) return NULL;   // No memory
    memcpy(dst, src, len);          // Copy the block
    return dst;                     // Return the new string
}

How to delete a const char* created with _strdup()

There are several problems here.

  1. You delete[] instead of free.

    strdup comes from the C library. The documentation tells us how to clean it up.

    Microsoft's similar _strdup works the same way.

    You must read the documentation for functions that you use, particularly if you're having trouble with them. That is why it is there.

  2. You invoke the destructor of A manually, when you shouldn't.

    The object has automatic storage duration and will be destroyed automatically. If you also call the destructor yourself, it ends up being called twice. That means the erroneous deallocation call delete[] myChar will also be executed twice, which is clearly wrong.

  3. Your object's copy semantics are broken.

    Okay, so you don't copy it here. But any object that manages memory should follow the rule of zero, the rule of three, or the rule of five.

  4. You're checking for leaks too early.

    myA is still alive when you call _CrtDumpMemoryLeaks(), so of course it's going to see that it hasn't been destroyed/freed yet, and deem that to be a memory leak. You're supposed to call that function after you've attempted to rid yourself of all your resources, not before.

Here's your directly fixed code:

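#include "pch.h"
#include <crtdbg.h>  // for _CrtDumpMemoryLeaks
#include <cstdlib>   // for free
#include <string.h>  // for _strdup
#include <iostream>

class A
{
public:
    A(const char *fn) {
        myChar = _strdup(fn);
    }

    A(const A& other) {
        myChar = _strdup(other.myChar);
    }

    A& operator=(const A& other) {
        if (&other != this) {
            free(myChar);
            myChar = _strdup(other.myChar);
        }

        return *this;
    }

    ~A() {
        free(myChar);  // _strdup allocates with malloc, so release with free
    }

    char *myChar;
};

int main()
{
    {
        A myA("lala");
    }  // myA is destroyed here, before the leak check

    _CrtDumpMemoryLeaks(); // leak detector
}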

And here's what it should have been:

#include <string>
#include <utility> // for std::move
#include <crtdbg.h> // for _CrtDumpMemoryLeaks

class A
{
public:
    A(std::string str) : m_str(std::move(str)) {}

private:
    std::string m_str;
};

int main()
{
    {
        A myA("lala");
    }

    _CrtDumpMemoryLeaks(); // leak detector
}

strdup error on g++ with c++0x

-std=gnu++0x (instead of -std=c++0x) does the trick for me; -D_GNU_SOURCE didn't work (I tried with a cross-compiler, but perhaps it works with other kinds of g++).

It appears that the default (no -std=... passed) is "GNU C++" and not "strict standard C++", so the flag for "don't change anything except for upgrading to C++11" is -std=gnu++0x, not -std=c++0x; the latter means "upgrade to C++11 and be stricter than by default".

error: 'strdup' was not declared in this scope

strdup is not a standard C function. When a compiler is configured to be strict C compliant, it is not allowed to dump its own custom, non-standard functions in standard library headers like <string.h>.

You can resolve this by changing the compiler to compile non-standard C code (for example in gcc, compile with -std=gnu11 instead of -std=c11). Or alternatively, stick to pure standard C.
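A third option, assuming glibc or another POSIX-style C library, is to stay in strict -std=c11 mode and request the POSIX declarations explicitly with a feature-test macro. This is a sketch under that assumption, not something the compiler documentation above prescribes. The helper name copy_greeting is hypothetical, and the macro has to appear before the first #include to have any effect.

```c
#define _POSIX_C_SOURCE 200809L  /* Assumption: POSIX.1-2008 libc, e.g. glibc */

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy the caller must free(). */
char *copy_greeting(void) {
    return strdup("hello");      /* Declared because of the macro above */
}
```

This does not help on a library that has no strdup at all. In that case you are back to writing your own.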


... or just implement strdup yourself, it is easy:

#include <string.h>
#include <stdlib.h>

char* strdup (const char* s)
{
    size_t slen = strlen(s);
    char* result = malloc(slen + 1);
    if(result == NULL)
    {
        return NULL;
    }

    memcpy(result, s, slen+1);
    return result;
}

Is this a portable strdup

Read the C11 standard (n1570) carefully. For things like Arduino, you might not even have any malloc, and you could have some freestanding implementation of C11.

As a concrete example, the Linux kernel is coded in C, and your code could not be part of it.

Also see this answer providing some funny implementation of malloc (you could extend it to calloc...)

Will this code work properly with C standard compliant compilers in the future.

Of course not in general.

Look at the many examples of toy operating systems on OSDEV.

You could have some operating system without malloc or calloc but with a C compiler.

I have a c program I'm writing that I want to be portable and future compliant.

The motto I read somewhere is: there is no such thing as portable programs, just programs which have been ported (to some given systems).

Did you consider some approach inspired by GNU autoconf? Your build automation could use preprocessing tricks and/or generate C code with e.g. GPP, SWIG or other tools.
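As a rough sketch of what an autoconf-inspired approach could look like: a configure script that uses AC_CHECK_FUNCS([strdup]) defines HAVE_STRDUP (normally through a generated config.h) when the function is available, and the code supplies its own copy otherwise. The name port_strdup is hypothetical, and the fallback assumes a hosted implementation where malloc exists.

```c
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_STRDUP
/* The build system found a library strdup: use it. */
#define port_strdup(s) strdup(s)
#else
/* Fallback, assuming malloc is available (hosted implementation). */
static char *port_strdup(const char *s) {
    size_t len = strlen(s) + 1;   /* String plus '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);     /* Copy the block, terminator included */
    return copy;                  /* NULL if the allocation failed */
}
#endif
```

The rest of the program then calls port_strdup everywhere, so the decision is made once, at build time.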

Why is strdup considered to be evil

Two reasons I can think of:

  1. It's not strictly ANSI C, but rather POSIX. Consequently, some compilers (e.g. MSVC) discourage its use (MSVC prefers _strdup), and technically the C standard could define its own strdup with different semantics since str is a reserved prefix. So, there are some potential portability concerns with its use.
  2. It hides its memory allocation. Most other str functions don't allocate memory, so users might be misled (as you say) into believing the returned string doesn't need to be freed.

But, aside from these points, I think that careful use of strdup is justified, as it can reduce code duplication and provides a nice implementation for common idioms (such as strdup("constant string") to get a mutable, returnable copy of a literal string).
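A minimal sketch of that idiom, and of the hidden allocation from point 2: the function below returns a mutable copy of a string literal, and the caller has to remember the matching free. To stay self-contained under strict ISO C it uses a local stand-in named dup_string rather than strdup itself. Both function names are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for strdup with a non-reserved, hypothetical name. */
static char *dup_string(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Returns a heap-allocated, upper-cased default name; caller must free(). */
char *default_name_upper(void) {
    char *name = dup_string("anonymous");  /* Mutable copy of the literal */
    if (name == NULL) return NULL;
    for (char *p = name; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);
    return name;
}
```

Nothing in the signature of default_name_upper says that the result must be freed, which is exactly the concern raised in point 2. A comment or a naming convention has to carry that information.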


