Since I Can't Return a Local Variable, What's the Best Way to Return a String from a C or C++ Function

Returning string from C function

Either allocate the string on the stack on the caller side and pass it to your function:

void getStr(char *wordd, int length) {
...
}

int main(void) {
char wordd[10 + 1];
getStr(wordd, sizeof(wordd) - 1);
...
}

Or make the string static in getStr:

char *getStr(void) {
static char wordd[10 + 1];
...
return wordd;
}

Or allocate the string on the heap:

char *getStr(int length) {
char *wordd = malloc(length + 1);
...
return wordd;
}

Returning a string from a function in C

You are returning a pointer to local data, which is not valid after the function returns. You have to allocate the string properly.

This can be done in either the calling function, by supplying a buffer to the called function, and it copies the string over to the supplied buffer. Like this:

char segmentFileName[SOME_SIZE];
getSegmentFileName(file, lineLength, mid, segmentFileName);

and the getSegmentFileName function:

void getSegmentFileName(FILE *file, int lineLength, int lineNumber, char *segmentFileName)
{
/* ... */

strcpy(segmentFileName, fileNameString);
}

The other solution is to allocate the memory for the string in getSegmentFileName:

return strdup(fileNameString);

but then you have to remember to free the string later.

Returning a C string from a function

Your function signature needs to be:

const char * myFunction()
{
return "my String";
}

Background:

It's so fundamental to C & C++, but little more discussion should be in order.

In C (& C++ for that matter), a string is just an array of bytes terminated with a zero byte - hence the term "string-zero" is used to represent this particular flavour of string. There are other kinds of strings, but in C (& C++), this flavour is inherently understood by the language itself. Other languages (Java, Pascal, etc.) use different methodologies to understand "my string".

If you ever use the Windows API (which is in C++), you'll see quite regularly function parameters like: "LPCSTR lpszName". The 'sz' part represents this notion of 'string-zero': an array of bytes with a null (/zero) terminator.

Clarification:

For the sake of this 'intro', I use the word 'bytes' and 'characters' interchangeably, because it's easier to learn this way. Be aware that there are other methods (wide-characters, and multi-byte character systems (mbcs)) that are used to cope with international characters. UTF-8 is an example of an mbcs. For the sake of intro, I quietly 'skip over' all of this.

Memory:

This means that a string like "my string" actually uses 9+1 (=10!) bytes. This is important to know when you finally get around to allocating strings dynamically.

So, without this 'terminating zero', you don't have a string. You have an array of characters (also called a buffer) hanging around in memory.

Longevity of data:

The use of the function this way:

const char * myFunction()
{
return "my String";
}

int main()
{
const char* szSomeString = myFunction(); // Fraught with problems
printf("%s", szSomeString);
}

... will generally land you with random unhandled-exceptions/segment faults and the like, especially 'down the road'.

In short, although my answer is correct - 9 times out of 10 you'll end up with a program that crashes if you use it that way, especially if you think it's 'good practice' to do it that way. In short: It's generally not.

For example, imagine some time in the future, the string now needs to be manipulated in some way. Generally, a coder will 'take the easy path' and (try to) write code like this:

const char * myFunction(const char* name)
{
char szBuffer[255];
snprintf(szBuffer, sizeof(szBuffer), "Hi %s", name);
return szBuffer;
}

That is, your program will crash because the compiler (may/may not) have released the memory used by szBuffer by the time the printf() in main() is called. (Your compiler should also warn you of such problems beforehand.)

There are two ways to return strings that won't barf so readily.

  1. returning buffers (static or dynamically allocated) that live for a while. In C++ use 'helper classes' (for example, std::string) to handle the longevity of data (which requires changing the function's return value), or
  2. pass a buffer to the function that gets filled in with information.

Note that it is impossible to use strings without using pointers in C. As I have shown, they are synonymous. Even in C++ with template classes, there are always buffers (that is, pointers) being used in the background.

So, to better answer the (now modified question). (There are sure to be a variety of 'other answers' that can be provided.)

Safer Answers:

Example 1, using statically allocated strings:

const char* calculateMonth(int month)
{
static char* months[] = {"Jan", "Feb", "Mar" .... };
static char badFood[] = "Unknown";
if (month < 1 || month > 12)
return badFood; // Choose whatever is appropriate for bad input. Crashing is never appropriate however.
else
return months[month-1];
}

int main()
{
printf("%s", calculateMonth(2)); // Prints "Feb"
}

What the static does here (many programmers do not like this type of 'allocation') is that the strings get put into the data segment of the program. That is, it's permanently allocated.

If you move over to C++ you'll use similar strategies:

class Foo
{
char _someData[12];
public:
const char* someFunction() const
{ // The final 'const' is to let the compiler know that nothing is changed in the class when this function is called.
return _someData;
}
}

... but it's probably easier to use helper classes, such as std::string, if you're writing the code for your own use (and not part of a library to be shared with others).

Example 2, using caller-defined buffers:

This is the more 'foolproof' way of passing strings around. The data returned isn't subject to manipulation by the calling party. That is, example 1 can easily be abused by a calling party and expose you to application faults. This way, it's much safer (albeit uses more lines of code):

void calculateMonth(int month, char* pszMonth, int buffersize)
{
const char* months[] = {"Jan", "Feb", "Mar" .... }; // Allocated dynamically during the function call. (Can be inefficient with a bad compiler)
if (!pszMonth || buffersize<1)
return; // Bad input. Let junk deal with junk data.
if (month<1 || month>12)
{
*pszMonth = '\0'; // Return an 'empty' string
// OR: strncpy(pszMonth, "Bad Month", buffersize-1);
}
else
{
strncpy(pszMonth, months[month-1], buffersize-1);
}
pszMonth[buffersize-1] = '\0'; // Ensure a valid terminating zero! Many people forget this!
}

int main()
{
char month[16]; // 16 bytes allocated here on the stack.
calculateMonth(3, month, sizeof(month));
printf("%s", month); // Prints "Mar"
}

There are lots of reasons why the second method is better, particularly if you're writing a library to be used by others (you don't need to lock into a particular allocation/deallocation scheme, third parties can't break your code, and you don't need to link to a specific memory management library), but like all code, it's up to you on what you like best. For that reason, most people opt for example 1 until they've been burnt so many times that they refuse to write it that way anymore ;)

Disclaimer:

I retired several years back and my C is a bit rusty now. This demo code should all compile properly with C (it is OK for any C++ compiler though).

returning a local variable from function in C

The issue here is that when you create the local variable it is allocated on the stack and is therefore unavailable once the function finishes execution (implementation varies here). The preferable way would be to use malloc() to reserve non-local memory. the danger here is that you have to deallocate (free()) everything you allocated using malloc(), and if you forget, you create a memory leak.

Life-time of a string literal in C

Yes, the lifetime of a local variable is within the scope({,}) in which it is created.

Local variables have automatic or local storage. Automatic because they are automatically destroyed once the scope within which they are created ends.

However, What you have here is a string literal, which is allocated in an implementation-defined read-only memory. String literals are different from local variables and they remain alive throughout the program lifetime. They have static duration [Ref 1] lifetime.

A word of caution!

However, note that any attempt to modify the contents of a string literal is an undefined behavior (UB). User programs are not allowed to modify the contents of a string literal.

Hence, it is always encouraged to use a const while declaring a string literal.

const char*p = "string"; 

instead of,

char*p = "string";    

In fact, in C++ it is deprecated to declare a string literal without the const though not in C. However, declaring a string literal with a const gives you the advantage that compilers would usually give you a warning in case you attempt to modify the string literal in the second case.

Sample program:

#include<string.h> 
int main()
{
char *str1 = "string Literal";
const char *str2 = "string Literal";
char source[]="Sample string";

strcpy(str1,source); // No warning or error just Undefined Behavior
strcpy(str2,source); // Compiler issues a warning

return 0;
}

Output:

cc1: warnings being treated as errors

prog.c: In function ‘main’:

prog.c:9: error: passing argument 1 of ‘strcpy’ discards qualifiers from pointer target type

Notice the compiler warns for the second case, but not for the first.


To answer the question being asked by a couple of users here:

What is the deal with integral literals?

In other words, is the following code valid?

int *foo()
{
return &(2);
}

The answer is, no this code is not valid. It is ill-formed and will give a compiler error.

Something like:

prog.c:3: error: lvalue required as unary ‘&’ operand

String literals are l-values, i.e: You can take the address of a string literal, but cannot change its contents.

However, any other literals (int, float, char, etc.) are r-values (the C standard uses the term the value of an expression for these) and their address cannot be taken at all.


[Ref 1]C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

how to return a string array from a function

A string array in C can be used either with char** or with char*[]. However, you cannot return values stored on the stack, as in your function. If you want to return the string array, you have to reserve it dynamically:

char** myFunction() {
char ** sub_str = malloc(10 * sizeof(char*));
for (int i =0 ; i < 10; ++i)
sub_str[i] = malloc(20 * sizeof(char));
/* Fill the sub_str strings */
return sub_str;
}

Then, main can get the string array like this:

char** str = myFunction();
printf("%s", str[0]); /* Prints the first string. */

EDIT: Since we allocated sub_str, we now return a memory address that can be accessed in the main

Return char[]/string from a function

Notice you're not dynamically allocating the variable, which pretty much means the data inside str, in your function, will be lost by the end of the function.

You should have:

char * createStr() {

char char1= 'm';
char char2= 'y';

char *str = malloc(3);
str[0] = char1;
str[1] = char2;
str[2] = '\0';

return str;

}

Then, when you call the function, the type of the variable that will receive the data must match that of the function return. So, you should have:

char *returned_str = createStr();

It worths mentioning that the returned value must be freed to prevent memory leaks.

char *returned_str = createStr();

//doSomething
...

free(returned_str);


Related Topics



Leave a reply



Submit