C++ Multiline String Literal

How to split a string literal across multiple lines in C / Objective-C?

There are two ways to split strings over multiple lines:

  1. Each string on its own line. Works only with strings:

    • Plain C:

      char *my_string = "Line 1 "
      "Line 2";
    • Objective-C:

      NSString *my_string = @"Line1 "
      "Line2"; // the second @ is optional
  2. Using \ - can be used for any expression:

    • Plain C:

      char *my_string = "Line 1 \
      Line 2";
    • Objective-C:

      NSString *my_string = @"Line1 \
      Line2";

The first approach is better, because the second one embeds the leading whitespace of each continuation line in the string. For a SQL query, however, either works, since the extra whitespace is harmless there.
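The whitespace difference can be sketched in C++ (the same literal rules apply in C). This is a minimal illustration, not part of the original answer; the names concatenated and continued are made up for the example.

```cpp
#include <string>

// Adjacent literals: the indentation sits outside the quotes and is discarded.
const std::string concatenated = "Line 1 "
                                 "Line 2";

// Backslash continuation: the spaces that indent the next line are
// inside the quotes, so they become part of the string.
const std::string continued = "Line 1 \
    Line 2";
```

Here concatenated holds "Line 1 Line 2", while continued is longer because it also carries the indentation of its second source line.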

NOTE: In a #define, end every line except the last with a \ so that the macro definition continues onto the next line. The compiler then concatenates the adjacent literals:

Plain C:

#define kMyString "Line 1"\
"Line 2"

C++ multiline string literal

Well ... Sort of. The easiest is to just use the fact that adjacent string literals are concatenated by the compiler:

const char *text =
"This text is pretty long, but will be "
"concatenated into just a single string. "
"The disadvantage is that you have to quote "
"each part, and newlines must be literal as "
"usual.";

The indentation doesn't matter, since it's not inside the quotes.
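A small sketch of what "newlines must be literal" means in practice: each piece carries its own \n escape, and the line breaks in the source contribute nothing. The variable name usage and its text are illustrative assumptions only.

```cpp
#include <string>

// Each piece ends with an explicit \n escape. The indentation before each
// opening quote is outside the literal and is ignored.
const std::string usage =
    "Usage: tool [options] <file>\n"
    "  -h    show this help\n"
    "  -v    print version\n";
```

The spaces inside the quotes are kept, so the option lines stay aligned in the resulting string.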

You can also do this, as long as you take care to escape the embedded newline. If you forget to, as my first answer did, the code will not compile:

const char *text2 =
"Here, on the other hand, I've gone crazy \
and really let the literal span several lines, \
without bothering with quoting each line's \
content. This works, but you can't indent.";

Again, note the backslashes at the end of each line. They must come immediately before the line ends: they escape the newline in the source, so that everything acts as if the newline wasn't there. You don't get newlines in the string at the locations where you had backslashes. With this form, you obviously can't indent the text, since the indentation would then become part of the string, garbling it with stray spaces.

#define string over multiple lines

Just get rid of the whitespace in between the lines, and quote the whole thing. A \ at the end of a line basically "escapes" the newline, so it won't be part of the string itself. The lines are spliced in an early translation phase, before the literal is even tokenized:

#include <stdio.h>

#define LONG_STRING "C++ Multiline String LiteralC++ Multiline String LiteralC++ Multiline String LiteralC++ Multiline String LiteralC++ Multiline String Literalaa\
bbbb\
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc"

int main ( void ) {
    /* Pass the text as an argument, not as the format string. */
    printf("%s\n", LONG_STRING);
    return 0;
}

That works just fine.

For aesthetic reasons, you can quote each line separately. The only requirement is that the \ after the closing quote be the last character on every line but the final one:

#define LONG_STRING "C++ Multiline String LiteralC++ Multiline String LiteralC++ Multiline String LiteralC++ Multiline String LiteralC++ Multiline String Literalaa"\
"bbbbbb"\
"ccccccccccccccccccccccccccccccccccccccccccccc"

This, too, works just fine.

Note:

The two suggestions are not 100% equivalent. The first version defines a macro to be a single string literal. The second version defines the macro as 3 separate string literals. For the most part, this isn't a big deal, because in translation phase 6, adjacent string literal tokens are concatenated:

5.1.1.2 Translation phases:

[...]

6. Adjacent string literal tokens are concatenated.

7. White-space characters separating tokens are no longer significant. Each
preprocessing token is converted into a token. The resulting tokens are
syntactically and semantically analyzed and translated as a translation unit.

I could not find the footnote Meninx mentions about C99 behaving differently. Document I used can be found here
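One place where the difference between the two macro forms becomes visible is the preprocessor itself, which runs before phase 6. The following C++ sketch (equally valid as C apart from the header name) is an illustration under assumed names: ONE_LITERAL, TWO_LITERALS, SHOW and SHOW_ are hypothetical and not from the original answer.

```cpp
#include <cstring>

// One literal, continued across lines with backslash-newline.
#define ONE_LITERAL "aa\
bb"

// Two literals; here the backslash only continues the macro definition.
#define TWO_LITERALS "aa" \
                     "bb"

// Two-level stringification, so the argument is expanded before # is applied.
#define SHOW_(x) #x
#define SHOW(x) SHOW_(x)

// After phase 6, both macros denote the same string.
const bool same_text = std::strcmp(ONE_LITERAL, TWO_LITERALS) == 0;

// The preprocessor still sees two different token sequences.
const char *one_tokens = SHOW(ONE_LITERAL);   // one string literal token
const char *two_tokens = SHOW(TWO_LITERALS);  // two string literal tokens
```

Stringifying the first macro yields the text of a single quoted literal, while the second yields two quoted literals separated by a space. In ordinary use as a string, the two macros behave the same.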

C++: multiline string literal with concatenation?

std::to_string should help you out with this one:

#include <string>

#define theValue 1000

static std::string str = R"(

foo )" + std::to_string(theValue) + R"( bar

)";

In the general case, you need strings or something that can be implicitly converted into a string in order to use std::string's concatenation operator.
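To make the "general case" concrete, here is a hedged sketch of a run-time helper. The function name make_message is hypothetical. It relies on the fact that at least one operand of + must already be a std::string. An expression such as "foo " + 5 compiles, but it performs pointer arithmetic rather than concatenation.

```cpp
#include <string>

// Hypothetical helper: builds the text at run time from any int.
std::string make_message(int value) {
    // Wrapping the first raw literal in std::string makes the left operand
    // a std::string, so every following + is a string concatenation.
    return std::string(R"(foo )") + std::to_string(value) + R"( bar)";
}
```

Calling make_message(1000) produces the text "foo 1000 bar".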

Why does C not recognize strings across multiple lines?

We can think of a C program as a series of tokens: groups of characters that can't be split up without changing their meaning. Identifiers and keywords are tokens. So are operators like + and -, punctuation marks such as the comma and semicolon, and string literals.

For example, the line

int i; int j;

consists of 6 tokens: int, i, ;, int, j and ;. Most of the time, and particularly in this case, the amount of whitespace (space, tab and newline characters) between tokens is not critical. That's why the compiler will treat

int           i
;int
j;

the same.

Writing

"Hello
Hello"

is like writing

un signed

and hoping that the compiler treats it as

unsigned

Just as whitespace is not allowed inside a keyword, a newline character is not allowed inside a string literal token. A newline can still be included in the string with the escape sequence \n when needed.

To write strings across lines, use string literal concatenation:

"Hello"
"Hello"

Although the above method is recommended, you can also use a backslash:

"Hello \
Hello"

With the backslash method, beware of leading whitespace on the continuation line. The string includes everything on that line up to the closing quote or the next backslash.

Can you use C/C++ preprocessor tokens in multiline string literals

You have two problems. The first is that preprocessor tokens inside quotes (i.e. string literals) aren't substituted. The second is that you must defer the actual stringification until all preprocessing tokens have been replaced. The stringification must be the very last macro that the preprocessor deals with.

Token substitution happens iteratively. The preprocessor performs a substitution, and then goes back to see if there is anything left to substitute in the sequence it just replaced. We can use this to our advantage. If we have a hypothetical TO_STRING macro, we need the very next iteration to substitute all preprocessing tokens, and only the one after that to produce a call to the "real" stringification. Fortunately, it's fairly simple to write:

#define TO_STRING(...) DEFER(TO_STRING_)(__VA_ARGS__)
#define DEFER(x) x
#define TO_STRING_(...) #__VA_ARGS__

#define SOME_CONSTANT 64

#define QUOTE(...) TO_STRING(__VA_ARGS__)
const char * aString = QUOTE({
"key":"fred",
"value": TO_STRING(SOME_CONSTANT)
});


We need the DEFER macro because the preprocessor won't substitute inside something that it recognizes as an argument to another macro. The trick here is that the x in DEFER(TO_STRING_)(x) is not an argument to a macro, so it's substituted in the same pass as DEFER(TO_STRING_). What we get as a result is TO_STRING_(substituted_x), which becomes a macro invocation in the next iteration. The preprocessor then performs the substitution dictated by TO_STRING_ on the previously substituted x.
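For comparison, the more common two-level stringification idiom can be combined with adjacent-literal concatenation to get a similar result. This is a different technique from the DEFER approach above, sketched under assumed names: STRINGIFY_NOW and STRINGIFY are hypothetical, and only SOME_CONSTANT comes from the original. Note that this variant emits the value as a bare number rather than a quoted string.

```cpp
#include <cstring>

#define SOME_CONSTANT 64

#define STRINGIFY_NOW(x) #x            // applies # to the argument as written
#define STRINGIFY(x) STRINGIFY_NOW(x)  // expands the argument first

const char *unexpanded = STRINGIFY_NOW(SOME_CONSTANT);  // the macro's name
const char *expanded = STRINGIFY(SOME_CONSTANT);        // the macro's value

// The stringified constant is itself a string literal, so it can sit
// between adjacent literals and be concatenated with them.
const char *json =
    "{\n"
    "  \"key\": \"fred\",\n"
    "  \"value\": " STRINGIFY(SOME_CONSTANT) "\n"
    "}";
```

The cost of this form is that every quote inside the text must be escaped by hand, which is what the QUOTE macro above avoids.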

Defining a string over multiple lines

A backslash-newline continuation keeps any whitespace that follows it on the next line, so indentation in the code ends up inside the string.

You can take advantage of string literal concatenation for better readability:

sprintf(buffer, "This Is The "
"Longest String In the World "
"that in text goes on and..");

Using \ you'll need to begin the continuation of your string at column 0:

sprintf(buffer, "This Is The \
Longest String In the World \
that in text goes on and..");

C++ multiline string raw literal

Note that raw string literals are delimited by R"( and )" (or you can add to the delimiter by adding characters between the quote and the parens if you need additional 'uniqueness').

#include <iostream>
#include <ostream>
#include <string>

int main ()
{
    // raw-string literal example with the literal made up of separate, concatenated literals
    std::string s = R"(abc)"
                    R"( followed by not a newline: \n)"
                    " which is then followed by a non-raw literal that's concatenated \n with"
                    " an embedded non-raw newline";

    std::cout << s << std::endl;

    return 0;
}
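The custom delimiter mentioned above can be sketched as follows. This is an illustrative example only: the delimiter sql, the variable name query and the query text are assumptions. A custom delimiter is needed whenever the text itself contains the sequence )" because the default form would end the literal there.

```cpp
#include <string>

// The literal ends only at )sql" so the )" inside the text is kept as is.
// Line breaks inside a raw literal become real newlines in the string.
const std::string query = R"sql(SELECT foo, bar
FROM items
WHERE note = 'ends with )" here'
)sql";
```

Because the line breaks are part of the literal, any indentation added to the continuation lines would also become part of the string.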

Multiline string literal in C#

You can use the @ symbol in front of a string to form a verbatim string literal:

string query = @"SELECT foo, bar
FROM table
WHERE id = 42";

You also do not have to escape special characters when you use this method, except for double quotes, which must be doubled (""), as shown in Jon Skeet's answer.

Using prefix with string literal split over multiple lines

From this cppreference page, it would appear that all your code snippets are equivalent and well-defined:

Concatenation



If one of the strings has an encoding prefix and the other doesn't, the one that doesn't will be considered to have the same encoding prefix as the other.

Or, from this Draft C++17 Standard:

5.13.5 String literals      [lex.string]



13      In translation phase 6 (5.2), adjacent string-literals are concatenated. If both string-literals have the same encoding-prefix, the resulting concatenated string literal has that encoding-prefix. If one string-literal has no encoding-prefix, it is treated as a string-literal of the same encoding-prefix as the other operand. …
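The rule quoted above can be illustrated with a short sketch. The variable names are made up, and the example assumes C++11 or later. Only the first piece of each literal carries a prefix, and the unprefixed piece adopts it.

```cpp
#include <string>

// L prefix on the first piece only: the result is a wide string literal.
const std::wstring wide = L"wide "
                          "text";

// u prefix on the first piece only: the result is a UTF-16 string literal.
const std::u16string utf16 = u"abc"
                             "def";
```

Repeating the prefix on every piece is also allowed and arguably makes the intent clearer to readers.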


