Long String Interpolation Lines in C#6

Long string interpolation lines in C#6

You can break the line into multiple lines, but I wouldn't say the syntax looks nice any more.

You need to use the $@ syntax to use an interpolated verbatim string, and you can place newlines inside the {...} parameters, like this:

string s = $@"This is all {
10
} going to be one long {
DateTime.Now
} line.";

The string above will not contain any newlines and will actually have content like this:

This is all 10 going to be one long 01.08.2015 23.49.47 line.

(Note: this is the Norwegian date format.)

Now, having said that, I would not stop using string.Format. In my opinion some of these string interpolation expressions look really good, but more complex ones start to become very hard to read. Considering that, unless you use FormattableString, the code will be compiled into a call to String.Format anyway (at least with the C# 6 compiler), I would say keep going with String.Format where it makes sense.
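
As a rough illustration of the FormattableString case mentioned above, the sketch below assigns an interpolated string to a FormattableString and inspects what was captured. The class and method names (FormattableStringDemo, FormatOf) and the sample values are made up for this example; it assumes .NET Framework 4.6 or later, where System.FormattableString is available.

```csharp
using System;

public static class FormattableStringDemo
{
    // Returns the composite format string the compiler generated for the literal.
    public static string FormatOf(FormattableString fs)
    {
        return fs.Format;
    }

    public static void Main()
    {
        int count = 10;
        var when = new DateTime(2015, 8, 1, 23, 49, 47);

        // Assigning to FormattableString keeps the format string and the
        // arguments separate instead of producing a finished string.
        FormattableString fs = $"This is all {count} going to be one long {when} line.";

        Console.WriteLine(FormatOf(fs));     // the format string with {0} and {1} placeholders
        Console.WriteLine(fs.ArgumentCount); // 2

        // The captured arguments can be formatted later, here culture-independently.
        Console.WriteLine(FormattableString.Invariant(fs));
    }
}
```

This is mainly useful when the culture should be chosen at the point of output rather than at the point where the string is written.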

Multiline C# interpolated string literal

You can combine $ and @ together to get a multiline interpolated string literal:

string s =
$@"Height: {height}
Width: {width}
Background: {background}";

Source: Long string interpolation lines in C#6 (Thanks to @Ric for finding the thread!)

Long string interpolation lines in C#6 don't support Tab, CR and LF

You have the verbatim modifier @ in front of that string, so escape sequences such as \t are not processed and are treated as normal text. If you want those characters in the string, you can enclose them in curly brackets (since you're also using the $ string interpolation modifier) so that they are evaluated as ordinary string expressions and become real tabs; the same works for the carriage return and newline characters:

var name = "myname";
var text = $@"{"\t\t"}{name}
tab and name is in a Long string interpolation {"\r\n"}
";
Console.WriteLine(text);
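
The difference between a \t written directly in the verbatim string and one placed inside {...} can be checked with a small sketch like the following. The names VerbatimEscapeDemo, WithEscapeText and WithInterpolatedTab are invented for this illustration.

```csharp
using System;

public static class VerbatimEscapeDemo
{
    // In a verbatim string, \t is two ordinary characters: a backslash and a 't'.
    public static string WithEscapeText(string name)
    {
        return $@"\t{name}";
    }

    // Inside {...} the content is a normal C# expression, so a real tab is inserted.
    public static string WithInterpolatedTab(string name)
    {
        const string tab = "\t";
        return $@"{tab}{name}";
    }

    public static void Main()
    {
        Console.WriteLine(WithEscapeText("myname")[0] == '\\');      // True
        Console.WriteLine(WithInterpolatedTab("myname")[0] == '\t'); // True
    }
}
```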

Alternatively, since it's a verbatim string, you can just press Tab (or Enter) keys where you want those characters in the string.

This string is the same as the one above:

// The whitespace before {name} is two literal tab characters.
var text = $@"		{name}
tab and name is in a Long string interpolation

";

How to get Multiline string

Take a look at interpolated strings for including your field values.

For the new lines, there are three distinct items to be aware of: the Environment.NewLine property, the \n character literal, and the HTML <br> element. The last is especially important for emails which use an HTML body.

In this case, I suggest a C# multi-line string literal, with an interpolated value for an extra line ending that you can set or not as needed based on the platform.

So you end up with this:

string br = ""; // set this to "<br>" for HTML emails

string EmailBody =
$@"Request no.- ('{objUserModel.ID}') has been raised for special{br}
vehicle by requester-( '{RequesterInfo}').{br}
{br}
ORG_Unit:'{objUserModel.OrgUnit}'{br}
TDC:'{objUserModel.TDC}'{br}
Customer Name:'{objUserModel.CustName}'{br}
Supply Plant:'{objUserModel.CustName}'";

The original code was having trouble because of the concatenation. Each line of code for the string was broken up at the end like this: " +, to continue again with a new string literal on the next line. In this way, the line breaks in the code were lost in the resulting string.

The code here addresses this issue by putting everything into the same string literal. There is only one set of double quotes defining the entire string, which now includes the line breaks built into the code. The one thing to be aware of with this technique is you want to shift your strings all the way to the left, regardless of any other indentation in the code.
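
A minimal sketch of the difference described above, using shortened message text and hypothetical names (LineBreakDemo, Concatenated, SingleLiteral) rather than the original model classes:

```csharp
using System;

public static class LineBreakDemo
{
    // Two literals joined with +: the line break in the source code is not part of the string.
    public static string Concatenated(string id, string requester)
    {
        return $"Request no. '{id}' has been raised " +
               $"by requester '{requester}'.";
    }

    // One verbatim literal: the line break in the source code is part of the string.
    // The second line is shifted fully to the left so no indentation ends up in the text.
    public static string SingleLiteral(string id, string requester)
    {
        return $@"Request no. '{id}' has been raised
by requester '{requester}'.";
    }

    public static void Main()
    {
        Console.WriteLine(Concatenated("42", "Alice").IndexOf('\n') >= 0);  // False
        Console.WriteLine(SingleLiteral("42", "Alice").IndexOf('\n') >= 0); // True
    }
}
```

Whether the embedded break is \n or \r\n depends on the line endings of the source file, which is why the sketch only checks for the presence of \n.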

Is there a way to use C# 6's String Interpolation with multi-line string?

Try swapping the places of $ and @: in C# 6 the prefix must be written as $@"..."; the @$ order is only accepted from C# 8.0 onward.

Is there a way to split interpolated strings over multiple lines in C# whilst executing the same at run time in terms of performance

TL;DR: String.Format is being called for each interpolated string, so concatenating interpolated strings means more calls to String.Format.

Let's look at the IL

To get a better idea of what is actually going on when you have questions like these, it is good to check out the IL (Intermediate Language), which is what your code is compiled into before it runs on the .NET runtime. You can use ildasm to inspect the IL of compiled .NET DLLs and EXEs.

Concatenating Multiple Strings

So here you can see that behind the scenes, String.Format is being called for each of the concatenated strings.

[Image: IL listing for the concatenated strings]

Using One Long String

Here you see that String.Format is only being called once, meaning that, if you're talking about performance, this way would be slightly better.

[Image: IL listing for the single string]

Split long interpolated string

What does the compiler do?

Let's start here:

var a = $"Some value 1: {b1:0.00}\n" +
$"Some value 2: {b2}\n" +
$"Some value 3: {b3:0.00000}\n" +
$"Some value 4: {b4:0.00}\n" +
$"Some value 5: {b6:0.0}\n" +
$"Some value 7: {b7:0.000000000}";

"IL is a black box for me yet"

Why not simply open it up? That's pretty easy using a tool like ILSpy, Reflector, etc.

What will happen in your code is that each line is compiled to a string.Format. The rule is pretty simple: if you have $"...{X}...{Y}..." it will be compiled as string.Format("...{0}...{1}...", X, Y). Also the + operator will introduce a string concatenation.
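
The rule above can be sanity-checked by comparing the results of both forms. This is only a sketch with invented names (InterpolationEquivalence, Interpolated, Formatted); it compares the output, not the generated IL, and newer compilers may emit something other than a string.Format call.

```csharp
using System;

public static class InterpolationEquivalence
{
    // The interpolated form, with a format specifier on the first hole.
    public static string Interpolated(double x, int y)
    {
        return $"Value 1: {x:0.00}, value 2: {y}";
    }

    // The hand-written string.Format call that the rule describes.
    public static string Formatted(double x, int y)
    {
        return string.Format("Value 1: {0:0.00}, value 2: {1}", x, y);
    }

    public static void Main()
    {
        // Both use the current culture, so the results match on any machine.
        Console.WriteLine(Interpolated(1.5, 2) == Formatted(1.5, 2)); // True
    }
}
```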

In more detail, string.Format is a simple static call, which means that the compiler will use the call opcode instead of callvirt.

From all this you might deduce that it's pretty easy for a compiler to optimize this: if we have an expression like constant string + constant string + ... you can simply replace it with constant string. You can argue that the compiler has knowledge about the inner workings of string.Format and string concatenation and handle that. On the other hand, you could argue that it should not. Let me detail the two considerations:

Note that strings are objects in .NET, but they are 'special ones'. You can see this from the fact that there's a special ldstr opcode, but also if you check out what happens if you switch on a string -- the compiler will generate a dictionary. So, from this you could deduce that the compiler 'knows' how a string works internally. Let's figure out if it knows how to do concatenation, ok?

var str = "foo" + "bar";
Console.WriteLine(str);

In IL (Release mode of course) this will give:

L_0000: ldstr "foobar"

tl;dr: So, regardless of whether constant folding for concatenated interpolated strings is already implemented (it is not), I'd be pretty confident that the compiler will handle this case eventually.
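
The folding of plain string literals seen in the IL above can also be observed from C# itself: a const field only accepts a compile-time constant, so the following sketch compiles only because the compiler evaluates the concatenation. The class name ConstantFoldingDemo is made up for this example.

```csharp
using System;

public static class ConstantFoldingDemo
{
    // A const initializer must be a compile-time constant, so "foo" + "bar"
    // has to be evaluated by the compiler rather than at run time.
    public const string Folded = "foo" + "bar";

    public static void Main()
    {
        Console.WriteLine(Folded); // foobar
    }
}
```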

What does the JIT do?

Next question would be: how smart is the JIT compiler with strings?

So, let's consider for a moment that we will teach the compiler about all the inner workings of string. First we should note that C# is compiled to IL, which is JIT compiled to assembler. In the case of the switch it's pretty hard for the JIT compiler to create the dictionary, so we have to do it in the compiler. On the other hand, if we're handling more complex concatenation it makes sense to use the things we already have available for, for example, integer arithmetic to do string operations as well. This implies putting string operations in the JIT compiler. Let's for a moment consider that with an example:

var str = "";
for (int i=0; i<10; ++i) {
str += "foo";
}
Console.WriteLine(str);

The compiler will simply compile the concatenation to IL, which means that the IL will hold a pretty straight-forward implementation of this. In this case loop unrolling arguably has a lot of benefits for the (runtime) performance of the program: it can simply unroll the loop, appending the string 10 times, which results in a simple constant.

However, giving this knowledge to the JIT compiler makes it more complex, which means that the runtime will spend more time on JIT compiling (figuring out the optimization) and less time executing (running the emitted assembler). The question that remains is: what will happen?

Start the program, put a breakpoint on the Console.WriteLine call, press Ctrl+Alt+D, and look at the disassembly.

00007FFCC8044413  jmp         00007FFCC804443F  
{
str += "foo";
00007FFCC8044415 mov rdx,2BEE2093610h
00007FFCC804441F mov rdx,qword ptr [rdx]
00007FFCC8044422 mov rcx,qword ptr [rbp-18h]
00007FFCC8044426 call 00007FFD26434CC0

[...]
00007FFCC804443A inc eax
00007FFCC804443C mov dword ptr [rbp-0Ch],eax
00007FFCC804443F mov ecx,dword ptr [rbp-0Ch]
00007FFCC8044442 cmp ecx,0Ah
00007FFCC8044445 jl 00007FFCC8044415

tl;dr: Nope, that's not optimized.

But I want the JIT to optimize that as well!

Yea, well, I'm not too sure if I share that opinion. There's a balance between runtime performance and time spent in JIT compilation. Notice that if you're doing something like this in a tight loop, I would argue that you're asking for trouble. On the other hand, if it's a common and trivial case (like the constants that are concatenated) it's pretty easy to optimize and it doesn't affect the runtime.

In other words: arguably, you don't want this to be optimized by the JIT, assuming that would take too much time. I'm confident we can trust Microsoft in making this decision wisely.

Also, you should realize that strings in .NET are heavily optimized things. We all know that they're used a lot, and so does Microsoft. If you're not writing 'really stupid code', it's a very reasonable assumption that it will perform just fine (until proven otherwise).

Alternatives?

What are other options to split long interpolated string?

Use resources. Resources are a useful tool in dealing with multiple languages. And if this is just a small, non-professional project - I simply wouldn't bother at all.

Alternatively you can use the fact that constant strings are concatenated:

var fmt = "Some value 1: {0:0.00}\n" +
"Some value 2: {1}\n" +
"Some value 3: {2:0.00000}\n" +
"Some value 4: {3:0.00}\n" +
"Some value 5: {5:0.0}\n" +
"Some value 7: {6:0.000000000}";

// Placeholders are zero-based: {0} is b1, {5} is b6 and {6} is b7.
var a = string.Format(fmt, b1, b2, b3, b4, b5, b6, b7);

