Escape Command Line Arguments in C#

Escape command line arguments in C#


It's more complicated than that though!

I was having a related problem (writing a front-end .exe that calls the back-end with all parameters passed through, plus some extra ones), so I looked at how people do that and ran into your question. Initially all seemed good doing it as you suggest: arg.Replace(@"\", @"\\").Replace(quote, @"\" + quote).

However, when I call it with the arguments c:\temp a\\b, they are received as c:\temp and a\\b, so the back-end gets called with "c:\\temp" "a\\\\b". That is incorrect, because the back-end then sees the two arguments c:\\temp and a\\\\b, which is not what we wanted! We have been overzealous with escapes (Windows is not Unix!).

So I read http://msdn.microsoft.com/en-us/library/system.environment.getcommandlineargs.aspx in detail, and it describes how those cases are handled: backslashes are treated as escape characters only in front of a double quote.

There is a twist in how multiple \ are handled, and the explanation can leave one dizzy for a while. I'll try to rephrase the unescape rule here: say we have a substring of N \ followed by ". When unescaping, we replace that substring with int(N/2) \ and, if and only if N was odd, add a literal " at the end (for even N the " acts as a string delimiter instead).

The encoding that matches this decoding goes like this: for an argument, find each substring of zero or more \ followed by " and replace it with twice as many \, followed by \". We can do that like so:

s = Regex.Replace(arg, @"(\\*)" + "\"", @"$1$1\" + "\"");

That's all...

PS. ... not. Wait, wait - there is more! :)

We did the encoding correctly, but there is a twist because you are enclosing all parameters in double quotes (in case some of them contain spaces). There is a boundary issue: if a parameter ends in \, adding " after it changes the meaning of the closing quote. For example, c:\one\ two is parsed to c:\one\ and two, then re-assembled to "c:\one\" "two", which is (mis)understood as the single argument c:\one" two (I tried that, I am not making it up). So in addition we need to check whether the argument ends in \ and, if so, double the number of backslashes at the end, like so:

s = "\"" + Regex.Replace(s, @"(\\+)$", @"$1$1") + "\"";
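Putting the two steps together, a minimal sketch could look like the following. The names ArgEscaper, EscapeArgument and BuildCommandLine are made up for illustration; the two Regex.Replace calls are the ones shown above, and every argument is always wrapped in quotes, whether or not it contains spaces.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: the class and method names are illustrative only.
static class ArgEscaper
{
    public static string EscapeArgument(string arg)
    {
        // Step 1: for each run of zero or more backslashes followed by a quote,
        // double the backslashes and escape the quote.
        string s = Regex.Replace(arg, @"(\\*)" + "\"", @"$1$1\" + "\"");

        // Step 2: double any trailing backslashes so they do not escape
        // the closing quote, then wrap the whole argument in quotes.
        return "\"" + Regex.Replace(s, @"(\\+)$", @"$1$1") + "\"";
    }

    // Joins the escaped arguments with single spaces.
    public static string BuildCommandLine(params string[] args)
    {
        return string.Join(" ", args.Select(EscapeArgument));
    }
}
```

With this sketch, EscapeArgument(@"c:\one\") yields "c:\one\\" (quotes included), and a\\b stays a\\b inside its quotes because its backslashes do not precede a quote.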

Why does C# appear to partially un-escape command line arguments?

Admittedly this relates to Microsoft C Command Line Arguments, but I have tested that these rules are also followed for C#. Command line arguments are separated into the string args[] array prior to being passed in to the application.

Parsing C Command-Line Arguments

  • Arguments are delimited by white space, which is either a space or a tab.

  • A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument. Note that the caret (^) is not recognized as an escape character or delimiter.

  • A double quotation mark preceded by a backslash, \", is interpreted as a literal double quotation mark (").

  • Backslashes are interpreted literally, unless they immediately precede a double quotation mark.

  • If an even number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\), and the double quotation mark (") is interpreted as a string delimiter.

  • If an odd number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\) and the double quotation mark is interpreted as an escape sequence by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.

These rules seem to agree with the results you are seeing.

I ran the following console application to test these rules:

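using System;

class Program
{
    static void Main(string[] args)
    {
        // Print each parsed argument on its own line.
        foreach (string s in args)
        {
            Console.WriteLine(s);
        }
        Console.ReadLine(); // keep the console window open
    }
}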

With the following command line arguments:

arg1 "arg2 arg3" arg4\" "arg5\"" arg6\\\"

Output:

arg1
arg2 arg3
arg4"
arg5"
arg6\"

The reason your input argument appears unescaped is that the first double-quotation is interpreted to be the starting string delimiter, and the second double-quotation is escaped by the preceding backslash and interpreted as a literal double-quotation - not an ending delimiter.
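To make the rules concrete, here is a rough sketch of a splitter that applies only the rules quoted above. ArgSplitter and Split are hypothetical names, and this is not the actual Windows or .NET parser: it ignores any special handling of the program name and any rules not listed in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: implements only the rules listed above.
static class ArgSplitter
{
    public static List<string> Split(string commandLine)
    {
        var args = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false; // currently inside a "..." region
        bool hasArg = false;   // the current argument has started
        int i = 0;

        while (i < commandLine.Length)
        {
            char c = commandLine[i];
            if (c == '\\')
            {
                // Count the run of backslashes.
                int start = i;
                while (i < commandLine.Length && commandLine[i] == '\\') i++;
                int count = i - start;

                if (i < commandLine.Length && commandLine[i] == '"')
                {
                    // One backslash per pair; odd count means a literal quote,
                    // even count means the quote is a string delimiter.
                    current.Append('\\', count / 2);
                    if (count % 2 == 1) current.Append('"');
                    else inQuotes = !inQuotes;
                    i++; // consume the quote
                }
                else
                {
                    // Not followed by a quote: backslashes are literal.
                    current.Append('\\', count);
                }
                hasArg = true;
            }
            else if (c == '"')
            {
                inQuotes = !inQuotes;
                hasArg = true;
                i++;
            }
            else if ((c == ' ' || c == '\t') && !inQuotes)
            {
                // Unquoted white space ends the current argument.
                if (hasArg)
                {
                    args.Add(current.ToString());
                    current.Clear();
                    hasArg = false;
                }
                i++;
            }
            else
            {
                current.Append(c);
                hasArg = true;
                i++;
            }
        }

        if (hasArg) args.Add(current.ToString());
        return args;
    }
}
```

Fed the test command line above, this sketch produces the same five arguments as the console application.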

Escaping command line arguments in C# for Urls and Local and Network Paths

The requirement to escape \ inside a string if it is not a verbatim string (one that starts with @) is a C# feature. When you start your application from a console, you are outside of C#, and the console does not consider \ to be a special character, so C:\test> myapplication.exe "C:\temp\january" will work.

Edit: My original post had "C:\temp\january\" above; however, the Windows command line seems to also handle \ as an escape character - but only when in front of a ", so that command would pass C:\temp\january" to the application. Thanks to @zimdanen for pointing this out.

Please note that whatever you put between quotes in C# is a representation of a string; the actual string may be different - for instance, \\ represents a single \. If you use other means to get strings into the program, such as the command line arguments or by reading from a file, the strings do not need to follow C#'s rules for string literals. The command line has different rules for representation, in which a \ represents itself.
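As a small illustration of the difference between a string literal and the string it represents, the two methods below return the same string. LiteralDemo and its method names are invented for this example.

```csharp
using System;

// Hypothetical demo class: names are illustrative only.
static class LiteralDemo
{
    // Regular literal: each \\ in the source represents one backslash.
    public static string Regular()
    {
        return "C:\\temp\\january";
    }

    // Verbatim literal: backslashes are written as they are.
    public static string Verbatim()
    {
        return @"C:\temp\january";
    }
}
```

Both return the 15-character string C:\temp\january, which is also what args[0] would contain if that path were passed on the command line.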

Backslash and quote in command-line arguments

According to this article by Jon Galloway, there can be weird behaviour experienced when using backslashes in command line arguments.

Most notably it mentions that "Most applications (including .NET applications) use CommandLineToArgvW to decode their command lines. It uses crazy escaping rules which explain the behaviour you're seeing."

It explains that the first set of backslashes do not require escaping, but backslashes coming after alpha (maybe numeric too?) characters require escaping and that quotes always need to be escaped.

Based off of these rules, I believe to get the arguments you want you would have to pass them as:

a "b" "\\x\\\\" "\x\\"

"Whacky" indeed.


The full story of the crazy escaping rules was told in 2011 by an MS blog entry: "Everyone quotes command line arguments the wrong way"

Raymond also had something to say on the matter (already back in 2010): "What's up with the strange treatment of quotation marks and backslashes by CommandLineToArgvW"

The situation persists: the escaping rules described in "Everyone quotes command line arguments the wrong way" are still correct as of 2020 and Windows 10.

C# escape & char in arguments path when running cmd.exe /C

It works fine if the path is double-quoted:

var arguments = "/c \"\"C:\\Here & Here\\MyExe.exe\"\"";
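For context, here is a sketch of how such an argument string might be handed to cmd.exe. CmdLauncher and ForCmd are hypothetical names, the sketch only builds the ProcessStartInfo without starting anything, and it assumes the path itself contains no double quotes.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: names are illustrative only.
static class CmdLauncher
{
    // Builds (but does not start) a ProcessStartInfo that would run
    // exePath through cmd.exe /c, with the path wrapped in doubled quotes.
    public static ProcessStartInfo ForCmd(string exePath)
    {
        return new ProcessStartInfo
        {
            FileName = "cmd.exe",
            Arguments = "/c \"\"" + exePath + "\"\"",
            UseShellExecute = false
        };
    }
}
```

Calling ForCmd(@"C:\Here & Here\MyExe.exe") produces the same Arguments string as the line above; passing the result to Process.Start would then launch it.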

Safely escaping arguments on the command line in C#

CreateProcess function accepts two distinct parameters, lpApplicationName and lpCommandLine.

If lpApplicationName is NULL, lpCommandLine is parsed for tokens to determine the executable; otherwise it is not parsed and is passed to the process unchanged.

As mentioned by Raymond Chen.

So I would say, provided your startInfo.FileName comes from a trusted source, you are safe to pass arguments as is. Now, the application being run may fail to properly analyse them and do something bogus in case they are malformed, but that's a different story.
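As an illustration of keeping the executable separate from its arguments, the sketch below sets FileName on its own and adds each argument individually. It assumes a runtime where ProcessStartInfo.ArgumentList is available (.NET Core 2.1 or later; it is not part of .NET Framework), in which case the runtime builds the quoted command line itself. SafeLauncher and Build are hypothetical names, and nothing is started here.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: names are illustrative only.
static class SafeLauncher
{
    // Builds (but does not start) a ProcessStartInfo with the executable
    // and each argument kept as separate, unescaped strings.
    public static ProcessStartInfo Build(string fileName, params string[] args)
    {
        var psi = new ProcessStartInfo
        {
            FileName = fileName,      // should come from a trusted source
            UseShellExecute = false
        };
        foreach (string a in args)
        {
            psi.ArgumentList.Add(a);  // no manual quoting here
        }
        return psi;
    }
}
```

On .NET Framework, where ArgumentList does not exist, the Arguments string has to be built by hand following the escaping rules discussed elsewhere on this page.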

Passing command-line arguments in C#

I just ran a check and verified the problem. It surprised me, but the culprit is the last \ in the first argument.

"C:\Program Files\Application name\" <== remove the last '\'

This needs more explanation, does anybody have an idea? I'm inclined to call it a bug.


Part 2, I ran a few more tests and

"X:\\aa aa\\" "X:\\aa aa\" next

becomes

X:\\aa aa\
X:\\aa aa" next

A little Googling turns up some insight from a blog post by Jon Galloway. The basic rules are:

  • the backslash is the escape character
  • always escape quotes
  • only escape backslashes when they precede a quote.

