How to (Un)Escape Strings in C/C++

How to (un)escape strings in C/C++?

This is a function to process a single character:

/*
** Does not generate hex character constants.
** Always generates triple-digit octal constants.
** Always generates escapes in preference to octal.
** Escape question mark to ensure no trigraphs are generated by repetitive use.
** Handling of 0x80..0xFF is locale-dependent (might be octal, might be literal).
*/

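#include <ctype.h>   /* isprint */
#include <stdio.h>   /* sprintf */
#include <string.h>  /* strcpy, strlen */

void chr_cstrlit(unsigned char u, char *buffer, size_t buflen)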
{
    if (buflen < 2)
        *buffer = '\0';
    else if (isprint(u) && u != '\'' && u != '\"' && u != '\\' && u != '\?')
        sprintf(buffer, "%c", u);
    else if (buflen < 3)
        *buffer = '\0';
    else
    {
        switch (u)
        {
        case '\a':  strcpy(buffer, "\\a"); break;
        case '\b':  strcpy(buffer, "\\b"); break;
        case '\f':  strcpy(buffer, "\\f"); break;
        case '\n':  strcpy(buffer, "\\n"); break;
        case '\r':  strcpy(buffer, "\\r"); break;
        case '\t':  strcpy(buffer, "\\t"); break;
        case '\v':  strcpy(buffer, "\\v"); break;
        case '\\':  strcpy(buffer, "\\\\"); break;
        case '\'':  strcpy(buffer, "\\'"); break;
        case '\"':  strcpy(buffer, "\\\""); break;
        case '\?':  strcpy(buffer, "\\\?"); break;
        default:
            if (buflen < 5)
                *buffer = '\0';
            else
                sprintf(buffer, "\\%03o", u);
            break;
        }
    }
}

And this is the code to handle a null-terminated string (using the function above):

void str_cstrlit(const char *str, char *buffer, size_t buflen)
{
    unsigned char u;
    size_t len;

    while ((u = (unsigned char)*str++) != '\0')
    {
        chr_cstrlit(u, buffer, buflen);
        if ((len = strlen(buffer)) == 0)
            return;
        buffer += len;
        buflen -= len;
    }
    *buffer = '\0';
}
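
If C++ is available, the same escaping rules can be sketched without fixed-size buffers by appending to a std::string. This is only an illustration of the approach above, not a drop-in replacement: the name escape_cstrlit is hypothetical, and the handling of 0x80..0xFF still depends on what isprint reports in the current locale.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: returns s rewritten as the body of a C string literal.
std::string escape_cstrlit(const std::string& s)
{
    std::string out;
    for (unsigned char u : s)
    {
        switch (u)
        {
        case '\a': out += "\\a"; break;
        case '\b': out += "\\b"; break;
        case '\f': out += "\\f"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        case '\v': out += "\\v"; break;
        case '\\': out += "\\\\"; break;
        case '\'': out += "\\'"; break;
        case '\"': out += "\\\""; break;
        case '\?': out += "\\?"; break;
        default:
            if (std::isprint(u))
                out += static_cast<char>(u);   // printable: copy as is
            else
            {
                char oct[5];  // backslash + 3 octal digits + terminator
                std::snprintf(oct, sizeof oct, "\\%03o", static_cast<unsigned>(u));
                out += oct;
            }
            break;
        }
    }
    return out;
}
```

Because the result grows as needed, the truncation checks on buflen disappear; the trade-off is a heap allocation per call.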

How to un-escape a backslash-escaped string?

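In Python 2, the string_escape codec does this:

>>> print '"Hello,\\nworld!"'.decode('string_escape')
"Hello,
world!"

Python 3 dropped string_escape; decoding a bytes object with 'unicode_escape' plays a similar role there.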

(C#) unescape string

You aren't using a verbatim literal when adding the extra backslashes (\). The compiler therefore processes the \\ escape sequence when the string is compiled and leaves you with a single backslash.

Alter your line from:

string fFileName = @txtSelectedFolder.Text + "\\" + file.Name;

to:

string fFileName = txtSelectedFolder.Text + @"\" + file.Name;

You don't need the @ literal symbol in front of your variable.
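
For readers working in C++, the closest analogue to a C# verbatim string is a raw string literal (C++11). The sketch below shows that the escaped and raw spellings produce the same single backslash; the variable names and the join_with_backslash helper are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Both spell a one-character string holding a single backslash.
const std::string escaped_sep = "\\";   // escape sequence processed by the compiler
const std::string raw_sep = R"(\)";     // raw literal: backslash taken as written

// Hypothetical helper: joins a folder and a file name with a backslash.
std::string join_with_backslash(const std::string& folder, const std::string& name)
{
    return folder + raw_sep + name;
}
```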

Alternatively, you can use

string fFileName = Path.Combine(txtSelectedFolder.Text, file.Name);

to properly append the file's name to the selected folder's path.

Convert string with explicit escape sequence into relative character

I think you must write such a function yourself, since escape sequences are a compile-time feature: when you write "\n", the compiler replaces the \n sequence with the newline character. The resulting string has length 1 (excluding the terminating null character).

In your case, the string "\\n" has length 2 (again excluding the terminating null character) and contains \ and n.
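
The difference in length is easy to check directly. In this small sketch the two variable names are made up for illustration:

```cpp
#include <string>

// "\n": the compiler replaces the escape sequence with one newline character.
const std::string newline = "\n";
// "\\n": the compiler produces a backslash, and the letter n follows it.
const std::string backslash_n = "\\n";
```

Here newline.size() is 1 and backslash_n.size() is 2, which is why the second string needs a run-time pass to become the first.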

You need to scan your string and, on encountering \, check the following character. If it is one of the legal escapes, replace both characters with the corresponding single character; otherwise skip them or leave them both as is.

Something like this (http://ideone.com/BvcDE):

#include <string>
using std::string;

string unescape(const string& s)
{
  string res;
  string::const_iterator it = s.begin();
  while (it != s.end())
  {
    char c = *it++;
    if (c == '\\' && it != s.end())
    {
      switch (*it++) {
      case '\\': c = '\\'; break;
      case 'n': c = '\n'; break;
      case 't': c = '\t'; break;
      // all other escapes
      default: 
        // invalid escape sequence - skip it. alternatively you can copy it as is, throw an exception...
        continue;
      }
    }
    res += c;
  }

  return res;
}
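
The "all other escapes" case can be filled in along the following lines. This is a sketch under stated assumptions, not the original answer's code: the name unescape_full is hypothetical, unknown escapes are copied through unchanged rather than skipped, and octal or hex values are truncated to their low 8 bits.

```cpp
#include <cctype>
#include <string>

// Hypothetical extension of unescape(): also decodes the remaining simple
// escapes, \ooo (1 to 3 octal digits) and \xh... (any run of hex digits).
std::string unescape_full(const std::string& s)
{
    std::string res;
    std::size_t i = 0;
    while (i < s.size())
    {
        char c = s[i++];
        if (c != '\\' || i == s.size())
        {
            res += c;   // ordinary character, or a lone trailing backslash
            continue;
        }
        char e = s[i++];
        switch (e)
        {
        case 'a': res += '\a'; break;
        case 'b': res += '\b'; break;
        case 'f': res += '\f'; break;
        case 'n': res += '\n'; break;
        case 'r': res += '\r'; break;
        case 't': res += '\t'; break;
        case 'v': res += '\v'; break;
        case '\\': case '\'': case '"': case '?':
            res += e;   // the escaped character stands for itself
            break;
        case 'x':
        {
            unsigned value = 0;
            std::size_t start = i;
            while (i < s.size() && std::isxdigit(static_cast<unsigned char>(s[i])))
            {
                char h = s[i++];
                unsigned digit = std::isdigit(static_cast<unsigned char>(h))
                    ? h - '0'
                    : std::tolower(static_cast<unsigned char>(h)) - 'a' + 10;
                value = value * 16 + digit;
            }
            if (i == start)
                res += "\\x";   // assumption: no digits, keep the text as is
            else
                res += static_cast<char>(value & 0xFF);
            break;
        }
        default:
            if (e >= '0' && e <= '7')
            {
                unsigned value = e - '0';
                for (int n = 1; n < 3 && i < s.size() && s[i] >= '0' && s[i] <= '7'; ++n)
                    value = value * 8 + (s[i++] - '0');
                res += static_cast<char>(value & 0xFF);
            }
            else
            {
                res += '\\';    // assumption: unknown escape, copy it through
                res += e;
            }
            break;
        }
    }
    return res;
}
```

Copying unknown escapes through keeps the function lossless on input it does not understand; throwing an exception instead would be the stricter choice.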

Unescape escaped string?

Here's an elegant solution with a switch statement, the Regex.Replace Method and a custom MatchEvaluator:

var input = @"This is indented:\r\n\tHello World";

var output = Regex.Replace(input, @"\\[rnt]", m =>
{
    switch (m.Value)
    {
    case @"\r": return "\r";
    case @"\n": return "\n";
    case @"\t": return "\t";
    default: return m.Value;
    }
});

Console.WriteLine(output);

Output:


This is indented:
        Hello World

Un-Escape String received via Post Data

Well, there is no formal definition of the right terminology, but this kind of process is generally described as "unescaping" or "parsing" rather than escaping. You would like to parse an application/x-www-form-urlencoded string.

And the answer is rather boring: you just do it. That's all. application/x-www-form-urlencoded only does two things: it replaces spaces with "+" signs, and it replaces most other kinds of punctuation (including the real "+" sign itself) with %xx, where xx is the octet in hexadecimal.

So, you just roll up your sleeves and do it. Scan the string, replace each + character with a space, and replace each occurrence of %xx with the single character whose value is that hexadecimal octet. There's nothing particularly mysterious about the process. It is exactly what it appears to be.
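
The scan described above might look like this in C++. It is a minimal sketch: the names hex_value and form_urldecode are hypothetical, and copying a malformed %xx sequence through unchanged is an assumption (rejecting the input would be equally defensible).

```cpp
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes one application/x-www-form-urlencoded name or value.
// Assumption: a malformed %xx sequence is copied through unchanged.
std::string form_urldecode(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (in[i] == '+')
        {
            out += ' ';   // '+' encodes a space
        }
        else if (in[i] == '%' && i + 2 < in.size()
                 && hex_value(in[i + 1]) >= 0 && hex_value(in[i + 2]) >= 0)
        {
            // %xx: two hex digits give the value of a single octet
            out += static_cast<char>(hex_value(in[i + 1]) * 16 + hex_value(in[i + 2]));
            i += 2;
        }
        else
        {
            out += in[i];
        }
    }
    return out;
}
```

When parsing a whole request body, split on & and = first and decode each name and value afterwards, so that an encoded %26 or %3D is not mistaken for a separator.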
