How to (un)escape strings in C/C++?
This is a function to process a single character:
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
** Does not generate hex character constants.
** Always generates triple-digit octal constants.
** Always generates escapes in preference to octal.
** Escape question mark to ensure no trigraphs are generated by repetitive use.
** Handling of 0x80..0xFF is locale-dependent (might be octal, might be literal).
*/
void chr_cstrlit(unsigned char u, char *buffer, size_t buflen)
{
    if (buflen < 2)
        *buffer = '\0';
    else if (isprint(u) && u != '\'' && u != '\"' && u != '\\' && u != '\?')
        sprintf(buffer, "%c", u);
    else if (buflen < 3)
        *buffer = '\0';
    else
    {
        switch (u)
        {
        case '\a': strcpy(buffer, "\\a"); break;
        case '\b': strcpy(buffer, "\\b"); break;
        case '\f': strcpy(buffer, "\\f"); break;
        case '\n': strcpy(buffer, "\\n"); break;
        case '\r': strcpy(buffer, "\\r"); break;
        case '\t': strcpy(buffer, "\\t"); break;
        case '\v': strcpy(buffer, "\\v"); break;
        case '\\': strcpy(buffer, "\\\\"); break;
        case '\'': strcpy(buffer, "\\'"); break;
        case '\"': strcpy(buffer, "\\\""); break;
        case '\?': strcpy(buffer, "\\\?"); break;
        default:
            /* Octal escape needs 4 characters plus the terminator. */
            if (buflen < 5)
                *buffer = '\0';
            else
                sprintf(buffer, "\\%03o", u);
            break;
        }
    }
}
And this is the code to handle a null-terminated string (using the function above):
void str_cstrlit(const char *str, char *buffer, size_t buflen)
{
    unsigned char u;
    size_t len;
    while ((u = (unsigned char)*str++) != '\0')
    {
        chr_cstrlit(u, buffer, buflen);
        if ((len = strlen(buffer)) == 0)
            return;
        buffer += len;
        buflen -= len;
    }
    *buffer = '\0';
}
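For C++ callers, the same escaping rules can be sketched on top of std::string, so no buffer length bookkeeping is needed. This is only a minimal sketch, not part of the original answer: the name escape_cstrlit is hypothetical, and like the C version it relies on isprint, so bytes in 0x80..0xFF are handled according to the current locale.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original answer): escape a whole string
// into the body of a C string literal, growing the result as needed.
std::string escape_cstrlit(const std::string& in)
{
    std::string out;
    for (unsigned char u : in)
    {
        switch (u)
        {
        case '\a': out += "\\a"; break;
        case '\b': out += "\\b"; break;
        case '\f': out += "\\f"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        case '\v': out += "\\v"; break;
        case '\\': out += "\\\\"; break;
        case '\'': out += "\\'"; break;
        case '\"': out += "\\\""; break;
        case '\?': out += "\\?"; break;
        default:
            if (std::isprint(u))
            {
                out += static_cast<char>(u);  // printable: copy unchanged
            }
            else
            {
                // Everything else becomes a three-digit octal escape.
                char oct[5];
                std::snprintf(oct, sizeof oct, "\\%03o", static_cast<unsigned>(u));
                out += oct;
            }
            break;
        }
    }
    return out;
}
```

Because the loop runs over the string's size rather than stopping at a terminator, an embedded zero byte is emitted as \000 instead of ending the output.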
How to un-escape a backslash-escaped string?
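In Python 2, the string_escape codec does this (Python 3 removed that codec; bytes.decode('unicode_escape') is the closest equivalent):
>>> print '"Hello,\\nworld!"'.decode('string_escape')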
"Hello,
world!"
(C#) unescape string
You aren't using a verbatim literal when adding the extra whacks (\). That means the compiler processes the escape sequence in "\\" and leaves you with a single whack.
Alter your line:
string fFileName = @txtSelectedFolder.Text + "\\" + file.Name;
to
string fFileName = txtSelectedFolder.Text + @"\" + file.Name;
You don't need the @ verbatim-literal symbol in front of your variable.
Alternatively, you can use
string fFileName = Path.Combine(txtSelectedFolder.Text, file.Name);
to properly join the file name onto the selected folder's path.
Convert string with explicit escape sequence into relative character
I think you must write such a function yourself, since escape sequences are a compile-time feature: when you write "\n", the compiler replaces the \n sequence with the newline character. The resulting string has length 1 (excluding the terminating zero character).
In your case the string "\\n" has length 2 (again excluding the terminating zero) and contains \ and n.
You need to scan your string and, when you encounter \, check the following char. If it is one of the legal escapes, replace both with the corresponding character; otherwise skip them or leave them both as is.
Example (http://ideone.com/BvcDE):
#include <string>
using std::string;

string unescape(const string& s)
{
    string res;
    string::const_iterator it = s.begin();
    while (it != s.end())
    {
        char c = *it++;
        if (c == '\\' && it != s.end())
        {
            switch (*it++) {
            case '\\': c = '\\'; break;
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            // all other escapes
            default:
                // invalid escape sequence - skip it. alternatively you can copy it as is, throw an exception...
                continue;
            }
        }
        res += c;
    }
    return res;
}
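The "all other escapes" comment above can be filled in along these lines. This is a hedged sketch rather than the original author's code: unescape_full and hex_digit_value are hypothetical names, unknown escapes are copied unchanged (one of the alternatives the comment mentions), and numeric escapes keep only their low 8 bits.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: numeric value of one hex digit (caller checks isxdigit).
static unsigned hex_digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return static_cast<unsigned>(ch - '0');
    return static_cast<unsigned>(std::tolower(static_cast<unsigned char>(ch)) - 'a') + 10u;
}

// Hypothetical extension of unescape(): also decodes \xHH and octal escapes.
// Assumption: unknown escapes are copied unchanged instead of being dropped.
std::string unescape_full(const std::string& s)
{
    std::string res;
    std::size_t i = 0;
    while (i < s.size())
    {
        char c = s[i++];
        if (c != '\\' || i == s.size())
        {
            res += c;  // ordinary character, or a lone trailing backslash
            continue;
        }
        char e = s[i++];
        switch (e)
        {
        case 'a': res += '\a'; break;
        case 'b': res += '\b'; break;
        case 'f': res += '\f'; break;
        case 'n': res += '\n'; break;
        case 'r': res += '\r'; break;
        case 't': res += '\t'; break;
        case 'v': res += '\v'; break;
        case '\\': case '\'': case '"': case '?':
            res += e;
            break;
        case 'x':
        {
            unsigned value = 0;
            std::size_t start = i;
            while (i < s.size() && std::isxdigit(static_cast<unsigned char>(s[i])))
                value = value * 16 + hex_digit_value(s[i++]);
            if (i == start)
                res += "\\x";  // no digits followed: keep the text as is
            else
                res += static_cast<char>(value & 0xFF);  // keep the low 8 bits
            break;
        }
        default:
            if (e >= '0' && e <= '7')
            {
                // Octal escape: the first digit plus at most two more.
                unsigned value = static_cast<unsigned>(e - '0');
                for (int n = 0; n < 2 && i < s.size() && s[i] >= '0' && s[i] <= '7'; ++n)
                    value = value * 8 + static_cast<unsigned>(s[i++] - '0');
                res += static_cast<char>(value & 0xFF);
            }
            else
            {
                res += '\\';  // unknown escape: copy both characters
                res += e;
            }
            break;
        }
    }
    return res;
}
```

With this sketch, the six-character input a\x41\n decodes to the three characters a, A and a newline.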
Unescape escaped string?
Here's an elegant solution with a switch statement, the Regex.Replace Method and a custom MatchEvaluator:
using System;
using System.Text.RegularExpressions;

var input = @"This is indented:\r\n\tHello World";
var output = Regex.Replace(input, @"\\[rnt]", m =>
{
    switch (m.Value)
    {
        case @"\r": return "\r";
        case @"\n": return "\n";
        case @"\t": return "\t";
        default: return m.Value;
    }
});
Console.WriteLine(output);
Output:
This is indented:
	Hello World
Un-Escape String received via Post Data
Well, there is no formal definition of the right terminology, but this kind of process is generally described as "unescaping" or "parsing" rather than escaping. You want to parse the application/x-www-form-urlencoded string.
And the answer is rather boring: you just do it. That's all. application/x-www-form-urlencoded only does two things: it replaces spaces with "+" signs, and it replaces most other punctuation (including the real "+" sign itself) with %xx, where xx is the octet in hexadecimal.
So you just roll up your sleeves and do it. Scan the string, replace each + character with a space, and replace each occurrence of %xx with the single character whose value is that hexadecimal octet. There's nothing particularly mysterious about the process. It is exactly what it appears to be.
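The scan described above might look like the following in C++. It is a minimal sketch under stated assumptions: form_urldecode and hex_value are hypothetical names, and a % that is not followed by two hex digits is copied through unchanged, which is a choice made here rather than something the answer specifies.

```cpp
#include <string>

// Hypothetical helper: value of a hex digit, or -1 if ch is not one.
static int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Decode one application/x-www-form-urlencoded name or value.
// Assumption: a malformed %xx sequence is copied through unchanged.
std::string form_urldecode(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        char c = in[i];
        if (c == '+')
        {
            out += ' ';  // '+' stands for a space
        }
        else if (c == '%' && i + 2 < in.size()
                 && hex_value(in[i + 1]) >= 0 && hex_value(in[i + 2]) >= 0)
        {
            // Two hex digits form one octet.
            int octet = hex_value(in[i + 1]) * 16 + hex_value(in[i + 2]);
            out += static_cast<char>(octet);
            i += 2;  // skip the two digits just consumed
        }
        else
        {
            out += c;  // everything else is literal
        }
    }
    return out;
}
```

For example, form_urldecode("a+b%20c%2B") returns "a b c+". Splitting the POST body on & and = has to happen before decoding, since an encoded value may itself contain those characters.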