How to Escape LaTeX Code Received Through User Input

How can I escape LaTeX code received through user input?

Python’s raw strings are just a way to tell the Python interpreter that it should treat backslashes in a string literal as literal backslashes rather than as the start of an escape sequence. If you read strings entered by the user, they are already past the point where they could have been raw. Also, user input is most likely read in literally, i.e. “raw”.

This means the interpreting happens somewhere else. But if you know that it happens, why not escape the backslashes for whatever is interpreting it?

s = s.replace("\\", "\\\\")

(Note that you can't do r"\" as “a raw string cannot end in a single backslash”, but I could have used r"\\" as well for the second argument.)

If that doesn’t work, your user input is for some arcane reason interpreting the backslashes, so you’ll need a way to tell it to stop that.
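
As a minimal sketch of the point above (the helper name double_backslashes is made up for illustration), text that arrives at run time already contains literal backslashes, and doubling them only matters when some later consumer interprets them:

```python
def double_backslashes(s: str) -> str:
    """Double every backslash for a consumer that treats a backslash as an escape."""
    return s.replace("\\", "\\\\")


# Stand-in for input(): text read at run time is never parsed as a string
# literal, so it holds exactly the characters that were typed.
user_text = r"\alpha + \beta"
escaped = double_backslashes(user_text)

print(user_text)  # \alpha + \beta
print(escaped)    # \\alpha + \\beta
```

Whether the doubling is needed at all depends on what consumes the string afterwards; if nothing interprets backslashes, the input can be passed through unchanged.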

How to escape/strip special characters in the LaTeX document?

The only way (AFAIK) to perform harmful operations using LaTeX is to call external commands through \write18. This only works if you run LaTeX with the --shell-escape or --enable-write18 argument (depending on your distribution).

So as long as you do not run it with one of these arguments you should be safe without the need to filter out any parts.

Besides that, one is still able to write other files using the \newwrite, \openout and \write commands. Letting the user create and (over)write files may be unwanted, so you could filter out occurrences of these commands. But keeping blacklists of certain commands is prone to fail, since someone with bad intentions can easily hide the actual command by obfuscating the input document.

Edit: Running the LaTeX command using a limited account (i.e. one that cannot write to directories unrelated to LaTeX or the project) in combination with disabling \write18 might be easier and more secure than keeping a blacklist of 'dangerous' commands.
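
A rough sketch of what such a restricted run could look like from Python, assuming a TeX Live-style pdflatex on the PATH. The function names and the timeout value are illustrative assumptions, and running the process under a limited account has to be arranged outside this script:

```python
import subprocess


def build_pdflatex_command(tex_file: str, out_dir: str) -> list[str]:
    """Build a pdflatex command line with \\write18 explicitly disabled."""
    return [
        "pdflatex",
        "-no-shell-escape",            # refuse \write18 / shell escape
        "-interaction=nonstopmode",    # never stop to prompt for input
        "-halt-on-error",              # give up at the first error
        f"-output-directory={out_dir}",
        tex_file,
    ]


def compile_untrusted(tex_file: str, out_dir: str, timeout_s: int = 30):
    """Run pdflatex on an untrusted document with a hard time limit."""
    # The timeout guards against documents that loop forever.
    return subprocess.run(
        build_pdflatex_command(tex_file, out_dir),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,
    )
```

This only covers the command line; restricting which directories the process may write to is still the job of the operating system account it runs under.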

Unicode escape not working with user input

You'll want to unicode_escape the string itself:

import codecs

input_int = int(input("Please enter the number of a unicode character: "))
# The `r` prefix makes this a raw string: Python does not try to interpret
# the incomplete `\u` escape in the literal, which is what causes the
# `SyntaxError` you're seeing. The backslash stays a literal backslash.
# Note: the digits are later read as hexadecimal, and `\u` needs exactly four.
escaped_str = r"\u{}".format(input_int)  # or `rf'\u{input_int}'` py36+
print(codecs.decode(escaped_str, 'unicode-escape'))

A sample session:

>>> input_int = int(input("Please enter the number of a unicode character: "))
Please enter the number of a unicode character: 2603
>>> escaped_str = r"\u{}".format(input_int) # or `rf'\u{input_int}'` py36+
>>> import codecs
>>> print(codecs.decode(escaped_str, 'unicode-escape'))
☃

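
If the goal is simply to turn a hexadecimal code point into a character, a shorter route (a swapped-in alternative, not the approach shown above) is chr() on the parsed number. The function name char_from_hex is an assumption for illustration:

```python
def char_from_hex(code: str) -> str:
    """Return the character whose code point is given as hexadecimal digits."""
    return chr(int(code, 16))


print(char_from_hex("2603"))  # ☃
print(char_from_hex("41"))    # A
```

Unlike the `\u` escape, this does not require exactly four digits, so short inputs such as 41 work as well.
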
Can't escape control character \r when extracting file paths

Use a raw string:

line = r"C:\recipe\1,C:\recipe\2,C:\recipe\3,"
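
To illustrate the difference, here is a small sketch comparing the same literal with and without the r prefix; the variable names plain, raw and paths are chosen for this example only:

```python
# Without the prefix, "\r" is a carriage return and "\1" is the character chr(1).
plain = "C:\recipe\1"
# With the prefix, every backslash is kept as a literal character.
raw = r"C:\recipe\1"

print("\r" in plain)  # True
print("\r" in raw)    # False

line = r"C:\recipe\1,C:\recipe\2,C:\recipe\3,"
# Split on commas and drop the empty entry after the trailing comma.
paths = [p for p in line.split(",") if p]
print(paths)  # ['C:\\recipe\\1', 'C:\\recipe\\2', 'C:\\recipe\\3']
```

The doubled backslashes in the last output come from the list's repr(); each path contains single backslashes.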

How can I save shell output to a variable in LaTeX?

TeX has support for file IO, and you can take advantage of this. To create a new input filehandle, you execute \newread\readFH; at this point, \readFH is a number representing a channel on which you can read or write (you've already seen one of these, the special channel 18). To open the file, you run \openin\readFH=filename.ext; now, reading from channel \readFH will read lines from filename.ext. To actually read from the file, you run \read\readFH to \nextline; this reads one line from \readFH and puts it in \nextline. (With one caveat—see below.) Closing the file is then done with \closein\readFH.

Note that this treats newlines as spaces; if you have a file containing e.g.

foo
bar

and read a line into \nextline, it will be as though you wrote \def\nextline{foo }. To avoid this, you set \endlinechar to -1.

Overall, then, your example would look like this:

\newread\myinput
% We use '\jobname.temp' to create a uniquely-named temporary file
\immediate\write18{some command > '\jobname.temp'}
\openin\myinput=\jobname.temp
% The group localizes the change to \endlinechar
\bgroup
\endlinechar=-1
\read\myinput to \localline
% Since everything in the group is local, we have to explicitly make the
% assignment global
\global\let\myresult\localline
\egroup
\closein\myinput
% Clean up after ourselves
\immediate\write18{rm -f -- '\jobname.temp'}
\dosomething{\myresult}

You could probably abstract this into a macro, but precisely what parts ought to be parametrized probably depends on your specific use case.

The one caveat I mentioned: \read operates on balanced TeX text. This means that it will read more than one line if it has to in order to match braces, and it will finish all those lines. Thus, a single read from the file

Hello { all
you } people.

will result in the string Hello { all you } people. 

Also, just for future reference: creating filehandles to write to is done with \newwrite, you open them with \immediate\openout\writeFH=filename.ext, you write to them with \immediate\write\writeFH{some text}, and you close them with \immediate\closeout\writeFH. Note that some text is actual TeX; thus, macros are expanded and printing out e.g. backslashes or unbalanced curly brackets is non-trivial. (For backslashes, there's the LaTeX macro \@backslashchar; see the TeX Stack Exchange question "How to make a real backslash (escape) character" for more general information. The short version is "play with catcodes.")

Furthermore, note the use of \immediate for the writing commands but not the reading commands; if you don't use \immediate with the writing commands, they'll be delayed until the output routine runs (that is, when TeX decides that the current page is finished and can be shipped out to the DVI/PDF). According to chapter 21 of The TeXbook, "The reason for this delay is that \write is often used to make an index or table of contents, and the exact page on which a particular item will appear is generally unknown when the \write instruction occurs in mid-paragraph." (Pg. 227.) For \read and friends, this clearly isn't a concern, hence the difference in behavior.

How do I print a backslash-escaped string to output '\n' in Python?

Use repr() to print a value as Python code:

print(repr('\n'))

...will emit:

'\n'

If you want to strip leading and trailing characters, then:

print(repr('\n')[1:-1])

...will emit only

\n

...but this is not futureproof (some strings may be emitted with different quoting, if not today, then in future implementations; including the literal quotes in output is thus the safe option).
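
If the quotes are not wanted at all, one alternative that does not depend on how repr() chooses its quoting is the unicode_escape codec. This is a sketch of a different technique, and note that it also escapes non-ASCII characters:

```python
text = "line one\nline two\t(tabbed)"

# Encode to bytes with escape sequences spelled out, then back to str.
escaped = text.encode("unicode_escape").decode("ascii")

print(escaped)  # line one\nline two\t(tabbed)
```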


Note that in a format string, you can use the !r modifier to indicate that you want repr() applied:

print('Value: {!r}'.format('\n'))

...will emit:

Value: '\n'

How does one insert a backslash or a tilde (~) into LaTeX?

TL;DR

\textbackslash produces a backslash in text mode. The math-mode $\sim$ and \texttildelow (from the textcomp package) are options for a lower tilde, while \~{} and \textasciitilde produce a raised tilde in text mode.



Long Answer:

The Comprehensive LaTeX Symbol List is your friend. The correct link seems to keep changing, but if you have a complete TeX Live installation, the command texdoc symbols-a4 will display your local copy.

\textbackslash and \textasciitilde are found in several places in the document, but the LaTeX 2e ASCII Table (Table 529 as of this writing) and the discussion that follows it are a convenient resource for all ASCII characters. In particular, the discussion notes that \~{} and \textasciitilde produce a raised tilde, whilst the math-mode $\sim$ and \texttildelow are options for a lower tilde; the latter is in the textcomp package, and looks best in fonts other than Computer Modern. If you are typesetting file names or URLs, the document recommends the url package.

Remember to delimit TeX macros from surrounding text, e.g. bar\textasciitilde{}foo.
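
When the text comes from user input, these macros can be applied programmatically. The following is a minimal sketch in Python; the name escape_latex and the exact set of characters handled are assumptions for illustration, and the trailing {} follows the delimiting advice above:

```python
# Map each LaTeX special character to a text-mode replacement.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
}
_TABLE = str.maketrans(LATEX_SPECIALS)


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters in a single pass over the input."""
    # translate() looks at each input character once, so the braces and
    # backslashes inside the replacements are not escaped a second time.
    return text.translate(_TABLE)


print(escape_latex(r"bar~foo\baz"))  # bar\textasciitilde{}foo\textbackslash{}baz
```

A chain of str.replace() calls would need careful ordering to avoid re-escaping its own output, which is why a single-pass translation table is used here.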

Handling input with back slashes in python 3

Alternatively, simply try replacing each "\\" in the input with "//".
For example, if the file at filepath contains:

the
here
moz//12//14
the/

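the code:

my_dict = {}

def reader():
    # '<filepath>' is a placeholder for the path of the file shown above
    with open('<filepath>', 'r') as inputfile:
        for line in inputfile:
            # Use each line, without surrounding whitespace, as a key
            my_dict[line.strip()] = 0
    return my_dict

print(reader())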

the result (on Python 3.7+, where dicts keep insertion order):

{'the': 0, 'here': 0, 'moz//12//14': 0, 'the/': 0}

This works because a backslash in a Python string literal has to be escaped itself; when you write:

print("hello\\bonjour\\")

it gives:

hello\bonjour\
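
Note that this escaping rule applies only to string literals in source code. As a small sketch (using io.StringIO as a stand-in for a real file, which is an assumption made to keep the example self-contained), backslashes read from a file arrive as single literal characters:

```python
import io

# The literal below needs doubled backslashes; the "file" content does not.
fake_file = io.StringIO("moz\\12\\14\n")

line = fake_file.readline().strip()

print(line)        # moz\12\14
print(repr(line))  # 'moz\\12\\14'
print(len(line))   # 9
```

The doubled backslashes in the repr() output are only how Python displays the string; the string itself holds one backslash at each position.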

