In Python, How to Escape Newline Characters When Printing a String

In Python, is it possible to escape newline characters when printing a string?

One way to stop Python from interpreting escape sequences is to use a raw string, like this:

>>> print(r"abc\ndef")
abc\ndef

or

>>> string = "abc\ndef"
>>> print(repr(string))
'abc\ndef'

The only problem with using repr() is that it wraps your string in single quotes, although that can be handy if you want the output quoted.
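If the quotes are unwanted, here is a minimal sketch of two possible workarounds: slicing the quotes off the repr() result, or using the unicode_escape codec. The variable names are only for illustration, and note that unicode_escape also escapes non-ASCII characters, which may or may not be what you want.

```python
text = "abc\ndef"

# A raw literal keeps the backslash and the "n" as two separate characters.
raw = r"abc\ndef"
print(len(text), len(raw))  # 7 8

# repr() adds surrounding quotes; slicing removes the first and last character.
no_quotes = repr(text)[1:-1]
print(no_quotes)  # abc\ndef

# The unicode_escape codec escapes the newline without adding quotes.
escaped = text.encode("unicode_escape").decode("ascii")
print(escaped)  # abc\ndef
```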

Print \n or newline characters as part of the output on terminal

Use repr

>>> string = "abcd\n"
>>> print(repr(string))
'abcd\n'

String formatting so newline character is visible in Python program

Try this:

data_string = repr(''.join(lst))

where lst is your list.
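As a quick illustration, assuming lst holds strings that end in newlines (the sample data here is made up):

```python
lst = ["first line\n", "second line\n"]  # hypothetical sample data

# Join the pieces, then ask for the escaped representation.
data_string = repr("".join(lst))
print(data_string)  # 'first line\nsecond line\n'
```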

Replace all newline characters using Python

I don't have access to your PDF file, so I processed one on my system. I also don't know whether you need to remove all newlines or just double newlines. The code below removes double newlines, which makes the output more readable.

Please let me know if this works for your current needs.

from tika import parser

filename = 'myfile.pdf'

# Parse the PDF
parsedPDF = parser.from_file(filename)

# Extract the text content from the parsed PDF
pdf = parsedPDF["content"]

# Convert double newlines into single newlines
pdf = pdf.replace('\n\n', '\n')

#####################################
# Do something with the PDF
#####################################
print(pdf)
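One thing to be aware of: a single replace('\n\n', '\n') pass only halves longer runs of newlines. If you want every run collapsed to one newline, a regular expression is one option. This is only a sketch; the sample string stands in for the text extracted from your PDF.

```python
import re

# Stand-in for extracted PDF text, with runs of 4 and 2 newlines.
sample = "Title\n\n\n\nFirst paragraph.\n\nSecond paragraph.\n"

# One pass of str.replace turns each pair into one newline, so 4 become 2.
once = sample.replace("\n\n", "\n")
print(repr(once))  # 'Title\n\nFirst paragraph.\nSecond paragraph.\n'

# The regex collapses any run of two or more newlines into exactly one.
collapsed = re.sub(r"\n{2,}", "\n", sample)
print(repr(collapsed))  # 'Title\nFirst paragraph.\nSecond paragraph.\n'
```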

Write newline character explicitly in a file

Just replace the newline character with an escaped new line character

text = "where are you going?\nI am going to the Market?"
with open("output.txt",'w', encoding="utf-8") as output:
    output.write(text.replace('\n','\\n'))
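To check that the file really contains a backslash followed by n rather than a line break, you can read it back. The sketch below writes to a temporary directory (my choice, so it does not touch your real output.txt) and then reverses the replacement. Keep in mind the round trip is ambiguous if the original text already contained a literal backslash followed by n.

```python
import os
import tempfile

text = "where are you going?\nI am going to the Market?"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.txt")
    # Write the text with the newline replaced by the two characters \ and n.
    with open(path, "w", encoding="utf-8") as output:
        output.write(text.replace("\n", "\\n"))
    # Read the file back to inspect what was stored.
    with open(path, encoding="utf-8") as saved:
        content = saved.read()

print(content)  # where are you going?\nI am going to the Market?

# Reversing the replacement restores the original string.
restored = content.replace("\\n", "\n")
print(restored == text)  # True
```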

How do I specify new lines in a string in order to write multiple lines to a file?

It depends on how correct you want to be. \n will usually do the job. If you really want to get it right, look up the newline character in the os module. (It's actually called linesep.)

Note: when writing to files opened in text mode using the Python API, do not use os.linesep. Just use \n; Python automatically translates that to the proper newline character for your platform.
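You can observe the translation yourself. This small sketch (the temporary file is just for the experiment) writes \n in text mode and then reads the raw bytes back in binary mode:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines.txt")
    # Text mode with the default newline handling: "\n" is translated on write.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\n")
    # Binary mode shows the bytes that actually ended up on disk.
    with open(path, "rb") as f:
        raw = f.read()

expected = ("first" + os.linesep + "second" + os.linesep).encode("utf-8")
print(raw == expected)  # True
```

Writing os.linesep yourself in text mode on Windows would produce \r\r\n, because the \n inside it gets translated a second time.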

What is the difference between '\' and '\n' escape sequence in python

The book is confusing you by mixing two entirely different concepts.

  • \n is an escape sequence in a string literal. Like other \single-character and \xhh or \uhhhh escape sequences these work exactly like those in C; they define a character in the string that would otherwise be difficult to spell out when writing code.

  • \ at the end of a physical line of code extends the logical line. That is, Python will see text on the next line as part of the current line, making it one long line of code. This applies anywhere in Python code.

You can trivially see the difference when you print the results of strings that use either technique:

escape_sequence = "This is a line.\nThis is another line"
logical_line_extended = "This is a logical line. \
This is still the same logical line."

print(escape_sequence)
print(logical_line_extended)

This outputs

This is a line.
This is another line
This is a logical line. This is still the same logical line.

Note that the line breaks have swapped! The \n escape sequence in the string value caused the output to be broken across two lines (the terminal, console, or whatever is displaying the printed data knows how to interpret a newline character), while the newline in the logical_line_extended string literal definition is gone. It was never part of the string value being defined; it was a newline in the source code only.
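If you want to confirm this without relying on how the terminal renders things, you can test the two strings for a newline character directly:

```python
escape_sequence = "This is a line.\nThis is another line"
logical_line_extended = "This is a logical line. \
This is still the same logical line."

# Only the string built with the \n escape sequence contains a newline.
print("\n" in escape_sequence)        # True
print("\n" in logical_line_extended)  # False

# splitlines() shows how many lines each value really holds.
print(len(escape_sequence.splitlines()))        # 2
print(len(logical_line_extended.splitlines()))  # 1
```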

Python lets you extend a line of code like this because Python defines how you delimit logical lines very differently from C. In C, you end statements with ;, and group blocks of lines with {...} curly braces. Newlines are not part of how C reads your code.

So, the following C code:

if (a) { foo = 'bar'; spam = 'ham'; }

is the same thing as

if (a) {
    foo = 'bar';
    spam = 'ham';
}

C knows where each statement starts and ends because the programmer has to use ; and {...} to delimit statements and blocks; the language doesn't care about indentation or newlines at all here. In Python, however, you explicitly use newlines and indentation to define the same structure. So Python uses whitespace instead of {, } and ;.

This means you could end up with long lines of code to hold a complex expression:

# deliberately convoluted long expression to illustrate a point
expr = 18 ** (1 / 3) / (6 * (3 + sqrt(3) * I) ** (1 / 3)) + 12 ** (1 / 3) * (3 + sqrt(3) * I) ** (1 / 3) / 12

The point of \ is to allow you to break up such a long expression across multiple physical lines by extending the current line with \ at the end:

# deliberately convoluted long expression to illustrate a point
expr = 18 ** (1 / 3) / (6 * (3 + sqrt(3) * I) ** (1 / 3)) + \
    12 ** (1 / 3) * (3 + sqrt(3) * I) ** (1 / 3) / 12

So a \ as the last character on a line tells Python to ignore the newline that's there and continue treating the following line as part of the same logical line.

Python also extends the logical line when it has seen an opening (, [ or { bracket, until the matching ), ] or } bracket is found to close the expression. This is the preferred method of extending lines. So the above expression could be broken up across multiple physical lines with:

expr = (18 ** (1 / 3) / (6 * (3 + sqrt(3) * I) ** (1 / 3)) +
        12 ** (1 / 3) * (3 + sqrt(3) * I) ** (1 / 3) / 12)
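The expressions above use sqrt and I, which are not defined in the snippets (they appear to come from a symbolic math library). For a version you can run as-is, here is a sketch with plain integers showing that both ways of joining lines produce the same value:

```python
# Explicit line joining: the backslash must be the very last character on the line.
with_backslash = 1 + 2 + \
    3 + 4

# Implicit line joining: the open parenthesis keeps the logical line going.
with_parens = (1 + 2 +
               3 + 4)

print(with_backslash, with_parens)  # 10 10
```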

You can do the same with strings:

long_string = (
    "This is a longer string that does not contain any newline "
    "*characters*, but is defined in the source code with "
    "multiple strings across multiple logical lines."
)

This uses another C string literal trick Python borrowed: multiple consecutive string literals form one long string object once parsed and compiled.
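A shorter sketch of the same idea, with made-up string contents, shows that the adjacent literals are simply glued together and that no newline sneaks in:

```python
short_string = (
    "first part, "
    "second part, "
    "third part"
)

# The three literals are merged into one string at compile time.
print(short_string)          # first part, second part, third part
print("\n" in short_string)  # False
```

Note that nothing is inserted between the pieces, so any spaces you need must be inside the literals themselves.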

See the Lexical analysis reference documentation:

2.1.5. Explicit line joining

Two or more physical lines may be joined into logical lines using backslash characters (\)[.]

[...]

2.1.6. Implicit line joining

Expressions in parentheses, square brackets or curly braces can be split over more than one physical line without using backslashes.

The same documentation lists all the permitted Python string escape sequences.

Replacing a text with \n in it, with a real \n output

If you're running this in the Python interpreter, it is the regular behavior of the interpreter to show newlines as "\n" instead of actual newlines, because it makes it easier to debug the output. If you want to get actual newlines within the interpreter, you should print the string you get.

If this is what the program is outputting (i.e., you're getting newline escape sequences from the external program), you should use the following:

OUTPUT = stdout.read()
formatted_output = OUTPUT.replace('\\n', '\n').replace('\\t', '\t')
print(formatted_output)

This will replace escaped newlines by actual newlines in the output string.
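Here is a self-contained sketch of that replacement. The raw string stands in for whatever your external program returns; it is made-up sample data, not real program output.

```python
# Hypothetical text that contains backslash-n and backslash-t as two characters each.
raw_output = r"first line\nsecond line\tindented"

# Swap each two-character escape for the real control character.
formatted_output = raw_output.replace("\\n", "\n").replace("\\t", "\t")
print(formatted_output)

# The result now holds one real newline and one real tab.
print(formatted_output.count("\n"), formatted_output.count("\t"))  # 1 1
```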


