How to Print Unicode Character in Python

Print unicode character in Python 3

It seems that you are running this from the Windows command line.

chcp 65001
set PYTHONIOENCODING=utf-8

Try running the commands above before starting python3. They switch the console code page and Python's I/O encoding to UTF-8, which can represent your data.
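
If changing the console settings is not an option, the encoding can also be inspected or adjusted from inside Python. The following is a minimal sketch, assuming Python 3.7 or later (where the standard text streams offer a reconfigure method); the helper names console_encoding, printable and force_utf8_stdout are hypothetical and only serve the illustration.

```python
import sys

def console_encoding():
    """Return the encoding Python uses for stdout (None if the stream has none)."""
    return getattr(sys.stdout, "encoding", None)

def printable(text, encoding):
    """Replace characters the target encoding cannot represent with backslash escapes."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

def force_utf8_stdout():
    """Switch stdout to UTF-8 when the stream supports it (Python 3.7+)."""
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")

print(console_encoding())
print(printable("\u0103", "ascii"))  # prints an escape instead of raising UnicodeEncodeError
```

Calling force_utf8_stdout() before printing is roughly the in-process counterpart of setting PYTHONIOENCODING; the console font still has to contain the glyphs.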

How does one print a Unicode character code in Python?

To print the raw Unicode escape, one only needs to specify the correct encoding (Python 2 syntax):

>>> s = u'\u0103'
>>> print s.encode('raw_unicode_escape')
\u0103
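
The session above uses the Python 2 print statement. In Python 3, encode returns bytes, so the result has to be decoded back to str before printing. This is a rough sketch only: it swaps in the 'unicode_escape' codec, because 'raw_unicode_escape' leaves Latin-1 characters such as ä as raw bytes rather than escapes. The function name to_escapes is hypothetical.

```python
def to_escapes(text):
    """Return text with every non-ASCII character written as a backslash escape."""
    # encode() yields bytes in Python 3; the escaped form is pure ASCII,
    # so decoding it as ASCII gives a printable str
    return text.encode("unicode_escape").decode("ascii")

print(to_escapes("\u0103"))
```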

Is there a way to print certain unicode characters in python terminal from Windows?

Maybe the escape sequences in your string literals are wrong. \u takes exactly four hex digits, so a five-digit code point needs the eight-digit \U form:

import unicodedata   # access to the Unicode Character Database

def check_unicode(s):
    print(len(s), s)
    for char in s:
        print(char, '{:04x}'.format(ord(char)),
              unicodedata.category(char),
              unicodedata.name(char, '(unknown)'))

Output:

check_unicode( u"\u2b1c\u1f7e8\u1f7e9") # original string literals
5 ⬜὾8὾9
⬜ 2b1c So WHITE LARGE SQUARE
὾ 1f7e Cn (unknown)
8 0038 Nd DIGIT EIGHT
὾ 1f7e Cn (unknown)
9 0039 Nd DIGIT NINE
check_unicode( u"\u2b1c\U0001f7e8\U0001f7e9") # adjusted string literals
3 ⬜🟨🟩
⬜ 2b1c So WHITE LARGE SQUARE
🟨 1f7e8 So LARGE YELLOW SQUARE
🟩 1f7e9 So LARGE GREEN SQUARE

Edit. Run in Windows Terminal using default Cascadia Code font…
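
To avoid picking the wrong escape form by hand, the choice between \u and \U can be derived from the code point itself. A small sketch, assuming Python 3; escape_literal is a hypothetical helper name, not a standard function.

```python
def escape_literal(code_point):
    """Return the Python escape sequence that denotes code_point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0xFFFF:
        return "\\u{:04x}".format(code_point)   # four hex digits
    return "\\U{:08x}".format(code_point)       # eight hex digits

for cp in (0x2B1C, 0x1F7E8, 0x1F7E9):
    print(escape_literal(cp))
```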

How to print unicode-character-codes-stored-as-strings as human-readable text

Your string is the Unicode code point written in hexadecimal, so the character can be rendered by converting the string to an integer with int(..., 16) and printing the result of calling chr on that integer.

>>> print(chr(int('3077', 16)))
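
If the stored strings sometimes carry a "U+" prefix, it can be stripped before the conversion. A minimal sketch under that assumption; char_from_hex is a hypothetical name.

```python
def char_from_hex(code):
    """Convert a hex code point such as '3077' or 'U+3077' to its character."""
    code = code.strip()
    if code.upper().startswith("U+"):
        code = code[2:]            # drop the "U+" prefix
    return chr(int(code, 16))      # base-16 string -> int -> character

word = "".join(char_from_hex(c) for c in ["48", "69"])
print(word)
```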

How to print Unicode like “u{variable}” in Python 2.7?

This is probably not a great way, but it's a start:

>>> import struct
>>> x = '00e4'
>>> print unicode(struct.pack("!I", int(x, 16)), 'utf_32_be')
ä

First, we get the integer represented by the hexadecimal string x. We pack that into a byte string, which we can then decode using the utf_32_be encoding.

Since you are doing this a lot, you can precompile the struct:

int2bytes = struct.Struct("!I").pack
with open("someFileWithAListOfUnicodeCodePoints") as fh:
    for code_point in fh:
        print unicode(int2bytes(int(code_point, 16)), 'utf_32_be')

If you think it's clearer, you can also use the decode method instead of the unicode type directly:

>>> print int2bytes(int('00e4', 16)).decode('utf_32_be')
ä
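
For comparison, in Python 3 every str is already Unicode, so the same loop can probably be written with chr alone. This is only a sketch: chars_from_lines is a hypothetical name, and io.StringIO stands in for the real file of code points.

```python
import io

def chars_from_lines(lines):
    """Yield one character per non-empty line of hexadecimal code points."""
    for line in lines:
        line = line.strip()
        if not line:
            continue                 # skip blank lines
        yield chr(int(line, 16))     # hex string -> int -> character

# made-up sample data in place of a real file
fake_file = io.StringIO("00e4\n3077\n\n1f389\n")
result = "".join(chars_from_lines(fake_file))
print(len(result))
```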

Python 3 added a to_bytes method to the int class that lets you bypass the struct module:

>>> str(int('00e4', 16).to_bytes(4, 'big'), 'utf_32_be')
'ä'

Printing unicode number of chars in a string (Python)

In a comment you said '\u06FF is what I'm trying to print'. This can also be done with Python's repr function, although you seem pretty happy with hex(ord(c)). It may still be useful for someone looking for an ASCII representation of a Unicode character.

example_string = u'\u063a\u064a\u0646\u064a'

for c in example_string:
    print repr(c), c

gives output

u'\u063a' غ
u'\u064a' ي
u'\u0646' ن
u'\u064a' ي

If you want to strip out the Python unicode literal part, you can quite simply do

for c in example_string:
    print repr(c)[2:-1], c

to get the output

\u063a غ
\u064a ي
\u0646 ن
\u064a ي
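
The loops above rely on Python 2, where repr escapes non-ASCII characters and adds a u prefix. In Python 3, repr leaves such characters as they are, and the built-in ascii function gives the escaped form instead. A short sketch under that assumption; escaped_chars is a hypothetical name.

```python
def escaped_chars(text):
    """Pair each character with its ASCII-only escape, without the quotes."""
    # ascii() works like repr() but escapes non-ASCII characters;
    # [1:-1] strips the surrounding quote characters
    return [(ascii(c)[1:-1], c) for c in text]

example_string = "\u063a\u064a\u0646\u064a"
pairs = escaped_chars(example_string)
print(len(pairs))
```

Each pair can then be printed with print(escape, char) to reproduce the two-column output.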

Python Jupyter Notebook: How to print unicode characters from wikipedia hex value (like U+1F0A1)?

There is another type of escape code (capital U) that requires eight digits:

>>> print('\U0001F0A1')
🂡

You can also print by converting a number:

>>> chr(0x1f0a1)
'🂡'
>>> print(chr(0x1f0a1))
🂡

So you can programmatically generate a 52-card deck (rank 12, the Knight, is skipped) as:

>>> suit = 0x1f0a0,0x1f0b0,0x1f0c0,0x1f0d0
>>> rank = 1,2,3,4,5,6,7,8,9,10,11,13,14
>>> for s in suit:
...     for r in rank:
...         print(chr(s+r),end='')
...     print()
...
🂡🂢🂣🂤🂥🂦🂧🂨🂩🂪🂫🂭🂮
🂱🂲🂳🂴🂵🂶🂷🂸🂹🂺🂻🂽🂾
🃁🃂🃃🃄🃅🃆🃇🃈🃉🃊🃋🃍🃎
🃑🃒🃓🃔🃕🃖🃗🃘🃙🃚🃛🃝🃞
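
The same idea can be wrapped in a function that returns the cards instead of printing them, which makes the count easy to check. A sketch only; SUITS, RANKS and build_deck are hypothetical names, and the suit order follows the Unicode Playing Cards block.

```python
SUITS = (0x1F0A0, 0x1F0B0, 0x1F0C0, 0x1F0D0)       # spades, hearts, diamonds, clubs
RANKS = tuple(r for r in range(1, 15) if r != 12)  # offset 12 is the Knight, skipped here

def build_deck():
    """Return four rows (one per suit) of thirteen card characters each."""
    return [[chr(suit + rank) for rank in RANKS] for suit in SUITS]

deck = build_deck()
print(sum(len(row) for row in deck))
```

Printing "".join(row) for each row reproduces the four-line output of the interactive session.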

How do I print unicode with 5 characters

\u must be a 4-digit code. \U uses 8-digit codes:

>>> print('\U0001F389')
🎉

You can also use chr:

>>> print(chr(0x1f389))
🎉

Or look up and use by name:

>>> import unicodedata as ud
>>> ud.name(chr(0x1f389))
'PARTY POPPER'
>>> print('\N{PARTY POPPER}')
🎉
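
The \N{...} escape only works in string literals. When the name is held in a variable, unicodedata.lookup does the same job at run time. A minimal sketch; char_by_name is a hypothetical wrapper that returns None instead of raising KeyError.

```python
import unicodedata

def char_by_name(name):
    """Return the character with the given Unicode name, or None if unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:               # lookup raises KeyError for unknown names
        return None

popper = char_by_name("PARTY POPPER")
print(hex(ord(popper)))
```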

Printing all unicode characters in Python

You're trying to format a Unicode character into a byte string. You can remove the error by using a Unicode string instead:

print u"{}: {}".format(code,eval(expression))
      ^

The other answers do a better job of simplifying the original problem, however; you're definitely doing things the hard way.
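
One simpler route, sketched here for Python 3 and not taken from the original question, is to skip the escape strings and eval entirely and call chr on each code point in a range. The name named_characters is hypothetical.

```python
import unicodedata

def named_characters(start, stop):
    """Yield (code point, character, name) for named characters in [start, stop)."""
    for code in range(start, stop):
        char = chr(code)
        name = unicodedata.name(char, None)
        if name is not None:       # skip code points without a name (unassigned, controls, surrogates)
            yield code, char, name

for code, char, name in named_characters(0x41, 0x44):
    print("{:04x}: {}".format(code, name))
```

Passing range limits such as 0 and 0x110000 would walk the whole code space, at the cost of a long listing.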


