Python Unicode Encode Error

Most likely the XML parsed fine, and the failure happens when you try to print its contents, because the text contains non-ASCII Unicode characters. Try encoding your Unicode string as ASCII first:

unicodeData.encode('ascii', 'ignore')

The 'ignore' argument tells it to skip any characters that cannot be encoded. From the Python docs:

>>> # Python 2: u = unichr(40960) + u'abcd' + unichr(1972)
>>> u = chr(40960) + 'abcd' + chr(1972)
>>> u.encode('utf-8')
b'\xea\x80\x80abcd\xde\xb4'
>>> u.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in position 0: ordinal not in range(128)
>>> u.encode('ascii', 'ignore')
b'abcd'
>>> u.encode('ascii', 'replace')
b'?abcd?'
>>> u.encode('ascii', 'xmlcharrefreplace')
b'&#40960;abcd&#1972;'

You might want to read this article: http://www.joelonsoftware.com/articles/Unicode.html, which I found very useful as a basic tutorial on what's going on. After reading it, you'll stop feeling like you're just guessing which commands to use (or at least that is what happened to me).

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)

Read the Python Unicode HOWTO. This error is the very first example.

Do not use str() to convert from unicode to encoded text / bytes.

Instead, use .encode() to encode the string:

p.agent_info = u' '.join((agent_contact, agent_telno)).encode('utf-8').strip()

or work entirely in unicode.
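
As a rough illustration of working entirely in Unicode, here is a minimal Python 3 sketch. The variable names come from the line above, but the sample values are hypothetical, since the original data is not shown. The idea is to keep text as str while processing and encode only at the point where bytes are actually required:

```python
# Hypothetical sample values; the original question's data is not shown.
agent_contact = "Jane Doe\u00a0"   # ends with a non-breaking space (U+00A0)
agent_telno = "555 0100"

# Keep everything as text (str) while processing.
agent_info = " ".join((agent_contact, agent_telno)).strip()

# Encode only at the boundary, when bytes are actually needed
# (for example, before writing to a binary file or a socket).
agent_info_bytes = agent_info.encode("utf-8")
```

In Python 3 most APIs accept str directly, so the final encode step is often unnecessary.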

How to fix UnicodeEncodeError?

Use the csv module to manage CSV files, and use utf-8-sig for Excel to recognize UTF-8 properly. Make sure to use newline='' per the csv documentation when opening the file as well.

Example:

import csv

filename = 'AM4.csv'
with open(filename, 'w', newline='', encoding='utf-8-sig') as f:
    w = csv.writer(f)
    w.writerow(['Motherboard_Name', 'Price'])
    name = 'some name'
    price = '\u20b95,99'
    w.writerow([name, price.replace(',', '')])
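
To see what utf-8-sig changes, the following small sketch compares it with plain utf-8 on a made-up sample string. As far as I know, the only difference is a three-byte byte order mark (BOM) at the start of the output, which Excel uses as a hint that the file is UTF-8:

```python
import codecs

text = "Price: \u20b9599"   # hypothetical sample value

plain = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")

# utf-8-sig prepends the UTF-8 BOM (EF BB BF) and is otherwise identical.
same_apart_from_bom = with_bom == codecs.BOM_UTF8 + plain

# Decoding with utf-8-sig strips the BOM again; decoding with plain
# utf-8 leaves it in the text as the character U+FEFF.
round_trip = with_bom.decode("utf-8-sig")
bom_kept = with_bom.decode("utf-8")
```

When reading such a file back in Python, opening it with encoding='utf-8-sig' avoids a stray U+FEFF at the start of the first field.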

UnicodeEncodeError in Python

pandas is tripping up on Unicode data, presumably while generating the CSV output file.

One approach, if you don't really need to process Unicode data, is to simply make conversions on your data to get everything ASCII.
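
A minimal sketch of that first approach, assuming a lossy conversion is acceptable. The helper name to_ascii is hypothetical. It decomposes accented characters with unicodedata.normalize and then drops whatever still has no ASCII equivalent:

```python
import unicodedata


def to_ascii(text):
    """Return a best-effort ASCII version of text (lossy)."""
    # NFKD splits characters such as 'é' into a base letter plus a
    # combining accent, so the base letter survives the encode step.
    decomposed = unicodedata.normalize("NFKD", text)
    # 'ignore' silently drops anything that is still not ASCII.
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Characters with no decomposition, such as currency symbols, are removed entirely, so this is only suitable when that loss does not matter.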

Another approach is to make a pass on your data prior to generating the CSV output file to get the UTF-8 encoding of any non-ASCII characters. (You may need to do this at the cell level of your spreadsheet data.)

I'm assuming Python 3 here...

>>> s = "one, two, three, \u2026"
>>> print(s)
one, two, three, …
>>> escaped = str(s.encode("utf-8"))[2:-1]  # strip the b'...' wrapper from the repr
>>> escaped
'one, two, three, \\xe2\\x80\\xa6'
>>> print(escaped)
one, two, three, \xe2\x80\xa6

See also: help() on codecs module.

UnicodeEncodeError when receiving emoji unicode in JSON

The problem is only in the print statement, and it is caused by your Windows console using the cp1252 encoding. That means the earlier Spotify call did its job correctly and returned Unicode characters that are absent from code page 1252.

A quick fix is to control the conversion before printing by encoding with errors='replace' and decoding back:

print(str(playlist_data).encode('cp1252', errors='replace').decode('cp1252'))

Any character that cp1252 cannot represent will appear as a ?.

This can be used for any other encoding by replacing 'cp1252' with the appropriate encoding name.
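
The same round trip can be wrapped in a small helper. This is only a sketch: the function name printable is hypothetical, and falling back to the encoding of sys.stdout (or to UTF-8 when that is unavailable) is an assumption about what the caller wants:

```python
import sys


def printable(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by '?'."""
    # Default to the encoding of the current standard output, if known.
    encoding = encoding or sys.stdout.encoding or "utf-8"
    # Encode with replacement, then decode back to get a str again.
    return text.encode(encoding, errors="replace").decode(encoding)
```

Usage would then look like print(printable(str(playlist_data), 'cp1252')).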


