Read and Write CSV Files Including Unicode with Python 2.7

How to write a Unicode CSV in Python 2.7

You are passing in bytestrings that contain non-ASCII data, and they are being decoded to Unicode with the default codec (ASCII) at this line:

self.writer.writerow([unicode(s).encode("utf-8") for s in row])

unicode(bytestring) with data that cannot be decoded as ASCII fails:

>>> unicode('\xef\xbb\xbft_11651497')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)

Decode the data to Unicode before passing it to the writer:

row = [v.decode('utf8') if isinstance(v, str) else v for v in row]

This assumes that your bytestring values contain UTF-8 data. If you have a mix of encodings, decode to Unicode at the point of origin, where your program first receives the data. That is the better approach in any case, regardless of where the data came from or whether it was already encoded as UTF-8.
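One way to apply this is to decode bytes once, at the place where the program first receives them. The sketch below is a minimal illustration, not the original poster's code: the helper name to_text is hypothetical, and the choice of "utf-8-sig" (which decodes UTF-8 and also drops a leading byte order mark such as the \xef\xbb\xbf in the traceback above) is an assumption about the input. It is written to run on both Python 2.7 and Python 3.

```python
# Hypothetical helper: decode byte strings to Unicode as early as possible.
def to_text(value, encoding="utf-8-sig"):
    """Return value as Unicode text, decoding byte strings with encoding.

    "utf-8-sig" is an assumption: it decodes UTF-8 and drops a leading BOM.
    Values that are not byte strings (numbers, None, text) pass through.
    """
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value


# A row mixing a BOM-prefixed byte string, Unicode text, and a number.
row = [b"\xef\xbb\xbft_11651497", u"caf\xe9", 42]
decoded = [to_text(v) for v in row]
```

If the data is not UTF-8, pass the actual encoding explicitly rather than relying on the default.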

Python 2.7: write a Unicode dictionary to a CSV file

This would be trivial in Python 3, which uses Unicode natively:

import csv

# data is the dict to write; its keys become the header row
with open("file.csv", "w", newline='', encoding='utf8') as fd:
    dw = csv.DictWriter(fd, data.keys())
    dw.writeheader()
    dw.writerow(data)

Since you prefixed your Unicode strings with u, I assume that you are using Python 2. The csv module is great at processing CSV files, but the Python 2 version does not natively handle Unicode strings. To write a Unicode dict, you can simply encode its keys and values as UTF-8:

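import csv

# Encode every key and value to a UTF-8 bytestring before writing
utf8data = {k.encode('utf8'): v.encode('utf8') for (k, v) in data.iteritems()}
# Python 2: open in binary mode so the csv module controls line endings
with open("file.csv", "wb") as fd:
    dw = csv.DictWriter(fd, utf8data.keys())
    dw.writeheader()
    dw.writerow(utf8data)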
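The dict comprehension above assumes that every key and value is a Unicode string; a value such as an integer has no encode method and raises AttributeError. A minimal sketch of a more tolerant version follows. The names encode_value and encode_row are hypothetical, and passing non-text values through unchanged is an assumption about what you want (the csv writer converts them with str() when writing).

```python
# Hypothetical helpers; not part of the csv module.
TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def encode_value(value, encoding="utf-8"):
    """Encode Unicode text to bytes; leave every other type untouched."""
    if isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    return value


def encode_row(row, encoding="utf-8"):
    """Return a copy of a dict with its text keys and values encoded."""
    return {encode_value(k, encoding): encode_value(v, encoding)
            for k, v in row.items()}


# Sample data mixing Unicode text with an integer value.
data = {u"name": u"Zo\xeb", u"age": 30}
utf8data = encode_row(data)
```

On Python 2.7, the resulting utf8data can be handed to csv.DictWriter in the same way as in the snippet above.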

How to write UTF-8 in a CSV file

It's very simple in Python 3.x (see the csv module documentation):

import csv

with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    # writerow expects a sequence of fields, so wrap the string in a list
    writer.writerow(['my_utf8_string'])
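Note that writerow expects a sequence of fields. Given a bare string, it treats each character as a separate field. The Python 3 sketch below illustrates the difference using an in-memory io.StringIO buffer instead of a file; the sample string and variable names are assumptions chosen for illustration.

```python
import csv
import io

# Passing a bare string: each character becomes its own field.
buf = io.StringIO()
csv.writer(buf, delimiter=";").writerow("a\u00f1b")
as_string = buf.getvalue()

# Passing a one-element list: the whole string is a single field.
buf = io.StringIO()
csv.writer(buf, delimiter=";").writerow(["a\u00f1b"])
as_list = buf.getvalue()

print(repr(as_string))
print(repr(as_list))
```

The first call produces three delimited fields and the second produces one, which is usually what is intended when writing a single value per row.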

For Python 2.x, look here.


