Python & MySQL: Unicode and Encoding

I think your MySQLdb library doesn't know it's supposed to encode to utf8, and is instead encoding with its default connection charset, latin1.

When you connect() to your database, pass the charset='utf8' parameter. This should also make a manual SET NAMES or SET character_set_client unnecessary.

Encoding issue in python using mysql connector python

  1. Make sure that your table is declared with the ascii character set.

Example:

CREATE TABLE t1(
    col1 char,
    ....,
    ....,
    ....
) ENGINE=InnoDB CHARSET=ascii;

  2. In the MySQL-Python connection, specify the ascii charset and enable unicode

import MySQLdb

db = MySQLdb.connect(
    host="localhost",
    port=3306,
    user="john",
    passwd="megajonhy",
    db="jonhydb",
    use_unicode=True,
    charset='ascii'
)

I believe that use_unicode defaults to False (at least with MySQLdb on Python 2), so setting it to True should fix the issue.

Writing UTF-8 String to MySQL with Python

I found the solution to my problem: decoding the string with .decode('unicode_escape').encode('iso8859-1').decode('utf8') finally worked. Now everything is inserted as it should be. The full solution is described here: Working with unicode encoded Strings from Active Directory via python-ldap
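To see why that chain works, here is a minimal sketch in Python 3 syntax (the answer above appears to be Python 2, where str has a .decode() method). The value of raw is a made-up input; it assumes the string arrived as the UTF-8 bytes of "Jörg" written out as literal backslash escapes:

```python
# Hypothetical input: the UTF-8 bytes of "Jörg", but stored as the
# literal characters backslash, x, c, 3, ... instead of real bytes.
raw = b'J\\xc3\\xb6rg'

# 1. Interpret the backslash escapes. Each \xNN becomes the code point U+00NN.
step1 = raw.decode('unicode_escape')

# 2. iso8859-1 maps code points 0-255 to the same byte values,
#    which recovers the original UTF-8 byte sequence.
step2 = step1.encode('iso8859-1')

# 3. Decode those bytes as UTF-8 to get the intended text.
text = step2.decode('utf8')

print(text)
```

The middle step is only a way to turn code points back into bytes unchanged; it does not mean the data was ever really Latin-1.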

Python unicode encoding issue

Try:

import MySQLdb as mdb

con = mdb.connect('localhost', 'root', '', 'mydb',
                  use_unicode=True, charset='utf8')

Here is a demonstration showing that it works:

If you do not use use_unicode=True with the following setup, you get a UnicodeEncodeError:

import MySQLdb
import config  # assumed to be a local module defining HOST, USER and PASS

def setup_charset(cursor, typ='latin1'):
    # Recreate table foo with the given default charset.
    sql = 'DROP TABLE IF EXISTS foo'
    cursor.execute(sql)
    sql = '''\
        CREATE TABLE `foo` (
          `fooid` int(11) NOT NULL AUTO_INCREMENT,
          `bar` varchar(30),
          `baz` varchar(30),
          PRIMARY KEY (`fooid`)) DEFAULT CHARSET={t}
        '''.format(t=typ)
    cursor.execute(sql)

# No use_unicode/charset arguments here, so the connection uses latin1.
connection = MySQLdb.connect(
    host=config.HOST, user=config.USER,
    passwd=config.PASS, db='test')

cursor = connection.cursor()
setup_charset(cursor, typ='utf8')
sql = u'INSERT INTO foo (bar,baz) VALUES (%s,%s)'
try:
    cursor.execute(sql, [u'José Beiträge', u'∞'])
except UnicodeEncodeError as err:
    # You get this error if you don't use
    # (use_unicode=True, charset='utf8') see below.
    print(err)

This prints the error:

'latin-1' codec can't encode character u'\u221e' in position 0: ordinal not in range(256)
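The same failure can be reproduced without a database. The sketch below assumes, as the message suggests, that the driver is encoding the parameters as latin-1; can_encode is a hypothetical helper written only for this illustration:

```python
def can_encode(text, codec):
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# The accented letters exist in latin-1, so the first value is fine.
print(can_encode('José Beiträge', 'latin-1'))

# U+221E (infinity) is above code point 255, so latin-1 cannot represent it.
print(can_encode('\u221e', 'latin-1'))

# UTF-8 can represent any Unicode character.
print(can_encode('\u221e', 'utf-8'))
```

So it is the second parameter, not the accented name, that triggers the exception.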

While, if you do use use_unicode=True, you can insert unicode with no error:

connection = MySQLdb.connect(
    host=config.HOST, user=config.USER,
    passwd=config.PASS, db='test',
    use_unicode=True,
    charset='utf8')
cursor = connection.cursor()
cursor.execute(sql, ['José Beiträge', '∞'])
cursor.execute('SELECT * from foo')
for row in cursor:
    print(u'{} {}'.format(*row[1:]))

prints

José Beiträge ∞

mysql in python encoding

Add these parameters: MySQLdb.connect(..., use_unicode=1, charset="utf8").

Create a cursor:

cur = db.cursor()

and then execute like so:

risk = m['Text']
sql = """INSERT INTO posts(nmbr, msg, tel, sts) \
VALUES (%s, %s, %s, %s)"""
values = (number, risk, 'smart', 'u')
# Pass sql and values as separate arguments so the driver escapes the values.
cur.execute(sql, values)
# commit() is a method of the connection, not of the cursor.
db.commit()

Now you don't need these two lines:

msg = risk.encode('utf8')
text = db.escape_string(msg)

UnicodeEncodeError when inserting Chinese characters into mysql in python

Does that Python construct add quotes when doing the substitution? It needs to.

Did you establish utf8mb4 for the connection?

Is the table/column CHARACTER SET utf8mb4?

More Python notes

I suggest utf8mb4 instead of utf8 because Chinese has some characters that need 4 bytes.
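As a rough illustration of the byte counts involved: MySQL's utf8 charset stores at most 3 bytes per character, while utf8mb4 stores up to 4. The sample characters below are my own picks, not taken from the question:

```python
def utf8_len(ch):
    """Number of bytes needed to encode ch in UTF-8."""
    return len(ch.encode('utf-8'))

samples = [
    ('a', 'ASCII letter'),
    ('\u00e9', 'Latin letter with accent'),
    ('\u4e2d', 'common Chinese character'),
    ('\U0002070E', 'Chinese character from CJK Extension B'),
]

for ch, label in samples:
    n = utf8_len(ch)
    # Characters that need 4 bytes do not fit in MySQL's 3-byte utf8.
    needs_mb4 = n > 3
    print('U+{:04X} {}: {} byte(s), needs utf8mb4: {}'.format(
        ord(ch), label, n, needs_mb4))
```

Common Chinese characters fit in 3 bytes, which is why the problem may only show up for some rows.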


