Fastest way to convert a dict's keys & values from `unicode` to `str`?
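import collections

DATA = { u'spam': u'eggs', u'foo': frozenset([u'Gah!']), u'bar': { u'baz': 97 },
         u'list': [u'list', (True, u'Maybe'), set([u'and', u'a', u'set', 1])]}

def convert(data):
    # Strings are iterable too, so they must be handled before the Iterable case.
    if isinstance(data, basestring):
        return str(data)
    elif isinstance(data, collections.Mapping):
        # Each (key, value) pair is converted as a tuple and fed back to dict().
        return dict(map(convert, data.iteritems()))
    elif isinstance(data, collections.Iterable):
        # Rebuild the same container type from the converted items.
        return type(data)(map(convert, data))
    else:
        return data
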
print DATA
print convert(DATA)
# Prints:
# {u'list': [u'list', (True, u'Maybe'), set([u'and', u'a', u'set', 1])], u'foo': frozenset([u'Gah!']), u'bar': {u'baz': 97}, u'spam': u'eggs'}
# {'bar': {'baz': 97}, 'foo': frozenset(['Gah!']), 'list': ['list', (True, 'Maybe'), set(['and', 'a', 'set', 1])], 'spam': 'eggs'}
Assumptions:
- You've imported the collections module and can make use of the abstract base classes it provides
- You're happy to convert using the default encoding (use `data.encode('utf-8')` rather than `str(data)` if you need an explicit encoding).
If you need to support other container types, hopefully it's obvious how to follow the pattern and add cases for them.
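As a rough sketch of how the same pattern might carry over to Python 3, where `str` is already Unicode and the analogous task is decoding `bytes`: the version below is an adaptation for illustration, not the code above. The `encoding` parameter and the use of `collections.abc` are assumptions of this sketch.

```python
from collections.abc import Iterable, Mapping


def convert(data, encoding="utf-8"):
    """Recursively decode bytes to str inside nested containers (Python 3 sketch)."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    if isinstance(data, str):
        # str is itself iterable, so return it before the Iterable branch.
        return data
    if isinstance(data, Mapping):
        return {convert(k, encoding): convert(v, encoding) for k, v in data.items()}
    if isinstance(data, Iterable):
        # Rebuild the same container type; fine for list, tuple, set, frozenset.
        return type(data)(convert(item, encoding) for item in data)
    return data


DATA = {
    b"spam": b"eggs",
    b"foo": frozenset([b"Gah!"]),
    b"bar": {b"baz": 97},
    b"list": [b"list", (True, b"Maybe"), {b"and", b"a", b"set", 1}],
}
converted = convert(DATA)
```

A container whose constructor does not accept a single iterable (a namedtuple, for example) would need its own branch ahead of the generic `Iterable` case.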
Fastest way to convert a dict's keys & values from str to Unicode?
If you're using Python 2.7, you can use a dict comprehension:
unidict = {k.decode('utf8'): v.decode('utf8') for k, v in strdict.items()}
For older versions:
unidict = dict((k.decode('utf8'), v.decode('utf8')) for k, v in strdict.items())
(This assumes your strings are in UTF-8, of course.)
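To make the UTF-8 assumption concrete, here is a small sketch using made-up sample data. It writes the byte strings as `b'...'` literals so the raw bytes are explicit, and it shows that bytes in a different encoding raise `UnicodeDecodeError` unless an error handler such as `'replace'` is passed to `decode`.

```python
# Keys and values are UTF-8 encoded byte strings (sample data for illustration).
strdict = {b'caf\xc3\xa9': b'na\xc3\xafve'}
unidict = {k.decode('utf8'): v.decode('utf8') for k, v in strdict.items()}

# The same word encoded as Latin-1 is not valid UTF-8.
latin1_bytes = b'caf\xe9'
try:
    latin1_bytes.decode('utf8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# An error handler substitutes U+FFFD for the undecodable byte instead of raising.
replaced = latin1_bytes.decode('utf8', 'replace')
```

If the real encoding is known, passing it to `decode` is preferable to replacing characters, since replacement loses information.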
How to convert unicoded dict into a string
No. You will need to convert each item to a string manually and then convert the resulting dict to a string. Note that unless the Unicode data happens to be all ASCII, you could run into problems. Making that assumption, you can use a dict comprehension to make it quicker and more concise:
print({str(key): str(value) for key, value in ab.items()})
{'a': 'A', 'c': 'C', 'b': 'B'}
If you are using a version of Python prior to 2.7, which lacks dict comprehensions:
dict((str(key), str(value)) for key, value in ab.items())
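The ASCII caveat can be illustrated with a short sketch. In Python 2, calling `str()` on a Unicode string encodes it with the default codec, which is normally ASCII; the code below spells that step out with an explicit `.encode()` call so the failure is visible. The helper name `encode_items` and the sample data are hypothetical.

```python
# Sample data: the last value contains a non-ASCII character (U+00C7).
ab = {u'a': u'A', u'b': u'B', u'c': u'\xc7'}


def encode_items(mapping, encoding='ascii'):
    """Return a new dict with every key and value encoded to bytes."""
    return dict((k.encode(encoding), v.encode(encoding))
                for k, v in mapping.items())


# Encoding as ASCII fails as soon as a non-ASCII character is met.
try:
    encode_items(ab)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Choosing an encoding that can represent the data avoids the error.
utf8_items = encode_items(ab, 'utf-8')
```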
convert dictionary key and value to unicode
Just use the 'unicode' function:
d = {'firstname' : 'Foo', 'lastname' : 'Bar'}
d = {unicode(k):unicode(v) for k,v in d.items() }
Is there any way to print out a dictionary consisting of unicode strings by converting them to another encoding?
Here are two ways.
name = u'\u041d\u0435\u0433\u0440\u043e\u043d\u0438'
surname = u'\u041b\u043e\u043d\u0434\u043e\u043d\u0441\u043a\u0438\u0439'
info = {}
info[name] = surname
# One way: print each value directly
for k in info:
    print(info[k])
# Another way
print(info[name].encode("unicode-escape").decode("unicode-escape"))
Encode keys of dictionaries inside a list from unicode to ascii
First: do you really need to do this? The strings are in Unicode for a reason: you simply can't represent everything in plain ASCII that you can in Unicode. This probably won't be a problem for your dictionary keys 'uid', 'name' and 'pic_small'; but it probably won't be a problem to leave them as Unicode, either. (The 'simplejson' library does not know anything about your data, so it uses Unicode for every string - better safe than sorry.)
Anyway:
In Python, strings cannot be modified. The .encode method does not change the string; it returns a new string that is the encoded version.
What you want to do is produce a new dictionary, which replaces the keys with the encoded keys. We can do this by passing a generator of (encoded key, original value) pairs to the dict constructor.
That looks like:
dict((k.encode('ascii'), v) for (k, v) in original.items())
Similarly, we can use a list comprehension to apply this to every dictionary, and create the new list. (We can modify the list in-place, but this way is cleaner.)
response = simplejson.load(urllib.urlopen(REST_SERVER, data))
# We create the list of modified dictionaries, and re-assign 'response' to it:
response = [
    dict((k.encode('ascii'), v) for (k, v) in original.items())  # the modified version
    for original in response  # of each original dictionary.
]
return response
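For a version that runs without a server, here is a minimal sketch that swaps the REST call for made-up sample records. The function name `encode_keys` and the sample values are hypothetical; only the key names come from the question.

```python
# Stand-in for the decoded JSON response: a list of dicts with Unicode keys.
response = [
    {u'uid': 1, u'name': u'Alice', u'pic_small': u'a.png'},
    {u'uid': 2, u'name': u'Bob', u'pic_small': u'b.png'},
]


def encode_keys(records, encoding='ascii'):
    """Return a new list of dicts with encoded keys; values are left untouched."""
    return [
        dict((k.encode(encoding), v) for (k, v) in record.items())
        for record in records
    ]


encoded = encode_keys(response)
```

The original list and its dictionaries are not modified. Note that on Python 3 the encoded keys are `bytes` objects, which is rarely what you want there.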
Convert a String representation of a Dictionary to a dictionary
You can use the built-in ast.literal_eval:
>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}
This is safer than using eval. As its own docs say:
>>> help(ast.literal_eval)
Help on function literal_eval in module ast:

literal_eval(node_or_string)
    Safely evaluate an expression node or a string containing a Python
    expression. The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None.
For example:
>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
    return _convert(node_or_string)
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
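In practice you may want to wrap the call so that bad input does not surface as a traceback. The following is one possible sketch; the helper name `parse_dict_literal` and the choice to return `None` on failure are assumptions made for this example, not part of the `ast` module.

```python
import ast


def parse_dict_literal(text):
    """Parse text as a Python dict literal; return None if that is not possible."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # ValueError: the text is not a plain literal (e.g. it contains a call).
        # SyntaxError: the text is not valid Python at all.
        return None
    # literal_eval also accepts lists, numbers, etc., so check the type.
    return value if isinstance(value, dict) else None


parsed = parse_dict_literal("{'muffin' : 'lolz', 'foo' : 'kitty'}")
rejected = parse_dict_literal("shutil.rmtree('mongo')")
```

Here `parsed` holds the dictionary, while `rejected` is `None` because the string contains a function call rather than a literal.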