Fastest Way to Convert a Dict's Keys & Values from 'Unicode' to 'Str'

Fastest way to convert a dict's keys & values from `unicode` to `str`?

DATA = { u'spam': u'eggs', u'foo': frozenset([u'Gah!']), u'bar': { u'baz': 97 },
         u'list': [u'list', (True, u'Maybe'), set([u'and', u'a', u'set', 1])]}

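import collections

def convert(data):
    # Handle strings first, before the generic container cases.
    if isinstance(data, basestring):
        return str(data)
    elif isinstance(data, collections.Mapping):
        # iteritems() yields (key, value) tuples; each one is converted recursively.
        return dict(map(convert, data.iteritems()))
    elif isinstance(data, collections.Iterable):
        # Rebuild the same container type from the converted items.
        return type(data)(map(convert, data))
    else:
        return data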

print DATA
print convert(DATA)
# Prints:
# {u'list': [u'list', (True, u'Maybe'), set([u'and', u'a', u'set', 1])], u'foo': frozenset([u'Gah!']), u'bar': {u'baz': 97}, u'spam': u'eggs'}
# {'bar': {'baz': 97}, 'foo': frozenset(['Gah!']), 'list': ['list', (True, 'Maybe'), set(['and', 'a', 'set', 1])], 'spam': 'eggs'}

Assumptions:

You've imported the collections module and can make use of the abstract base classes it provides.
You're happy to convert using the default encoding, which in Python 2 is normally ASCII, so str(data) raises UnicodeEncodeError on non-ASCII text (use data.encode('utf-8') rather than str(data) if you need an explicit encoding).

If you need to support other container types, hopefully it's obvious how to follow the pattern and add cases for them.
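
For readers on Python 3, where basestring, unicode and iteritems() no longer exist, the same recursive pattern might be sketched as below. This is a hypothetical adaptation, not the code from the answer above: it assumes the goal is to turn bytes into str, and the encoding parameter is an illustrative addition.

```python
from collections.abc import Iterable, Mapping


def convert(data, encoding="utf-8"):
    """Recursively decode bytes to str inside nested containers (sketch)."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    if isinstance(data, str):
        # str is iterable too, so return it before the generic Iterable case.
        return data
    if isinstance(data, Mapping):
        return {convert(k, encoding): convert(v, encoding) for k, v in data.items()}
    if isinstance(data, Iterable):
        # Works for list, tuple, set and frozenset, whose constructors accept an iterable.
        return type(data)(convert(item, encoding) for item in data)
    return data


DATA = {b"spam": b"eggs", b"foo": frozenset([b"Gah!"]), b"bar": {b"baz": 97},
        b"list": [b"list", (True, b"Maybe"), {b"and", b"a", b"set", 1}]}
RESULT = convert(DATA)
```

Containers whose constructors do not accept a single iterable (a namedtuple, for instance) would need a case of their own.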

Fastest way to convert a dict's keys & values from str to Unicode?

If you're using Python 2.7, you can use a dict comprehension:

unidict = {k.decode('utf8'): v.decode('utf8') for k, v in strdict.items()}

For older versions:

unidict = dict((k.decode('utf8'), v.decode('utf8')) for k, v in strdict.items())

(This assumes your strings are in UTF-8, of course.)
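
To illustrate what happens when the UTF-8 assumption does not hold, here is a minimal Python 3 sketch in which byte strings play the role of Python 2's str. The helper name decode_dict and both sample dicts are hypothetical; the errors argument is the standard one accepted by bytes.decode.

```python
def decode_dict(strdict, encoding="utf-8", errors="strict"):
    """Decode every key and value of a bytes-to-bytes dict (sketch)."""
    return {k.decode(encoding, errors): v.decode(encoding, errors)
            for k, v in strdict.items()}


utf8_dict = {b"name": "caf\u00e9".encode("utf-8")}
latin1_dict = {b"city": "Z\u00fcrich".encode("latin-1")}  # not valid UTF-8

decoded = decode_dict(utf8_dict)

try:
    decode_dict(latin1_dict)
    strict_failed = False
except UnicodeDecodeError:
    # The default errors="strict" refuses bytes that are not valid UTF-8.
    strict_failed = True

# errors="replace" substitutes U+FFFD for the undecodable byte instead of raising.
lenient = decode_dict(latin1_dict, errors="replace")
```

If the real encoding is known, passing it as encoding is preferable to replacing characters.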

How to convert a Unicode dict into a string

No, you will need to convert each key and value to a string yourself and build a new dict from the results. Note that unless the Unicode data happens to be all ASCII, str() can fail with a UnicodeEncodeError. Assuming the data is ASCII, a dict comprehension keeps the conversion concise:

print({str(key): str(value) for key, value in ab.items()})
{'a': 'A', 'c': 'C', 'b': 'B'}

If you are using a version of Python prior to 2.7, which has no dict comprehensions:

dict((str(key), str(value)) for key, value in ab.items())

convert dictionary key and value to unicode

Just use the 'unicode' function:

d = {'firstname' : 'Foo', 'lastname' : 'Bar'}
d = {unicode(k):unicode(v) for k,v in d.items() }

Is there any way to print out a dictionary consisting of unicode strings by converting them to another encoding?

Here are two ways.

name = u'\u041d\u0435\u0433\u0440\u043e\u043d\u0438'
surname = u'\u041b\u043e\u043d\u0434\u043e\u043d\u0441\u043a\u0438\u0439'
info = {}
info[name] = surname

# One way: print each value on its own, so the text is shown rather than its repr
for k in info:
    print(info[k])

# Another way
print(info[name].encode("unicode-escape").decode("unicode-escape"))
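
The underlying issue is that printing a whole dict shows the repr of each string, which in Python 2 escapes non-ASCII characters, whereas printing or formatting the strings themselves keeps the text readable. A small sketch of the difference, written in Python 3 syntax as an assumption (there repr keeps the characters and the built-in ascii() produces the escaped form); the variable names readable and escaped are illustrative:

```python
name = "\u041d\u0435\u0433\u0440\u043e\u043d\u0438"
surname = "\u041b\u043e\u043d\u0434\u043e\u043d\u0441\u043a\u0438\u0439"
info = {name: surname}

# Format each pair yourself so the strings, not their reprs, end up in the output.
readable = ", ".join("{}: {}".format(key, value) for key, value in info.items())

# ascii() gives the escaped form, similar to what Python 2 shows for a printed dict.
escaped = ascii(info)
```

Passing readable to print displays the Cyrillic text, provided the terminal encoding supports it.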

Encode keys of dictionaries inside a list from unicode to ascii

First: do you really need to do this? The strings are in Unicode for a reason: you simply can't represent everything in plain ASCII that you can in Unicode. This probably won't be a problem for your dictionary keys 'uid', 'name' and 'pic_small'; but it probably won't be a problem to leave them as Unicode, either. (The 'simplejson' library does not know anything about your data, so it uses Unicode for every string - better safe than sorry.)

Anyway:

In Python, strings cannot be modified. The .encode method does not change the string; it returns a new string that is the encoded version.

What you want to do is produce a new dictionary, which replaces the keys with the encoded keys. We can do this by passing a generator of (encoded key, original value) pairs to the dict constructor.

That looks like:

dict((k.encode('ascii'), v) for (k, v) in original.items())

Similarly, we can use a list comprehension to apply this to every dictionary, and create the new list. (We can modify the list in-place, but this way is cleaner.)

response = simplejson.load(urllib.urlopen(REST_SERVER, data))
# We create the list of modified dictionaries, and re-assign 'response' to it:
response = [
     dict((k.encode('ascii'), v) for (k, v) in original.items()) # the modified version
     for original in response # of each original dictionary.
]
return response
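
The two points above (encode returns a new object, and the list is rebuilt rather than modified) can be checked with a self-contained sketch. It makes several assumptions: Python 3 syntax, the standard json module standing in for simplejson, and a made-up payload instead of a real REST response. In Python 3 the encoded keys are bytes objects.

```python
import json

# Hypothetical payload using the key names mentioned above.
payload = ('[{"uid": 1, "name": "Ann", "pic_small": "a.jpg"},'
           ' {"uid": 2, "name": "Bob", "pic_small": "b.jpg"}]')
response = json.loads(payload)

key = "uid"
encoded_key = key.encode("ascii")  # a new object; key itself is unchanged

# Build a new list of new dictionaries with encoded keys and the original values.
response = [
    {k.encode("ascii"): v for k, v in original.items()}
    for original in response
]
```

A key containing non-ASCII characters would make encode("ascii") raise UnicodeEncodeError, which is one more reason to consider leaving the keys as they are.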

Convert a String representation of a Dictionary to a dictionary

You can use the built-in ast.literal_eval:

>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}

This is safer than using eval. As its own docs say:


>>> help(ast.literal_eval)
Help on function literal_eval in module ast:

literal_eval(node_or_string)
    Safely evaluate an expression node or a string containing a Python
    expression.  The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None.

For example:

>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
    return _convert(node_or_string)
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
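
When the string comes from an untrusted source, it can help to wrap the call so that anything other than a dict literal is rejected. The helper below is a minimal sketch in Python 3 syntax; the name parse_dict and the choice to return None on failure are assumptions made for illustration.

```python
import ast


def parse_dict(text):
    """Return the dict described by text, or None if it is not a dict literal."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        # Raised for function calls, names, malformed syntax, unhashable keys, etc.
        return None
    # literal_eval also accepts lists, numbers and so on; keep only dicts.
    return value if isinstance(value, dict) else None


good = parse_dict("{'muffin' : 'lolz', 'foo' : 'kitty'}")
bad = parse_dict("shutil.rmtree('mongo')")
```

Here good holds the parsed dictionary, while bad is None because the function call is rejected without being executed.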
