Python: json.loads returns items prefixing with 'u'
The u- prefix just means that you have a Unicode string. When you really use the string, it won't appear in your data. Don't be thrown by the printed output.
For example, try this:
print(mail_accounts[0]["i"])
You won't see a u.
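The difference between what a container shows and what print writes can be checked directly. This is a minimal sketch: the mail_accounts data is made up for illustration, and on Python 3 the repr carries no u prefix at all, although the repr/print distinction is the same.

```python
import json

# Hypothetical data standing in for the question's mail_accounts list.
mail_accounts = json.loads('[{"i": "imap.example.com"}]')

value = mail_accounts[0]["i"]

# repr() is what you see when a list or dict is printed: quotes included,
# and on Python 2 a leading u for Unicode strings.
as_shown_in_container = repr(value)

# str() is what print writes for the string itself: just the characters.
as_printed = str(value)

print(as_shown_in_container)
print(as_printed)
```

The prefix and the quotes belong to the representation only; the string's contents are the same either way.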
How to get string objects instead of Unicode from JSON
A solution with object_hook
It works for both Python 2.7 and 3.x.
import json

def json_load_byteified(file_handle):
    return _byteify(
        json.load(file_handle, object_hook=_byteify),
        ignore_dicts=True
    )

def json_loads_byteified(json_text):
    return _byteify(
        json.loads(json_text, object_hook=_byteify),
        ignore_dicts=True
    )

def _byteify(data, ignore_dicts=False):
    if isinstance(data, str):
        return data

    # If this is a list of values, return list of byteified values
    if isinstance(data, list):
        return [_byteify(item, ignore_dicts=True) for item in data]
    # If this is a dictionary, return dictionary of byteified keys and values
    # but only if we haven't already byteified it
    if isinstance(data, dict) and not ignore_dicts:
        return {
            _byteify(key, ignore_dicts=True): _byteify(value, ignore_dicts=True)
            for key, value in data.items()  # changed to .items() for Python 2.7/3
        }

    # Python 3 compatible duck-typing
    # If this is a Unicode string, return its string representation
    if str(type(data)) == "<type 'unicode'>":
        return data.encode('utf-8')

    # If it's anything else, return it in its original form
    return data
Example usage:
>>> json_loads_byteified('{"Hello": "World"}')
{'Hello': 'World'}
>>> json_loads_byteified('"I am a top-level string"')
'I am a top-level string'
>>> json_loads_byteified('7')
7
>>> json_loads_byteified('["I am inside a list"]')
['I am inside a list']
>>> json_loads_byteified('[[[[[[[["I am inside a big nest of lists"]]]]]]]]')
[[[[[[[['I am inside a big nest of lists']]]]]]]]
>>> json_loads_byteified('{"foo": "bar", "things": [7, {"qux": "baz", "moo": {"cow": ["milk"]}}]}')
{'things': [7, {'qux': 'baz', 'moo': {'cow': ['milk']}}], 'foo': 'bar'}
>>> json_load_byteified(open('somefile.json'))
{'more json': 'from a file'}
How does this work and why would I use it?
Mark Amery's function is shorter and clearer than these ones, so what's the point of them? Why would you want to use them?
Purely for performance. Mark's answer decodes the JSON text fully first with Unicode strings, then recurses through the entire decoded value to convert all strings to byte strings. This has a couple of undesirable effects:
- A copy of the entire decoded structure gets created in memory
- If your JSON object is really deeply nested (500 levels or more) then you'll hit Python's maximum recursion depth
This answer mitigates both of those performance issues by using the object_hook parameter of json.load and json.loads. From the documentation:
object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders.
Since dictionaries nested many levels deep in other dictionaries get passed to object_hook as they're decoded, we can byteify any strings or lists inside them at that point and avoid the need for deep recursion later.
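The order in which the decoder hands dictionaries to object_hook can be observed with a small recording hook. This is only an illustrative sketch; recording_hook and calls are names invented here, not part of the answer's code.

```python
import json

calls = []

def recording_hook(decoded_dict):
    # Record the keys of each dict in the order the decoder hands them over.
    calls.append(sorted(decoded_dict))
    return decoded_dict

result = json.loads(
    '{"outer": {"middle": {"inner": 1}}}',
    object_hook=recording_hook,
)

# The innermost dict is completed, and passed to the hook, first.
print(calls)
```

Because each dict reaches the hook with its nested dicts already processed, the hook never has to descend into them itself.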
Mark's answer isn't suitable for use as an object_hook as it stands, because it recurses into nested dictionaries. We prevent that recursion in this answer with the ignore_dicts parameter to _byteify, which gets passed to it at all times except when object_hook passes it a new dict to byteify. The ignore_dicts flag tells _byteify to ignore dicts, since they have already been byteified.
Finally, our implementations of json_load_byteified and json_loads_byteified call _byteify (with ignore_dicts=True) on the result returned from json.load or json.loads to handle the case where the JSON text being decoded doesn't have a dict at the top level.
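The reason for that final call is that object_hook is only invoked for JSON objects, so a top-level list or string never reaches it. A short check, using a hypothetical counting_hook that does nothing but record its calls:

```python
import json

hook_calls = []

def counting_hook(decoded_dict):
    # Only JSON objects (dicts) are ever passed here.
    hook_calls.append(decoded_dict)
    return decoded_dict

# Top-level list containing only strings: the hook is never called.
top_level_list = json.loads('["a", "b"]', object_hook=counting_hook)
calls_for_list = len(hook_calls)

# Top-level dict: the hook is called once, for the dict itself.
top_level_dict = json.loads('{"k": ["a", "b"]}', object_hook=counting_hook)
calls_for_dict = len(hook_calls) - calls_for_list

print(calls_for_list, calls_for_dict)
```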
How to remove unicode characters from Dictionary data in python
Change your None to 'None':
c = {u'xyz': {u'key1': 'None', u'key2': u'Value2'}}
It is a casting issue: ast likes strings.
You may also want to change every None to an empty string or to the string 'None'.
See this thread:
Python: most idiomatic way to convert None to empty string?
In this code, I've changed the empty string to 'None':
def xstr(s):
    if s is None:
        return 'None'
    return str(s)
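One way to apply xstr to a whole dictionary is sketched below. The stringify_values helper is hypothetical, introduced only for illustration, and it assumes the nested containers are plain dicts.

```python
def xstr(s):
    if s is None:
        return 'None'
    return str(s)

def stringify_values(d):
    # Hypothetical helper: walk nested dicts and pass every leaf value
    # through xstr, leaving the keys untouched.
    return {
        key: stringify_values(value) if isinstance(value, dict) else xstr(value)
        for key, value in d.items()
    }

c = {'xyz': {'key1': None, 'key2': 'Value2'}}
converted = stringify_values(c)
print(converted)
```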
Python requests response print in JSON format U' is added to results
You can use the encode() method to convert Unicode strings to ASCII strings, like this:
import requests
import json

headers = {
    "accept": "application/json",
    "content-type": "application/json"
}

test_urls = ['http://www.mocky.io/v2/5185415ba171ea3a00704eed']

def return_json(url):
    try:
        response = requests.get(url, headers=headers)

        # Consider any status other than 2xx an error
        if not response.status_code // 100 == 2:
            return "Error: Unexpected response {}".format(response)

        json_obj = response.json()
        return json_obj
    except requests.exceptions.RequestException as e:
        # A serious problem happened, like an SSLError or InvalidURL
        return "Error: {}".format(e)

for url in test_urls:
    print("Fetching URL '{0}'".format(url))
    ret = return_json(url)
    # Python 2 only: unicode and dict.iteritems do not exist in Python 3
    ret = {unicode(k).encode('ascii'): unicode(v).encode('ascii') for k, v in ret.iteritems()}
    print(ret)
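A different approach, not part of the answer above, is to serialize the decoded data back to JSON text with json.dumps before printing, so the output is JSON rather than a Python repr. A minimal sketch, with a made-up dict standing in for the result of response.json():

```python
import json

# Hypothetical stand-in for the dict returned by response.json().
ret = {"hello": "world"}

# json.dumps serializes the data back to JSON text, so the output is
# formatted as JSON rather than as a Python repr.
printed = json.dumps(ret)
print(printed)
```

This leaves the data itself unchanged and only affects how it is displayed.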
understanding object_pairs_hook in json.loads()
It allows you to customize what objects your JSON will parse into. This specific argument (object_pairs_hook) works on pairs, that is, the key/value pairs of a mapping object.
For instance if this string appears in your JSON:
{"var1": "val1", "var2": "val2"}
It will call the function pointed to with the following argument:
[('var1', 'val1'), ('var2', 'val2')]
Whatever the function returns is what will be used in the resulting parsed structure where the above string was.
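The argument can be inspected by passing a hook that records what it receives. In this sketch, show_pairs and received are names made up for illustration:

```python
import json

received = []

def show_pairs(pairs):
    # pairs is a list of (key, value) tuples in document order.
    received.append(pairs)
    # Whatever is returned here replaces the JSON object in the result.
    return dict(pairs)

result = json.loads(
    '{"var1": "val1", "var2": "val2"}',
    object_pairs_hook=show_pairs,
)

print(received[0])
```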
A trivial example is object_pairs_hook=collections.OrderedDict, which ensures that your keys are ordered the same way as they occurred in the incoming string.
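A short sketch of that usage follows; the JSON text is invented for the example. Note that on Python 3.7 and later a plain dict also preserves insertion order, so the difference mainly matters on older versions or when an OrderedDict is specifically wanted.

```python
import json
from collections import OrderedDict

text = '{"zebra": 1, "apple": 2, "mango": 3}'

# Each JSON object is built by calling OrderedDict on its list of pairs.
ordered = json.loads(text, object_pairs_hook=OrderedDict)

keys = list(ordered)
print(keys)
```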
The generic idea of a hook is to allow you to register a function that is called (back) as needed for a given task. In this specific case it allows you to customize decoding of (different types of objects in the) incoming JSON string.