Python json.loads shows ValueError: Extra data
As the following example shows, json.loads (and json.load) does not decode multiple JSON objects.
>>> json.loads('{}')
{}
>>> json.loads('{}{}') # == json.loads(json.dumps({}) + json.dumps({}))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 368, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 3 - line 1 column 5 (char 2 - 4)
If you want to dump multiple dictionaries, wrap them in a list and dump the list once (instead of dumping each dictionary separately):
>>> dict1 = {}
>>> dict2 = {}
>>> json.dumps([dict1, dict2])
'[{}, {}]'
>>> json.loads(json.dumps([dict1, dict2]))
[{}, {}]
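The same idea applies when writing to a file with json.dump and reading back with json.load. The following is a minimal sketch, not the original poster's code: the records list is made-up sample data, and io.StringIO stands in for a real file opened with open().

```python
import io
import json

# Hypothetical sample data, used only for illustration.
records = [{"host": "a.com", "port": 8}, {"host": "b.com", "port": 3}]

buf = io.StringIO()      # stands in for a file opened with open(path, "w")
json.dump(records, buf)  # write ONE top-level array, once

buf.seek(0)              # rewind, as if reopening the file for reading
loaded = json.load(buf)  # parses cleanly because there is a single JSON document
print(loaded == records)
```

Calling json.dump once per dictionary on the same file would instead produce back-to-back objects, which is the input that triggers the "Extra data" error.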
Can't parse json file: json.decoder.JSONDecodeError: Extra data.
Your JSON data set is not valid. You can merge the objects into one array of objects, for example:
[
    {
        "host": "a.com",
        "ip": "1.2.2.3",
        "port": 8
    }, {
        "host": "b.com",
        "ip": "2.5.0.4",
        "port": 3
    }, {
        "host": "c.com",
        "ip": "9.17.6.7",
        "port": 4
    }
]
JSON does not allow multiple top-level objects, but an array of objects is valid.
You can see more JSON data set examples in this link.
- If you want to know more about JSON arrays, you can read the w3schools JSON tutorial.
ValueError: Extra Data error when importing json file using python
Figured it out. Looks like breaking it up into lines was the mistake. Here's what the final code looks like.
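import json

counter = 0
for jsonFile in jsonFiles:
    # Read each file in full and parse it as a single JSON document.
    with open(jsonFile) as f:
        data = f.read()
    jsondata = json.loads(data)
    try:
        db[args.collection].insert(jsondata)
        counter += 1
    except Exception as e:  # handler not shown in the original; assumed here
        print("Insert failed for %s: %s" % (jsonFile, e))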
Python json.load JSONDecodeError: Extra data error when trying load multiple json dictionaries
To handle this kind of "multiple"/"invalid" JSON, you can read the entire file, wrap the string in brackets [], and then parse it as a string with json.loads().
- Read the whole file as a string and store it in a variable.
- Remove all newlines (and any whitespace between the objects).
- Add a comma , at each intersection }{, so it becomes ...},{...
- Enclose the result in brackets [].
- Use json.loads() to parse the JSON string.
Full code:
import json

def read_json_file(file):
    # Read the whole file, then turn back-to-back objects into one JSON array.
    with open(file, "r") as r:
        response = r.read()
    response = response.replace('\n', '')
    response = response.replace('}{', '},{')
    response = "[" + response + "]"
    return json.loads(response)
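The string replacement above can go wrong if the text }{ appears inside a string value, or if spaces separate the objects. As an alternative sketch (not part of the original answer), the standard library's json.JSONDecoder.raw_decode can parse one value at a time and report where it stopped. The helper name iter_json_objects and the sample string are hypothetical.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        # Returns the parsed value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical input: three concatenated objects, one containing "}{" in a string.
objects = list(iter_json_objects('{"a": 1}{"b": "}{"}\n{"c": [1, 2]}'))
print(objects)
```

Because the decoder itself finds the end of each value, no characters inside strings are touched. Malformed input still raises json.JSONDecodeError from raw_decode.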
JSON load function gives Extra Data value error
According to jsonlint:
Error: Parse error on line 27:
...le" }] } ]} { "log": [{ "cod
------------------^
Expecting 'EOF', '}', ',', ']', got '{'
In other words, you have two JSON documents in one file. They should be wrapped in a list, as in:
[{
        "log": [{
                "code": "info",
                "message": {
                    "text": "[info] Activation of plug-in abcd rule processor (xule) successful, version Check version using Tools->Xule->Version on the GUI or --xule-version on the command line. - xule "
                },
                "refs": [{
                    "href": "xule"
                }],
                "level": "info"
            },
            {
                "code": "xyz.F1.all.7",
                "level": "error",
                "message": {
                    "text": "[xyz.F1.all.7] The value for ForResale with a value of 63 has a unit of utr:MWh. This concept allows units of utr:MWh.\n\nElement : xyz:ForResale\nPeriod : 2016-01-01 to 2016-12-31\nUnit : utr:MWh\n\nRule Id:xyz.F1.all.7 - TestUtilitiesInc-428-2016Q4F1.abcd 4114",
                    "severity": "error",
                    "cid": "63096080",
                    "filing_url": "C:\\Users\\TEST\\Desktop\\TestUtilitiesInc-428-2016Q4F1.abcd"
                },
                "refs": [{
                    "href": "xule"
                }]
            }
        ]
    },
    {
        "log": [{
                "code": "info",
                "message": {
                    "text": "[info] Activation of plug-in abcd rule processor (xule) successful, version Check version using Tools->Xule->Version on the GUI or --xule-version on the command line. - xule "
                },
                "refs": [{
                    "href": "xule"
                }],
                "level": "info"
            },
            {
                "code": "xyz.F1.all.7",
                "level": "error",
                "message": {
                    "text": "[xyz.F1.all.7] The value for ForResale with a value of 63 has a unit of utr:MWh. This concept allows units of utr:MWh.\n\nElement : xyz:ForResale\nPeriod : 2016-01-01 to 2016-12-31\nUnit : utr:MWh\n\nRule Id:xyz.F1.all.7 - TestUtilitiesInc-428-2016Q4F1.abcd 4114",
                    "severity": "error",
                    "cid": "63096080",
                    "filing_url": "C:\\Users\\TEST\\Desktop\\TestUtilitiesInc-428-2016Q4F1.abcd"
                },
                "refs": [{
                    "href": "xule"
                }]
            }
        ]
    }
]
Try something like:
import json

mylist = []
with open('C:/Users/Desktop/SampleTestFiles/logfile.json', encoding="utf-8") as f:
    new_el = ''  # accumulates lines until they form one complete JSON document
    for l in f:
        new_el += l.rstrip('\n')
        try:
            sub = json.loads(new_el)
            mylist.append(sub)
            new_el = ''
        except ValueError:  # json.JSONDecodeError is a subclass; buffer is still incomplete
            pass
print(mylist)
This prints:
[{u'log': [{u'message': {u'text': u'[info] Activation of plug-in abcd rule processor (xule) successful, version Check version using Tools->Xule->Version on the GUI or --xule-version on the command line. - xule '}, u'code': u'info', u'refs': [{u'href': u'xule'}], u'level': u'info'}, {u'message': {u'text': u'[xyz.F1.all.7] The value for ForResale with a value of 63 has a unit of utr:MWh. This concept allows units of utr:MWh.\n\nElement : xyz:ForResale\nPeriod : 2016-01-01 to 2016-12-31\nUnit : utr:MWh\n\nRule Id:xyz.F1.all.7 - TestUtilitiesInc-428-2016Q4F1.abcd 4114', u'filing_url': u'C:\Users\TEST\Desktop\TestUtilitiesInc-428-2016Q4F1.abcd', u'severity': u'error', u'cid': u'63096080'}, u'code': u'xyz.F1.all.7', u'refs': [{u'href': u'xule'}], u'level': u'error'}]}, {u'log': [{u'message': {u'text': u'[info] Activation of plug-in abcd rule processor (xule) successful, version Check version using Tools->Xule->Version on the GUI or --xule-version on the command line. - xule '}, u'code': u'info', u'refs': [{u'href': u'xule'}], u'level': u'info'}, {u'message': {u'text': u'[xyz.F1.all.7] The value for ForResale with a value of 63 has a unit of utr:MWh. This concept allows units of utr:MWh.\n\nElement : xyz:ForResale\nPeriod : 2016-01-01 to 2016-12-31\nUnit : utr:MWh\n\nRule Id:xyz.F1.all.7 - TestUtilitiesInc-428-2016Q4F1.abcd 4114', u'filing_url': u'C:\Users\TEST\Desktop\TestUtilitiesInc-428-2016Q4F1.abcd', u'severity': u'error', u'cid': u'63096080'}, u'code': u'xyz.F1.all.7', u'refs': [{u'href': u'xule'}], u'level': u'error'}]}]
Reading a JSON file using Python - JSONDecodeError Extra Data
The problem is that json.load does not decode multiple JSON objects. You'll probably want to place the data in an array. Check out this link for more info.
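If the file cannot be changed and happens to hold one complete JSON object per line, each line can be parsed on its own. This is a rough sketch under that assumption, not something stated in the original answer. The helper name load_json_lines and the sample data are hypothetical, and io.StringIO stands in for an opened file.

```python
import io
import json

def load_json_lines(fp):
    """Parse a file-like object that holds one JSON document per line."""
    # Blank lines are skipped; every other line must be complete, valid JSON.
    return [json.loads(line) for line in fp if line.strip()]

# Hypothetical input standing in for open(path, encoding="utf-8").
sample = io.StringIO('{"host": "a.com"}\n\n{"host": "b.com"}\n')
rows = load_json_lines(sample)
print(rows)
```

This does not work when a single object spans several lines. In that case the data needs to be wrapped in an array, or parsed incrementally.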