How to extract multiple JSON objects from one file?
Store the objects in a single JSON array, in this format:
[
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
"Code":[{"event1":"A","result":"1"},…]},
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
"Code":[{"event1":"B","result":"1"},…]},
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
"Code":[{"event1":"B","result":"0"},…]},
...
]
Then load it in your Python code:
import json
with open('file.json') as json_file:
    data = json.load(json_file)
data is now a list of dictionaries, one for each element of the array.
You can access a field of any element easily, e.g.:
data[0]["ID"]
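If you cannot change the file into an array and the objects are simply written one after another, something like the following may work instead. This is only a minimal sketch using json.JSONDecoder.raw_decode from the standard library; the helper name iter_json_objects and the sample string are made up for illustration.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it manually
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical sample: two objects separated by a newline, not wrapped in [ ]
sample = '{"ID": "12345", "Usefulness": "Yes"}\n{"ID": "1A35B", "Usefulness": "No"}'
records = list(iter_json_objects(sample))
print(records[1]["ID"])
```

For a real file you would pass the result of json_file.read() instead of the sample string. This reads the whole file into memory, so it is best suited to small and medium files.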
How to extract multiple independently nested JSON objects and keys from a website using Python
IIUC, you just need to call the .find_all method on your spanList to get all the JSON objects.
Try this:
from bs4 import BeautifulSoup
import requests
import json
reno = 'https://www.foodpantries.org/ci/nv-reno'
renoContent = requests.get(reno)
renoHtml = BeautifulSoup(renoContent.text, 'html.parser')
json_scripts = renoHtml.find("div", class_="span8").find_all('script', type='application/ld+json')
data = [json.loads(script.text, strict=False) for script in json_scripts]
# strict=False lets json.loads accept control characters inside strings,
# which avoids json.decoder.JSONDecodeError: Invalid control character
print(data)
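To see what strict=False changes without hitting the website, here is a small offline sketch. The JSON text is invented for the example (it is not taken from the page); it contains a raw newline inside a string value, which is the kind of control character that triggers the error.

```python
import json

# Hypothetical payload: the "\n" below becomes a real newline character
# inside the JSON string value, which strict parsing rejects.
raw = '{"name": "Food Pantry", "description": "Open Monday\nand Friday"}'

strict_failed = False
try:
    json.loads(raw)  # strict=True is the default
except json.JSONDecodeError as err:
    strict_failed = True
    print("strict parse failed:", err)

# With strict=False the control character is kept as part of the string
data = json.loads(raw, strict=False)
print(repr(data["description"]))
```

Note that strict=False only relaxes the rule about control characters inside strings; other malformed JSON will still raise an error.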
Extract Data from multiple objects in JSON variable in SQL
You need a combination of:
- OPENJSON() with default schema. The result is a table with columns key, value and type. The key column contains the key of each nested JSON object, and the value column contains the value of each nested JSON object.
- OPENJSON() with explicit schema (the WITH clause) and an additional APPLY operator to parse the nested JSON objects from the first OPENJSON() call using the defined output schema:
SELECT j2.*
FROM OPENJSON(@json, '$.result') j1
OUTER APPLY OPENJSON(j1.[value]) WITH (
id nvarchar(50) '$.management_account_id',
lbl nvarchar(50) '$.management_account_label'
) j2
Result:
id lbl
-----------------------------------------
6828 EXC001-00-GP Excellerate Facilities
12183 ENF001-04-GP The Zone
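For comparison only, the same two-step idea (walk every nested object under result, then project two fields from each) can be sketched in Python. This is not T-SQL and is not a replacement for the query above. The original @json value is not shown, so the input shape and the outer keys "k1" and "k2" are assumptions; the ids and labels are copied from the result table.

```python
import json

# Assumed input shape: "result" maps arbitrary keys to nested objects.
# The outer keys "k1" and "k2" are hypothetical placeholders.
doc = json.loads("""
{"result": {
  "k1": {"management_account_id": "6828",
         "management_account_label": "EXC001-00-GP Excellerate Facilities"},
  "k2": {"management_account_id": "12183",
         "management_account_label": "ENF001-04-GP The Zone"}
}}
""")

# Step 1 (like OPENJSON with default schema): iterate the nested values.
# Step 2 (like the WITH clause): pick the columns you want from each value.
rows = [
    {"id": value.get("management_account_id"),
     "lbl": value.get("management_account_label")}
    for value in doc["result"].values()
]

for row in rows:
    print(row["id"], row["lbl"])
```

Using .get() means a nested object that lacks one of the fields yields None for that column instead of raising an error.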
How to read multiple nested json objects in one file extract by pyspark to dataframe in Azure databricks?
- You can read the file into an RDD first. It will be read as a list of strings.
- Convert each JSON string into a native Python datatype using json.loads().
- Then convert the RDD into a dataframe; toDF() can infer the schema directly.
- Using the answer from "Flatten Spark Dataframe column of map/dictionary into multiple columns", you can explode the Data column into multiple columns, given that your Id column is unique. Note that explode returns key and value columns for each entry in the map type.
- You can repeat the previous step to explode the properties column.
Solution:
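import json
from pyspark.sql import functions as F  # F is used below for explode() and first()
# sc is assumed to be the SparkContext that Databricks provides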
rdd = sc.textFile("demo_files/Test20191023.log")
df = rdd.map(lambda x: json.loads(x)).toDF()
df.show()
# +--------------------+----------+--------------------+----------+
# | Data| EventType| Id| Timestamp|
# +--------------------+----------+--------------------+----------+
# |[MessageTemplate ...|3735091736|event-c20b9c7eac0...|2019-03-19|
# |[MessageTemplate ...|3735091737|event-d20b9c7eac0...|2019-03-18|
# |[MessageTemplate ...|3735091738|event-e20b9c7eac0...|2019-03-17|
# +--------------------+----------+--------------------+----------+
data_exploded = df.select('Id', 'EventType', 'Timestamp', F.explode('Data')) \
    .groupBy('Id', 'EventType', 'Timestamp').pivot('key').agg(F.first('value'))
# There is a duplicate Id column and might cause ambiguity problems
data_exploded.show()
# +--------------------+----------+----------+--------+-----+---------------+--------------------+
# | Id| EventType| Timestamp| Id|Level|MessageTemplate| Properties|
# +--------------------+----------+----------+--------+-----+---------------+--------------------+
# |event-c20b9c7eac0...|3735091736|2019-03-19|event-c2| 2| Test1|{CorrId=d69b7489,...|
# |event-d20b9c7eac0...|3735091737|2019-03-18|event-d2| 2| Test1|{CorrId=f69b7489,...|
# |event-e20b9c7eac0...|3735091738|2019-03-17|event-e2| 1| Test1|{CorrId=g69b7489,...|
# +--------------------+----------+----------+--------+-----+---------------+--------------------+
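One possible way around the duplicate Id column is to prefix the nested keys when flattening. The sketch below shows the idea in plain Python on a single log line, without Spark; the helper name flatten_event, the Data_ prefix and the sample line are all hypothetical, since the actual log contents are not shown.

```python
import json

def flatten_event(line, nested_key="Data", prefix="Data_"):
    """Parse one JSON line and lift the nested map into prefixed top-level keys."""
    record = json.loads(line)
    # Remove the nested map and re-add its entries under prefixed names,
    # so a nested "Id" cannot collide with the top-level "Id".
    nested = record.pop(nested_key, None) or {}
    for key, value in nested.items():
        record[prefix + key] = value
    return record

# Hypothetical log line with the same column names as the output above
line = ('{"Id": "event-1", "EventType": 1, "Timestamp": "2019-03-19", '
        '"Data": {"Id": "inner-1", "Level": 2, "MessageTemplate": "Test1"}}')

flat = flatten_event(line)
print(sorted(flat))
```

In Spark the equivalent would presumably be to rename the pivoted columns after the pivot, but that part is left out here because it depends on your actual schema.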