How to Recursively Find Specific Key in Nested JSON

How to recursively find specific key in nested JSON?

def id_generator(dict_var):
    for k, v in dict_var.items():
        if k == "id":
            yield v
        elif isinstance(v, dict):
            yield from id_generator(v)

This creates a generator that yields every value stored under the key "id" at any nesting level. Note that it only descends into dictionaries, not into lists. Example usage (printing all of those values):

for id_val in id_generator(some_json_dict):
    print(id_val)
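JSON documents often nest dictionaries inside lists, which the generator above skips. The following is a minimal sketch of a list-aware variant; the name iter_key_values and the sample data are hypothetical, introduced only for illustration.

```python
def iter_key_values(node, key="id"):
    """Yield every value stored under `key`, descending into dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from iter_key_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_key_values(item, key)


# Hypothetical sample document, invented for illustration.
sample = {
    "id": 1,
    "items": [
        {"id": 2, "meta": {"id": 3}},
        {"name": "no id here"},
    ],
}

ids = list(iter_key_values(sample))
print(ids)  # [1, 2, 3]
```

As in the original, a value found under the key is yielded as-is and not searched any further.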

Extracting nested json keys recursively

Something like this:

#!/usr/bin/python3

# Named `data` rather than `json` to avoid shadowing the standard json module.
data = {'inquiry_date': '2021-01-14',
        'address': {'city': 'Warsaw',
                    'zip_code': '20-200',
                    'country': 'Poland',
                    'house_no': '22',
                    'street': 'Some-Street'},
        'insert_date': '2020-12-20',
        'is_active': False}

def parse_json(node, parents):
    # Print each leaf value preceded by the keys that lead to it.
    for k, v in node.items():
        if isinstance(v, dict):
            parse_json(v, parents + " " + k)
        else:
            print(parents, k, v)

parse_json(data, "")

gives:

 inquiry_date 2021-01-14
 address city Warsaw
 address zip_code 20-200
 address country Poland
 address house_no 22
 address street Some-Street
 insert_date 2020-12-20
 is_active False
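If the keys are needed as data rather than as printed text, one option is to yield them instead. This is a sketch under assumptions: the name iter_leaves, the dotted path format, and the shortened sample record are all hypothetical, and every key is assumed to be a string.

```python
def iter_leaves(node, parents=()):
    """Yield (dotted_path, value) pairs for every non-dict value in a nested dict."""
    for key, value in node.items():
        path = parents + (key,)
        if isinstance(value, dict):
            yield from iter_leaves(value, path)
        else:
            yield ".".join(path), value


# Hypothetical sample, shaped like the dictionary above.
record = {
    "inquiry_date": "2021-01-14",
    "address": {"city": "Warsaw", "country": "Poland"},
    "is_active": False,
}

leaves = dict(iter_leaves(record))
print(leaves)
# {'inquiry_date': '2021-01-14', 'address.city': 'Warsaw',
#  'address.country': 'Poland', 'is_active': False}
```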

Printout specific keys and values in nested .json data recursively in python

Iterative approach: storing items to be processed in a data structure

A recursive function is one possibility. Another is to keep a data structure holding every dictionary you have encountered but not yet processed. As long as at least one unprocessed dictionary remains, take one out and process it; if it has children, add them to the structure.
Here I use a simple Python list:

names, sizes, mountpoints = [], [], []
# Copy the list so that popping does not empty data['blockdevices'] itself.
to_be_processed = list(data['blockdevices'])
while to_be_processed:
    d = to_be_processed.pop()
    names.append(d['name'])
    sizes.append(d['size'])
    mountpoints.append(d['mountpoint'])
    if 'children' in d:
        to_be_processed.extend(d['children'])

Preserving order: using a FIFO instead of a LIFO data structure

Note that the iterative code above uses a Python list with its .extend() and .pop() methods. Effectively, this uses the list as a LIFO (Last-In-First-Out) data structure. If you want to preserve the order of your data, use a FIFO (First-In-First-Out) data structure instead. You could replace .pop() with .pop(0) to remove the first element instead of the last, but list.pop(0) is not an efficient operation in Python: every remaining element has to be shifted one position, so each call takes time proportional to the length of the list. Instead, we can use a collections.deque object, with its .extend() and .popleft() methods:

  • documentation on collections.deque
import collections

names, sizes, mountpoints = [], [], []
to_be_processed = collections.deque(data['blockdevices'])
while to_be_processed:
    d = to_be_processed.popleft()
    names.append(d['name'])
    sizes.append(d['size'])
    mountpoints.append(d['mountpoint'])
    if 'children' in d:
        to_be_processed.extend(d['children'])
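To make the difference in visiting order concrete, here is a small sketch that runs both strategies on the same input. The device names and the helper functions names_lifo and names_fifo are hypothetical; only the 'blockdevices', 'name' and 'children' keys come from the code above.

```python
import collections

# Hypothetical data using the same keys as above; the values are made up.
data = {
    "blockdevices": [
        {"name": "sda", "children": [{"name": "sda1"}, {"name": "sda2"}]},
        {"name": "sdb"},
    ]
}


def names_lifo(devices):
    """Visit devices using a list as a stack (last in, first out)."""
    names, stack = [], list(devices)  # copy so the input is not mutated
    while stack:
        d = stack.pop()
        names.append(d["name"])
        stack.extend(d.get("children", []))
    return names


def names_fifo(devices):
    """Visit devices using a deque as a queue (first in, first out)."""
    names, queue = [], collections.deque(devices)
    while queue:
        d = queue.popleft()
        names.append(d["name"])
        queue.extend(d.get("children", []))
    return names


lifo_order = names_lifo(data["blockdevices"])
fifo_order = names_fifo(data["blockdevices"])
print(lifo_order)  # ['sdb', 'sda', 'sda2', 'sda1']
print(fifo_order)  # ['sda', 'sdb', 'sda1', 'sda2']
```

The FIFO version visits the devices level by level: all top-level entries in their original order first, then their children.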

Recursive approach

A recursive approach is possible too. Call it on data['blockdevices'], and have it make a recursive call on the children if there are any.

def process_data(d_list, names, sizes, mountpoints):
    for d in d_list:
        names.append(d['name'])
        sizes.append(d['size'])
        mountpoints.append(d['mountpoint'])
        if 'children' in d:
            process_data(d['children'], names, sizes, mountpoints)
    return names, sizes, mountpoints

names, sizes, mountpoints = process_data(data['blockdevices'], [], [], [])

Recursively find and return key and value from nested dictionaries python

It's worth pointing out you have a bug here:

for _ in item_generator(d,data_name):
    return (_)

This is an important case to be aware of: a return statement ends the function as soon as it executes. The for loop therefore never gets past its first iteration, and only the first yielded result is returned, i.e. only the first occurrence of the lookup key in json_data.

You can fix it using generator (or iterable) unpacking into a list, as below:

def get_data_value(data, data_name):
    d = data['test']
    return [*item_generator(d, data_name)]

def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        if lookup_key in json_input:
            yield {lookup_key: json_input[lookup_key]}
        else:
            for v in json_input.values():
                yield from item_generator(v, lookup_key)

    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)

json_data = {"test": [{"Tier1": [{"Tier1-Main-Title-1": [{"title": "main", "example": 400}]}]}, {"Tier2": []},
                      {"Tier3": [{"Example1-Sub1": 44, "title": "TEST2"}]}]}

print(get_data_value(json_data, 'title'))

Result:

[{'title': 'main'}, {'title': 'TEST2'}]

Or, if you'd prefer not to call get_data_value at all:

print(*item_generator(json_data['test'], 'title'))

Indexing with the key 'test' is optional here: because the function is recursive, passing json_data itself gives the same results.

The results are separated by a single space by default, but you can change the separator by passing the sep parameter to the print function.

{'title': 'main'} {'title': 'TEST2'}

How to return specific key value pair from nested json without knowing location?

The underlying question is: how can we make multiple recursive calls in a loop, return the recursive result if any of them returns something useful, and fail otherwise?

If we blindly return inside the loop, then only one recursive call can be made. Whatever it returns, gets returned at this level. If it didn't find the useful result, we don't get a useful result.

If we blindly don't return inside the loop, then the values that were returned don't matter. Nothing in the current call makes use of them, so we will finish looping, make all the recursive calls, reach the end of the function... and thus implicitly return None.

The way around this, of course, is to check whether the recursive call returned something useful. If it did, we can return that; otherwise, we keep going. If we reach the end, then we signal that we couldn't find anything useful - that way, if we are being recursively called, the caller can do the right thing.

Assuming that None cannot be a "useful" value, we can naturally use that as the signal. We don't even have to return it explicitly at the end.

After fixing some other typos (we should not overwrite the global built-in dict name, and anyway we don't need to name the dict that we pass in at the start, and the parameter should be m_dict so that it's properly defined when we make the recursive call), we get:

def recursive_json(data, attr, m_dict):
    for k,v in data.items():
        if k == attr:
            for k2,v2 in v.items():
                m_dict = {attr, v2}
                print('IF: ', m_dict)
                return m_dict
        elif isinstance(v,dict):
            result = recursive_json(v, attr, m_dict)
            if result:
                return result

# call it:
recursive_json(json_data, "Date", {})

We can see that the debug trace is printed, and the value is also returned.

Let's improve this a bit:

First off, the inner for k2,v2 in v.items(): loop doesn't make any sense. Again, we can only return once per call, so this would skip any values in the dict after the first. We would be better served just returning v directly. Also, the m_dict parameter doesn't actually help implement the logic; we don't modify it between calls. It doesn't make sense to use a set for our return value, since it's fundamentally unordered; we care about the order here. Finally, we don't need the debug trace any more. That gives us:

def recursive_json(data, attr):
    for k, v in data.items():
        if k == attr:
            return attr, v
        elif isinstance(v,dict):
            result = recursive_json(v, attr)
            if result:
                return result

To get fancier, we can separate the base case from the recursive case, and use more elegant tools for each. To check if any of the keys matches, we can simply check with the in operator. To recurse and return the first fruitful result, the built-in next is useful. We get:

def recursive_json(data, attr):
    if not isinstance(data, dict):
        # reached a leaf, can't search in here.
        return None
    if attr in data:
        return attr, data[attr]
    candidates = (recursive_json(v, attr) for v in data.values())
    try:
        # the first non-None candidate, if any.
        return next(c for c in candidates if c is not None)
    except StopIteration:
        return None # all candidates were None.
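The built-in next also accepts a default value as its second argument, which can replace the try/except. Below is a sketch of that variant together with a usage example; the name find_first and the sample input are hypothetical.

```python
def find_first(data, attr):
    """Return (attr, value) for the first match found, or None."""
    if not isinstance(data, dict):
        return None  # leaf value: nothing to search
    if attr in data:
        return attr, data[attr]
    candidates = (find_first(v, attr) for v in data.values())
    # next() with a default avoids the explicit try/except StopIteration.
    return next((c for c in candidates if c is not None), None)


# Hypothetical input, invented for illustration.
sample = {"meta": {"source": "api"}, "payload": {"event": {"Date": "2021-01-14"}}}

found = find_first(sample, "Date")
missing = find_first(sample, "Time")
print(found)    # ('Date', '2021-01-14')
print(missing)  # None
```

Because candidates is a generator expression, the recursive calls are made lazily, so the search stops as soon as one of them returns a result.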

Need help filtering nested json object recursively

By adapting the iterate function, we can easily walk the tree and, along the way, collect every node that has the status we are searching for.

This solution does the same as the others; it is just easier to follow.

const data={item1:{"item1.1":{"item1.1.1":{"item1.1.1.1":{attr1:[],attr2:"",attr3:[],status:"ERROR"}}},"item1.2":{"item1.2.1":{"item1.2.1.1":{attr1:[],attr2:"",attr3:[],status:"WARNING"}}}},item2:{"item2.1":{"item2.1.1":{"item2.1.1.1":{attr1:[],attr2:"",attr3:[],status:"WARNING"}},"item2.1.2":{"item2.1.2.1":{attr1:[],attr2:"",attr3:[],status:"OK"},"item2.1.2.2":{attr1:[],attr2:"",attr3:[],status:"WARNING"}}}},item3:{"item3.1":{"item3.1.1":{"item3.1.1.1":{attr1:[],attr2:"",attr3:[],status:"OK"}},"item3.1.2":{attr1:[],attr2:"",attr3:[],status:"ERROR"}}}};

function getStatuses(data, status) {

  var result = {}

  const iterate = (obj) => {
    if (!obj) {
      return;
    }
    Object.keys(obj).forEach(key => {
      var value = obj[key]
      if (typeof value === "object" && value !== null) {
        iterate(value)
        if (value.status === status) {
          result[key] = value;
        }
      }
    })
  }

  iterate(data)
  return result;
}
console.log(getStatuses(data, "ERROR"))

Python recursive function to match key values and return path in nested dictionary

You can simplify your recursion by just checking whether the current key matches the search_value or if not, if the associated value is a dict, in which case you recurse:

def getpath(nested_dict, search_value):
    # loop through dict keys
    for key in nested_dict.keys():
        # have we found the search value?
        if key == search_value:
            return [key]
        # if not, search deeper if this value is a dict
        if type(nested_dict[key]) is dict:
            path = getpath(nested_dict[key], search_value)
            if path is not None:
                return [key] + path
    # no match found in this part of the dict
    return None

getpath(nested_dict,'s') # ['k','p','s']
getpath(nested_dict, 'i') # ['a', 'g','h','i']
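Since the nested_dict used in the calls above is not shown, here is a self-contained sketch on a hypothetical dictionary named sample. It also shows one way to follow the returned path back to the stored value, using functools.reduce with operator.getitem.

```python
from functools import reduce
import operator


def getpath(nested_dict, search_value):
    """Return the list of keys leading to `search_value`, or None if absent."""
    for key, value in nested_dict.items():
        if key == search_value:
            return [key]
        if isinstance(value, dict):
            path = getpath(value, search_value)
            if path is not None:
                return [key] + path
    return None


# Hypothetical dictionary, invented for illustration.
sample = {"a": {"b": 1, "g": {"h": {"i": 2}}}, "k": {"p": {"s": 3}}}

path_to_i = getpath(sample, "i")
path_to_s = getpath(sample, "s")
path_to_z = getpath(sample, "z")
print(path_to_i)  # ['a', 'g', 'h', 'i']
print(path_to_s)  # ['k', 'p', 's']
print(path_to_z)  # None

# Follow the path key by key to reach the value it points to.
value_at_i = reduce(operator.getitem, path_to_i, sample)
print(value_at_i)  # 2
```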

python find and replace all values under a certain key in a nested dictionary

You can use recursion. You'll need to supply logic for diving into each container type you care about supporting. As an example, this is how you could handle values in nested dicts or lists.

def replace_nested_key(data, key, value):
    if isinstance(data, dict):
        return {
            k: value if k == key else replace_nested_key(v, key, value)
            for k, v in data.items()
        }
    elif isinstance(data, list):
        return [replace_nested_key(v, key, value) for v in data]
    else:
        return data
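A short usage sketch follows; the input dictionary and the masking use case are hypothetical. It illustrates that the function builds a new structure and leaves its input untouched.

```python
def replace_nested_key(data, key, value):
    """Return a copy of `data` with every value under `key` replaced by `value`."""
    if isinstance(data, dict):
        return {
            k: value if k == key else replace_nested_key(v, key, value)
            for k, v in data.items()
        }
    elif isinstance(data, list):
        return [replace_nested_key(v, key, value) for v in data]
    else:
        return data


# Hypothetical input, invented for illustration.
original = {
    "password": "hunter2",
    "users": [{"name": "ann", "password": "abc"}],
    "count": 2,
}

masked = replace_nested_key(original, "password", "***")
print(masked)
# {'password': '***', 'users': [{'name': 'ann', 'password': '***'}], 'count': 2}
print(original["password"])  # hunter2
```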

Get the Parent key and the nested value in nested json

After reading through the comments and the answers, I got this solution working for my use case.

def parse_schema_permission_info(schema):
    x_fields = {}

    def extract_permission_field(node, parent_field):
        # `node` is the dict being scanned; `parent_field` is its dotted path.
        for field, value in node.items():
            if field == 'x-permission':
                x_fields.update({parent_field: value})
            if isinstance(value, dict):
                key = parent_field + '.' + field
                if value.get('x-permission'):
                    x_fields.update(
                        {key: value.get('x-permission')}
                    )
                extract_permission_field(value, key)

    for field in schema:
        extract_permission_field(schema.get(field), field)

    return x_fields
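For comparison, here is a simplified sketch of the same idea: walk the schema, track the dotted path of parent keys, and record the value wherever the marker key appears. The name collect_permissions, its marker parameter, and the sample schema are hypothetical and are not part of the solution above.

```python
def collect_permissions(schema, marker="x-permission"):
    """Map dotted field paths to the value stored under `marker`."""
    found = {}

    def walk(node, path):
        for name, value in node.items():
            if name == marker:
                found[path] = value
            elif isinstance(value, dict):
                # Extend the dotted path with the current key before descending.
                walk(value, f"{path}.{name}" if path else name)

    walk(schema, "")
    return found


# Hypothetical schema fragment, invented for illustration.
schema = {
    "user": {
        "x-permission": "read",
        "properties": {"email": {"x-permission": "admin", "type": "string"}},
    },
    "title": {"type": "string"},
}

permissions = collect_permissions(schema)
print(permissions)
# {'user': 'read', 'user.properties.email': 'admin'}
```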

