Finding a Key Recursively in a Dictionary

Finding a key recursively in a dictionary

when you recurse, you need to return the result of _finditem

def _finditem(obj, key):
if key in obj: return obj[key]
for k, v in obj.items():
if isinstance(v,dict):
return _finditem(v, key) #added return statement

To fix the actual algorithm, you need to realize that _finditem returns None if it didn't find anything, so you need to check that explicitly to prevent an early return:

def _finditem(obj, key):
if key in obj: return obj[key]
for k, v in obj.items():
if isinstance(v,dict):
item = _finditem(v, key)
if item is not None:
return item

Of course, that will fail if you have None values in any of your dictionaries. In that case, you could set up a sentinel object() for this function and return that in the case that you don't find anything -- Then you can check against the sentinel to know if you found something or not.

Find key recursively in dictionary and then list all parent keys

Try adding the path as an optional parameter:

def list_parents(obj, key, path=[]):
if key in obj:
path.append(key)
return path

for k, v in obj.items():
if isinstance(v, dict):
found = list_parents(v, key, path=path + [k])
if found:
return found

return None

keys = ["A", "E", "G"]
for key in keys:
res = list_parents({"B": {"A": 2}, "C": 1, "D": {"E": 3, "F": {"G": 3}}, }, key)
print(res)

Output

['B', 'A']
['D', 'E']
['D', 'F', 'G']

Or as an alternative:

def list_parents(obj, key):
if key in obj:
return [key]

for k, v in obj.items():
if isinstance(v, dict):
found = list_parents(v, key)
if found:
return [k, *found]

return None

To improve the complexity of the above approach you could use a deque:

from collections import deque

def list_parents(obj, key):
if key in obj:
return deque([key])

for k, v in obj.items():
if isinstance(v, dict):
found = list_parents(v, key)
if found:
found.appendleft(k)
return found
return None

The reason to use a deque is that inserting to the front of a list (the line [k, *found]) is O(n) vs O(1) in a deque.

Recursively find and return key and value from nested dictionaries python

It's worth pointing out you have a bug here:

for _ in item_generator(d,data_name):
return (_)

This is an important case to be aware of, because the return statement here only returns once. Therefore, this for loop only runs for the first iteration, and only returns the first yield result - i.e. only the first occurrence of the lookup key in the json_data.

You can fix it using generator (or iterable) unpacking into a list, as below:

def get_data_value(data, data_name):
d = data['test']
return [*item_generator(d, data_name)]

def item_generator(json_input, lookup_key):
if isinstance(json_input, dict):
if lookup_key in json_input:
yield {lookup_key: json_input[lookup_key]}
else:
for v in json_input.values():
yield from item_generator(v, lookup_key)

elif isinstance(json_input, list):
for item in json_input:
yield from item_generator(item, lookup_key)

json_data = {"test": [{"Tier1": [{"Tier1-Main-Title-1": [{"title": "main", "example": 400}]}]}, {"Tier2": []},
{"Tier3": [{"Example1-Sub1": 44, "title": "TEST2"}]}]}

print(get_data_value(json_data, 'title'))

Result:

[{'title': 'main'}, {'title': 'TEST2'}]

Or, if you'd prefer not to call get_data_value at all:

print(*item_generator(json_data['test'], 'title'))

Where passing the key 'test' is optional, thanks to the function being recursive by nature.

The results are separated by a single space by default, but you can control the separator by passing the sep parameter to the print statement.

{'title': 'main'} {'title': 'TEST2'}

How to recursively explore python nested dictionaries?

Here's a recursive approach. The idea is to check if the current's dictionary value is an instance of a dict, if it is call the same function with the values as input, otherwise return the dictionary:

def get_inner_dict(d):
for _, v in d.items():
if isinstance(v, dict):
return get_inner_dict(v)
else:
return d

get_inner_dict(d)
# {'c': [1, 2, 3]}

Python recursive function to match key values and return path in nested dictionary

You can simplify your recursion by just checking whether the current key matches the search_value or if not, if the associated value is a dict, in which case you recurse:

def getpath(nested_dict, search_value):
# loop through dict keys
for key in nested_dict.keys():
# have we found the search value?
if key == search_value:
return [key]
# if not, search deeper if this value is a dict
if type(nested_dict[key]) is dict:
path = getpath(nested_dict[key], search_value)
if path is not None:
return [key] + path
# no match found in this part of the dict
return None

getpath(nested_dict,'s') # ['k','p','s']
getpath(nested_dict, 'i') # ['a', 'g','h','i']

How to recursively access every value of every key in dict, including nested ones?

How about this:

def recursive_dict_access(some_dict: dict):
print (f'recursive_dict_access called on {some_dict["name"]}')
for key, value in some_dict.items():
# if value is an empty list
if isinstance(value, list) and not value:
print(f'key: {key} from dict {some_dict["name"}]'}
# if value is a dictionary
elif isinstance(value, dict):
print(f'recursively calling for {value["name"]}'}
recursive_dict_access(value)

This assumes all dicts have a name key and is very verbose as an example for debugging to show how the code works. An example setup and usage:

a = {'name': 'a', 'key1', [], 'key2', []}
b = {'name': 'b', 'key1', a, 'key2': 'some other value'}
c = {'name': 'c', 'key1', b, 'key2': [], 'key3', []}

recursive_dict_access(c)

Output:

recursive_dict_access called on c
recusively calling for b
recursive_dict_access called on b
recursively calling for a
recursive_dict_access called on a
key: key1 from dict a
key: key2 from dict a
key: key2 from dict c
key: key3 from dict c

as expected, all keys in c (and all child dicts) with value [] have been printed. One thing to watch out for is infinite recursion due to circular references in the dictionary. For example:

c['key2'] = c
recursive_dict_access(c)

Output:

... lots of repeated keys ...
RecursionError: maximum recursion depth exceeded while calling a Python object

If this is a possibility in your data structure, you'll want to somehow mark those dicts that have already been process and ignore them if they get called again. I'll leave that as an exercise for the reader ;)

Find all occurrences of a key in nested dictionaries and lists

I found this Q/A very interesting, since it provides several different solutions for the same problem. I took all these functions and tested them with a complex dictionary object. I had to take two functions out of the test, because they had to many fail results and they did not support returning lists or dicts as values, which i find essential, since a function should be prepared for almost any data to come.

So i pumped the other functions in 100.000 iterations through the timeit module and output came to following result:

0.11 usec/pass on gen_dict_extract(k,o)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
6.03 usec/pass on find_all_items(k,o)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.15 usec/pass on findkeys(k,o)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1.79 usec/pass on get_recursively(k,o)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.14 usec/pass on find(k,o)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.36 usec/pass on dict_extract(k,o)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -

All functions had the same needle to search for ('logging') and the same dictionary object, which is constructed like this:

o = { 'temparature': '50', 
'logging': {
'handlers': {
'console': {
'formatter': 'simple',
'class': 'logging.StreamHandler',
'stream': 'ext://sys.stdout',
'level': 'DEBUG'
}
},
'loggers': {
'simpleExample': {
'handlers': ['console'],
'propagate': 'no',
'level': 'INFO'
},
'root': {
'handlers': ['console'],
'level': 'DEBUG'
}
},
'version': '1',
'formatters': {
'simple': {
'datefmt': "'%Y-%m-%d %H:%M:%S'",
'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
}
}
},
'treatment': {'second': 5, 'last': 4, 'first': 4},
'treatment_plan': [[4, 5, 4], [4, 5, 4], [5, 5, 5]]
}

All functions delivered the same result, but the time differences are dramatic! The function gen_dict_extract(k,o) is my function adapted from the functions here, actually it is pretty much like the find function from Alfe, with the main difference, that i am checking if the given object has iteritems function, in case strings are passed during recursion:

# python 2
def gen_dict_extract(key, var):
if hasattr(var,'iteritems'): # hasattr(var,'items') for python 3
for k, v in var.iteritems(): # var.items() for python 3
if k == key:
yield v
if isinstance(v, dict):
for result in gen_dict_extract(key, v):
yield result
elif isinstance(v, list):
for d in v:
for result in gen_dict_extract(key, d):
yield result

So this variant is the fastest and safest of the functions here. And find_all_items is incredibly slow and far off the second slowest get_recursivley while the rest, except dict_extract, is close to each other. The functions fun and keyHole only work if you are looking for strings.

Interesting learning aspect here :)

Recursively iterate through a nested dict and return value of the first matching key

You have to check whether the recursive call actually found something so you can continue the loop. E.g. try the following:

def tree_traverse(tree, key):
if key in tree:
return tree[key]
for v in filter(dict.__instancecheck__, tree.values()):
if (found := tree_traverse(v, key)) is not None:
return found



Related Topics



Leave a reply



Submit