Find all occurrences of a key in nested dictionaries and lists
I found this Q/A very interesting, since it provides several different solutions for the same problem. I took all these functions and tested them with a complex dictionary object. I had to take two functions out of the test because they failed too often and did not support returning lists or dicts as values, which I find essential, since a function should be prepared for almost any data.
So I ran the remaining functions through the timeit module with 100,000 iterations each and got the following results:
0.11 usec/pass on gen_dict_extract(k,o)
6.03 usec/pass on find_all_items(k,o)
0.15 usec/pass on findkeys(k,o)
1.79 usec/pass on get_recursively(k,o)
0.14 usec/pass on find(k,o)
0.36 usec/pass on dict_extract(k,o)
All functions had the same needle to search for ('logging') and the same dictionary object, which is constructed like this:
o = {
    'temparature': '50',
    'logging': {
        'handlers': {
            'console': {
                'formatter': 'simple',
                'class': 'logging.StreamHandler',
                'stream': 'ext://sys.stdout',
                'level': 'DEBUG'
            }
        },
        'loggers': {
            'simpleExample': {
                'handlers': ['console'],
                'propagate': 'no',
                'level': 'INFO'
            },
            'root': {
                'handlers': ['console'],
                'level': 'DEBUG'
            }
        },
        'version': '1',
        'formatters': {
            'simple': {
                'datefmt': "'%Y-%m-%d %H:%M:%S'",
                'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
            }
        }
    },
    'treatment': {'second': 5, 'last': 4, 'first': 4},
    'treatment_plan': [[4, 5, 4], [4, 5, 4], [5, 5, 5]]
}
All functions delivered the same result, but the time differences are dramatic! The function gen_dict_extract(k,o) is my function, adapted from the functions here. It is pretty much like the find function from Alfe, with the main difference that I check whether the given object has an iteritems function, in case strings are passed during recursion:
# python 2
def gen_dict_extract(key, var):
    if hasattr(var,'iteritems'): # hasattr(var,'items') for python 3
        for k, v in var.iteritems(): # var.items() for python 3
            if k == key:
                yield v
            if isinstance(v, dict):
                for result in gen_dict_extract(key, v):
                    yield result
            elif isinstance(v, list):
                for d in v:
                    for result in gen_dict_extract(key, d):
                        yield result
So this variant is the fastest and safest of the functions here. find_all_items is incredibly slow, far behind the second slowest, get_recursively, while the rest, except dict_extract, are close to each other. The functions fun and keyHole only work if you are looking for strings.
Interesting learning aspect here :)
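For Python 3, the generator above can be ported roughly as follows. This is a minimal sketch, not the author's benchmark script: the name gen_dict_extract_py3, the sample dictionary and the pass count are assumptions made for illustration. Because the function is a generator, the timed call is wrapped in list(...) so that the traversal actually runs while it is being measured.

```python
import timeit


def gen_dict_extract_py3(key, var):
    """Yield every value stored under `key` in nested dicts and lists."""
    if hasattr(var, 'items'):  # skips strings, numbers and other leaves
        for k, v in var.items():
            if k == key:
                yield v
            if isinstance(v, dict):
                yield from gen_dict_extract_py3(key, v)
            elif isinstance(v, list):
                for d in v:
                    yield from gen_dict_extract_py3(key, d)


# Hypothetical sample data, much smaller than the dictionary used in the post.
sample = {
    'level': 'INFO',
    'handlers': [{'level': 'DEBUG'}, 'console'],
    'root': {'level': {'nested': True}},
}

found = list(gen_dict_extract_py3('level', sample))
print(found)

# Time the full traversal; list() forces the generator to run to completion.
passes = 1000
seconds = timeit.timeit(
    lambda: list(gen_dict_extract_py3('level', sample)), number=passes
)
usec_per_pass = seconds / passes * 1e6
print(f"{usec_per_pass:.2f} usec/pass on gen_dict_extract_py3")
```

The absolute numbers will differ from machine to machine, so only the relative ranking of the functions is meaningful.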
How to find all occurrences of a key in nested dict, but also keep track of the outer dict key value?
This should work regardless of how deep your nesting is (up to the stack limit, at any rate). The request for keeping track of the dict's key is a little awkward--I used a tuple to return the pair. Note that if the found value is in the outermost dictionary, it won't be in the tuple format.
def recursive_lookup(key, d):
    if key in d:
        return d[key]
    for k, v in d.items():
        if isinstance(v, dict):
            result = recursive_lookup(key, v)
            # compare against None so falsy hits such as 0 or '' are kept
            if result is not None:
                return k, result


print(recursive_lookup('label', data))
Output:
('item2', 'label2')
Here's a version that's a little messier (I'm not crazy about an inner function, but at least the accumulator list isn't a parameter and isn't global) but will return a list of all found items nested up to the stack limit, excepting the outermost keys:
def recursive_lookup(key, d):
    def _lookup(key, d):
        if key in d:
            return d[key]
        for k, v in d.items():
            if isinstance(v, dict):
                result = _lookup(key, v)
                if result:
                    accumulator.append((k, result))

    accumulator = []
    _lookup(key, d)
    return accumulator
Output:
[('item3', 'label3'), ('item2', 'label2')]
This can be easily modified if you want to output a dict: replace accumulator = [] with accumulator = {} and accumulator.append((k, result)) with accumulator[k] = result, but this might be awkward to work with, and you can't store duplicate key entries.
As for your final question, the reason you're getting None is that the inner loop returns after checking the first item, whether it found something or not. Since label is in the second location of the items() array, it never gets looked at.
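The early-return problem described above can be reproduced with a small sketch. The asker's code is not shown here, so both functions and the sample data below are hypothetical stand-ins that only illustrate the pattern.

```python
def lookup_returns_too_early(key, d):
    """Hypothetical buggy version: returns after the first nested dict."""
    for k, v in d.items():
        if k == key:
            return v
        if isinstance(v, dict):
            # Returns even when the recursive call found nothing (None),
            # so later items of d are never inspected.
            return lookup_returns_too_early(key, v)


def lookup_checks_result(key, d):
    """Hypothetical fixed version: only returns when something was found."""
    for k, v in d.items():
        if k == key:
            return v
        if isinstance(v, dict):
            result = lookup_checks_result(key, v)
            if result is not None:
                return result
    return None


# 'label' lives in the second item, behind a nested dict that lacks it.
sample = {'item1': {'other': 1}, 'item2': {'label': 'label2'}}

buggy = lookup_returns_too_early('label', sample)
fixed = lookup_checks_result('label', sample)
print(buggy, fixed)
```

The only difference between the two is that the second one inspects the recursive result before returning it.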
Find all occurrences of a key in nested dictionaries and lists - with path
After some experimenting I found the below code to solve my problem.
def gen_dict_location_extract(key, value, path=None):
    if path is None:
        path = []
    if hasattr(value, "items"):
        for k, v in value.items():
            if k == key:  # recursive exit point
                if len(path) > 0:
                    yield (v, path)
                else:  # handling root keys
                    yield (v, None)
            if isinstance(v, dict):
                path_copy = path.copy()
                # it is important to do a copy of the path for recursive calls
                # so every iteration has its own path object
                path_copy.append(k)
                yield from gen_dict_location_extract(key, v, path_copy)
            elif isinstance(v, list):
                # a list has no items(), so visit its elements one by one
                for element in v:
                    yield from gen_dict_location_extract(key, element, path)


def call_gen_dict_location_extract(key, enumerable):
    results = []
    for result in gen_dict_location_extract(key, enumerable):
        results.append(result)
    return results
To test the code you would execute it like this:
call_gen_dict_location_extract("level", o)
Changing all occurrences of a key in nested dictionaries and lists
Assuming that:
- every bakery_items value has an iscake value
- you want to inspect every list element in the dictionary to see if there are nested dictionaries in there
Then your code could be (assuming your dictionary is source):
source = { }  # your dictionary here, with typos like missing commas fixed


def filter_for_cake(d):
    return {
        key: {
            k: v for k, v in value.items() if v['iscake'] == 'yes'
        } if key == 'bakery_items' else
        [
            x if not isinstance(x, dict) else filter_for_cake(x)
            for x in value
        ] if isinstance(value, list) else value
        for key, value in d.items()
    }


print(filter_for_cake(source))
Some explanation:
- the function iterates over an entire dictionary
- for each value that has 'bakery_items' as a key, it assumes it is a dictionary and filters it down to only those elements that have their 'iscake' set to 'yes'
- for the other values, if the value is a list, it generates a copy, but passes each dictionary to filter_for_cake again (recursion)
- for the remaining values, it just leaves them in as is
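The same three cases can be written as an explicit loop, which some readers may find easier to follow than the nested conditional expression. This is only a sketch: the name filter_for_cake_explicit and the sample dictionary are made up for illustration, since the original question's data is not shown.

```python
def filter_for_cake_explicit(d):
    """Loop-based equivalent of the three cases listed above."""
    result = {}
    for key, value in d.items():
        if key == 'bakery_items':
            # Case 1: keep only the entries whose 'iscake' is 'yes'.
            result[key] = {
                k: v for k, v in value.items() if v['iscake'] == 'yes'
            }
        elif isinstance(value, list):
            # Case 2: copy the list, recursing into any dictionaries in it.
            result[key] = [
                filter_for_cake_explicit(x) if isinstance(x, dict) else x
                for x in value
            ]
        else:
            # Case 3: leave everything else as is.
            result[key] = value
    return result


# Hypothetical input shaped like the assumptions listed above.
sample_source = {
    'shop': 'corner bakery',
    'departments': [
        {'bakery_items': {
            'sponge': {'iscake': 'yes'},
            'baguette': {'iscake': 'no'},
        }},
        'closed on Mondays',
    ],
}

filtered = filter_for_cake_explicit(sample_source)
print(filtered)
```

Like the comprehension version, this builds a new structure and leaves the input dictionary unchanged.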
How to find a value of a key in a nested dictionary with lists?
Since you do not know how deep inside the value is, it is probably advisable to use a recursive function that iterates through all the layers until the key is found. I used a depth-first search (DFS) below.
def search(ld, find):
    if isinstance(ld, list):
        for i in ld:
            if isinstance(i, (list, dict)):
                result = search(i, find)
                if result is not None:
                    return result
    elif isinstance(ld, dict):
        try:
            return ld[find]
        except KeyError:
            for i in ld:
                # only descend into containers (the original test was always true)
                if isinstance(ld[i], (list, dict)):
                    result = search(ld[i], find)
                    if result is not None:
                        return result
    else:
        return None


test_dict1 = {
    "blah": "blah",
    "alerts": [{"test1": "1", "test": "2"}],
    "foo": {
        "foo": "bar",
        "foo1": [{"test3": "3"}]
    }}
print(search(test_dict1, "test3"))
Find path of occurrences of a key, value pair in nested dictionaries and lists
You should use yield to create an iterator. It would make the code simpler and more efficient (especially if you're not always going to go through all occurrences). To find only the first one, you can use the next function.
def findKeys(d, key, value):
    if key in d and d[key] == value:
        yield [d["name"]]
    subLevels = ((a, v) for a, vl in d.items() if isinstance(vl, list) for v in vl)
    for attrib, subDict in subLevels:
        if not isinstance(subDict, dict):
            continue
        for path in findKeys(subDict, key, value):
            yield [d["name"]] + path
output:
for path in findKeys(d,"type","A7240XM"):
    print(path)
['/', 'md', 'level0', 'level1', 'level2', 'something1']
['/', 'md', 'level0', 'level1', 'level2', 'something2']
['/', 'md', 'level0', 'level1', 'ng', 'someother1']
['/', 'md', 'level0', 'level1', 'ng', 'findME']
['/', 'md', 'level0', 'level1', 'be', 'some123']
['/', 'md', 'level0', 'level1', 'be', 'some321']
next(findKeys(d,"name","some123"))
['/', 'md', 'level0', 'level1', 'be', 'some123']
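Since the dictionary d from the question is not reproduced here, the following self-contained sketch shows the same generator idea on a small made-up tree. The name find_paths and the tree contents are assumptions; the only requirement carried over from the answer is that every dictionary has a "name" entry and that children sit in lists.

```python
def find_paths(node, key, value):
    """Yield the chain of "name" values leading to each matching dict."""
    if key in node and node[key] == value:
        yield [node["name"]]
    for children in node.values():
        if not isinstance(children, list):
            continue
        for child in children:
            if isinstance(child, dict):
                for path in find_paths(child, key, value):
                    yield [node["name"]] + path


# Hypothetical tree; every dict carries a "name".
tree = {
    "name": "/",
    "children": [
        {"name": "md", "type": "dir", "children": [
            {"name": "a", "type": "file"},
            {"name": "b", "type": "file"},
        ]},
        {"name": "tmp", "type": "dir"},
    ],
}

all_files = list(find_paths(tree, "type", "file"))
first_file = next(find_paths(tree, "type", "file"))
# Passing a default to next() avoids StopIteration when nothing matches.
no_match = next(find_paths(tree, "type", "socket"), None)
print(all_files, first_file, no_match)
```

Calling next() stops the traversal at the first hit, which is where the efficiency gain over building a full list comes from.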
Extract fields in a dynamically nested dictionary in an ordered manner
def dict_value(val: dict):
    for key, item in val.items():
        if isinstance(item, dict):
            dict_value(item)
        else:
            print(item)


dict_value(data)
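The function above prints each field as it is reached. If the values are needed later in the same order, a generator variant may be more convenient. This is a sketch under assumptions: iter_leaf_values and the record dictionary are hypothetical, and the ordering relies on dicts preserving insertion order (Python 3.7 and later).

```python
def iter_leaf_values(val):
    """Yield non-dict values depth-first, in insertion order."""
    for key, item in val.items():
        if isinstance(item, dict):
            yield from iter_leaf_values(item)
        else:
            yield item


# Hypothetical nested record.
record = {
    "id": 7,
    "owner": {"name": "Ada", "contact": {"email": "ada@example.com"}},
    "active": True,
}

fields = list(iter_leaf_values(record))
print(fields)
```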
How to find all instances of a substring inside a nested dict that could contain more lists or lists of dicts
Lists are mutable, so the easiest approach is to replace items by index. You can get the index with enumerate:
def dict_extract(self, search_str: str, d: dict) -> None:
    ...
    for indx, item in enumerate(v):
        if isinstance(item, dict):
            self.dict_extract(search_str, item)
        if isinstance(item, str):
            v[indx] = <your new value>
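A complete, runnable version of the index-based replacement might look like the sketch below. The elided parts of the answer's method are not known, so this is a standalone function written for illustration: replace_in_place, its parameters and the sample data are all assumptions, and it uses str.replace as one possible choice for the new value.

```python
def replace_in_place(search_str, new_str, obj):
    """Replace search_str inside every string value of nested dicts/lists."""
    if isinstance(obj, dict):
        pairs = list(obj.items())       # (key, value) pairs
    elif isinstance(obj, list):
        pairs = list(enumerate(obj))    # (index, value) pairs
    else:
        return                          # leaves such as numbers: nothing to do
    for key_or_index, item in pairs:
        if isinstance(item, str):
            # Assigning by key or index mutates the container itself.
            obj[key_or_index] = item.replace(search_str, new_str)
        else:
            replace_in_place(search_str, new_str, item)


# Hypothetical sample data.
doc = {"a": "foo bar", "b": ["foo", {"c": "xfoo"}, 3]}
replace_in_place("foo", "baz", doc)
print(doc)
```

Only values are touched here; dictionary keys that contain the search string are left alone.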
You might have an easier time manipulating the dictionary as a string instead of as a Python structure?
The function below illustrates a straightforward replace. You could extend it with a regex that replaces everything between quotes " instead of just old.
import json


def replace_terms_in_dict(old, new, _dict):
    string = json.dumps(_dict)
    string = string.replace(old, new)
    _dict = json.loads(string)
    return _dict
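One thing to keep in mind with the string approach is that the replacement applies to the whole serialized text, so keys are rewritten as well as values. The short sketch below shows both effects; the config dictionary is a made-up example, and the approach only works for data that json.dumps can serialize.

```python
import json


def replace_terms_in_dict(old, new, _dict):
    # Serialize, replace on the raw text, then parse back into a dict.
    return json.loads(json.dumps(_dict).replace(old, new))


# Hypothetical sample data.
config = {"level": "DEBUG", "handlers": [{"level": "INFO"}]}

values_changed = replace_terms_in_dict("DEBUG", "WARNING", config)
keys_changed = replace_terms_in_dict("level", "severity", config)
print(values_changed)
print(keys_changed)
```

If only values should change, a structural traversal is the safer choice.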