Filter a pandas dataframe using values from a dict
IIUC, you should be able to do something like this:
>>> df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)]
A B C D
3 1 0 right 3
This works by making a Series to compare against:
>>> pd.Series(filter_v)
A 1
B 0
C right
dtype: object
Selecting the corresponding part of df1:
>>> df1[list(filter_v)]
A C B
0 1 right 1
1 0 right 1
2 1 wrong 1
3 1 right 0
4 NaN right 1
Finding where they match:
>>> df1[list(filter_v)] == pd.Series(filter_v)
A B C
0 True False True
1 False False True
2 True False False
3 True True True
4 False False True
Finding where they all match:
>>> (df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)
0 False
1 False
2 False
3 True
4 False
dtype: bool
And finally using this to index into df1:
>>> df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)]
A B C D
3 1 0 right 3
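To run the steps above end to end, here is a minimal sketch. The contents of df1 and filter_v are assumptions reconstructed from the printed output; the values in column D other than row 3 are made up.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the output shown above.
df1 = pd.DataFrame({
    "A": [1, 0, 1, 1, np.nan],
    "B": [1, 1, 1, 0, 1],
    "C": ["right", "right", "wrong", "right", "right"],
    "D": [0, 1, 2, 3, 4],  # hypothetical values; only row 3 is known
})
filter_v = {"A": 1, "B": 0, "C": "right"}

# Compare each selected column with the dict value carrying the same label,
# then keep the rows where every comparison is True.
mask = (df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)
result = df1.loc[mask]
print(result)
```

Because NaN never compares equal to anything, row 4 is dropped even though its B and C values would otherwise be candidates.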
How to iterate and filter a dataframe using dictionary in python?
Try:
dct = {"a_w": 5, "c_y": 8}
df["col3"] = (df["col1"] + "_" + df["col2"]).map(dct).fillna("")
print(df)
Prints:
col1 col2 col3
0 a w 5.0
1 b x
2 c y 8.0
3 d z
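A self-contained sketch of the same idea follows. The input frame is an assumption reconstructed from the printed output, and the final filtering step (keeping only rows whose key exists in the dict) is an illustrative addition, not part of the answer above.

```python
import pandas as pd

# Assumed input, reconstructed from the printed output above.
df = pd.DataFrame({"col1": ["a", "b", "c", "d"],
                   "col2": ["w", "x", "y", "z"]})
dct = {"a_w": 5, "c_y": 8}

# Build the lookup key per row and map it through the dict;
# keys that are missing from the dict become NaN.
mapped = (df["col1"] + "_" + df["col2"]).map(dct)
df["col3"] = mapped

# Hypothetical extra step: keep only the rows whose key was found.
matched = df[mapped.notna()]
print(matched)
```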
How to filter pandas dataframe rows based on dictionary keys and values?
Flatten the dictionary into a new dataframe, then inner merge df with the new dataframe:
df.merge(pd.DataFrame([{'Customer_ID': k, 'Category': i}
for k, v in d.items() for i in v]))
Customer_ID Category Type Delivery
0 40275 Book Buy True
1 40275 Software Sell False
2 39900 Book Buy True
3 35886 Software Sell False
4 40350 Software Sell True
5 28129 Software Buy False
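The merge approach can be sketched on a small frame. The data below and the shape of d (customer ID mapped to a list of allowed categories) are assumptions inferred from the code above, not the original input.

```python
import pandas as pd

# Hypothetical data; only the column names come from the output above.
df = pd.DataFrame({
    "Customer_ID": [40275, 40275, 39900, 35886, 40350],
    "Category": ["Book", "Software", "Book", "Book", "Software"],
    "Type": ["Buy", "Sell", "Buy", "Sell", "Sell"],
    "Delivery": [True, False, True, False, True],
})
d = {40275: ["Book", "Software"], 39900: ["Book"], 35886: ["Software"]}

# One row per allowed (Customer_ID, Category) pair.
keys = pd.DataFrame([{"Customer_ID": k, "Category": c}
                     for k, v in d.items() for c in v])

# merge defaults to an inner join on the columns both frames share,
# so only rows whose pair appears in `keys` survive.
result = df.merge(keys)
print(result)
```

Note that the merged result gets a fresh 0..n-1 index rather than keeping the row labels of df.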
Filter a pandas dataframe columns and rows using values from a dict
Solution
We can use isin to create a boolean mask, but first make sure that every value in dict_filter is list-like (a list of strings); np.atleast_1d wraps scalar values for that purpose:
d = {k: np.atleast_1d(v) for k, v in dict_filter.items()}
df[df[list(d)].isin(d).all(axis=1)]
id A B C
1 2 high medium bottom
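A runnable sketch of this approach is shown below. Both the frame and dict_filter are hypothetical, chosen so that the single surviving row matches the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical data and filter.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "A": ["low", "high", "high"],
    "B": ["medium", "medium", "low"],
    "C": ["top", "bottom", "bottom"],
})
dict_filter = {"A": "high", "B": "medium", "C": ["bottom", "top"]}

# Wrap scalars in 1-d arrays so every value is list-like.
d = {k: np.atleast_1d(v) for k, v in dict_filter.items()}

# DataFrame.isin with a dict checks each column against its own values;
# a row is kept only if all filtered columns match.
mask = df[list(d)].isin(d).all(axis=1)
result = df[mask]
print(result)
```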
Filter dataframe using dictionary with multiple values
Use a comprehension over the dict to build one Series.isin mask per column name, then combine the masks with the np.logical_and.reduce trick.
Note: when using isin, every value in the dict has to be a list.
df = df[np.logical_and.reduce([df[k].isin(v) for k, v in sidebars.items()])]
print (df)
source_number location category
0 11199 loc2 cat1
3 32345 loc1 cat3
4 12342 loc2 cat3
5 1232 loc2 cat3
7 123244 loc2 cat3
If the dict can contain both scalars and lists, use if-else in the list comprehension and test the scalars with Series.eq:
# let's say the created dictionary has the values below
sidebars = {"location":["loc1","loc2"],"category":"cat3"}
L = [df[k].isin(v) if isinstance(v, list) else df[k].eq(v) for k, v in sidebars.items()]
df = df[np.logical_and.reduce(L)]
print (df)
source_number location category
3 32345 loc1 cat3
4 12342 loc2 cat3
5 1232 loc2 cat3
7 123244 loc2 cat3
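The mixed scalar/list variant can be tried on a small frame. The rows below are hypothetical; only the column names and the sidebars dict follow the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical rows.
df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342],
    "location": ["loc2", "loc1", "loc3", "loc1", "loc2"],
    "category": ["cat1", "cat2", "cat1", "cat3", "cat3"],
})
sidebars = {"location": ["loc1", "loc2"], "category": "cat3"}

# One boolean Series per dict key: isin for lists, eq for scalars.
masks = [df[k].isin(v) if isinstance(v, list) else df[k].eq(v)
         for k, v in sidebars.items()]

# Element-wise AND across all masks.
result = df[np.logical_and.reduce(masks)]
print(result)
```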
EDIT: If some keys of the dict may not match any column, you can skip them (the dataframe is then simply not filtered by the unmatched keys):
# lists only
L = [df[k].isin(v) for k, v in sidebars.items() if k in df.columns]
# scalars or lists
L = [df[k].isin(v) if isinstance(v, list)
     else df[k].eq(v)
     for k, v in sidebars.items() if k in df.columns]
df = df[np.logical_and.reduce(L)]
EDIT:
This is my first time writing streamlit code, so there may be better solutions. The problem here is that an empty dictionary gets passed, so check for it with if bool(sidebars):
is_check = st.checkbox("Display Data")
if is_check:
    st.table(df)

columns = st.sidebar.multiselect("Enter the variables", df.columns)

sidebars = {}
for y in columns:
    ucolumns = list(df[y].unique())
    print(ucolumns)
    sidebars[y] = st.sidebar.multiselect('Filter ' + y, ucolumns)

if bool(sidebars):
    L = [df[k].isin(v) if isinstance(v, list)
         else df[k].eq(v)
         for k, v in sidebars.items() if k in df.columns]
    df1 = df[np.logical_and.reduce(L)]
    st.table(df1)
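The empty-dictionary guard can also be packaged outside of streamlit. The sketch below uses a hypothetical helper, filter_by_dict, and made-up data; it is one possible arrangement, not the answer's own code.

```python
import numpy as np
import pandas as pd

def filter_by_dict(df, conditions):
    """Hypothetical helper: keep rows matching every usable condition."""
    masks = [
        df[k].isin(v) if isinstance(v, list) else df[k].eq(v)
        for k, v in conditions.items()
        if k in df.columns  # ignore keys that are not columns
    ]
    if not masks:  # empty dict, or no key matched a column
        return df
    return df[np.logical_and.reduce(masks)]

# Hypothetical data.
df = pd.DataFrame({"location": ["loc1", "loc2", "loc3"],
                   "category": ["cat3", "cat3", "cat1"]})

unfiltered = filter_by_dict(df, {})
filtered = filter_by_dict(
    df, {"location": ["loc1", "loc2"], "category": "cat3", "missing": 1})
print(filtered)
```

Testing the list of masks instead of the dict itself also covers the case where the dict is non-empty but none of its keys is a column.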
Filtering pandas dataframe using dictionary for column values
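You can check with isin, after turning each (id, Rank) row into a tuple of strings and comparing it with the items of the dict: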
df_x[df_x[['id','Rank']].astype(str).apply(tuple,1).isin(filter_dict.items())]
Out[182]:
id B Rank D
0 1 1 1 1
5 2 0 3 6
7 3 0 2 8
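Here is a small sketch of the tuple comparison. The frame and filter_dict are assumptions; in particular, the dict is assumed to map id strings to Rank strings, which is why the columns are cast with astype(str).

```python
import pandas as pd

# Hypothetical data; filter_dict is assumed to map id -> Rank as strings.
df_x = pd.DataFrame({"id": [1, 1, 2, 2, 3],
                     "B": [1, 0, 0, 1, 0],
                     "Rank": [1, 2, 3, 1, 2]})
filter_dict = {"1": "1", "2": "3", "3": "2"}

# Each row becomes a tuple such as ("1", "1"), directly comparable
# with the (key, value) pairs of the dict.
pairs = df_x[["id", "Rank"]].astype(str).apply(tuple, axis=1)
mask = pairs.isin(list(filter_dict.items()))
result = df_x[mask]
print(result)
```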
Filter pandas dataframe by dictionary key value pairs
List comprehension and boolean indexing with concat
df_new = pd.concat([df[(df['col1'] == k) & (df['col3'] > v)] for k,v in filter_dict.items()])
col1 col2 col3
1 1 2 6
2 1 2 7
4 2 2 9
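A self-contained version of this approach is sketched below. The frame and filter_dict are assumptions, chosen so that the surviving rows match the output above: each key is a col1 value and each value is a lower bound (exclusive) on col3.

```python
import pandas as pd

# Assumed input, consistent with the printed output above.
df = pd.DataFrame({"col1": [1, 1, 1, 2, 2],
                   "col2": [2, 2, 2, 2, 2],
                   "col3": [5, 6, 7, 8, 9]})
filter_dict = {1: 5, 2: 8}

# One filtered slice per (key, threshold) pair, stacked back together.
df_new = pd.concat([df[(df["col1"] == k) & (df["col3"] > v)]
                    for k, v in filter_dict.items()])
print(df_new)
```

Rows come back grouped in the iteration order of the dict, which may differ from their order in df.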
Filtering a dataframe using a dictionary's values
Use DataFrame.filter with regex, joining the values with | (regex "or"). For key C, for example, this selects the columns whose names contain B or E or F:
d = {'A':['A'], 'B':['A', 'B', 'C'], 'C':['B', 'E', 'F']}
dfs = {k:df.filter(regex='|'.join(v)) for k, v in d.items()}
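The following sketch applies the dictionary to a one-row frame with hypothetical single-letter columns, to show which columns each key ends up with.

```python
import pandas as pd

# Hypothetical frame with columns A..F.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=list("ABCDEF"))
d = {'A': ['A'], 'B': ['A', 'B', 'C'], 'C': ['B', 'E', 'F']}

# For each key, keep the columns whose name matches any listed value.
dfs = {k: df.filter(regex='|'.join(v)) for k, v in d.items()}

for key, sub in dfs.items():
    print(key, list(sub.columns))
```

The regex is applied as a search on each column name, so with longer names a value such as A would also match a column called AB; anchoring the pattern or escaping the values with re.escape is worth considering in that case.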