Filter a Pandas Dataframe Using Values from a Dict

Filter a pandas dataframe using values from a dict

If I understand correctly, you should be able to do something like this:

>>> df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)]
A B C D
3 1 0 right 3

This works by making a Series to compare against:

>>> pd.Series(filter_v)
A 1
B 0
C right
dtype: object

Selecting the corresponding part of df1:

>>> df1[list(filter_v)]
A C B
0 1 right 1
1 0 right 1
2 1 wrong 1
3 1 right 0
4 NaN right 1

Finding where they match:

>>> df1[list(filter_v)] == pd.Series(filter_v)
A B C
0 True False True
1 False False True
2 True False False
3 True True True
4 False False True

Finding where they all match:

>>> (df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)
0 False
1 False
2 False
3 True
4 False
dtype: bool

And finally using this to index into df1:

>>> df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)]
A B C D
3 1 0 right 3
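
For a version that runs end to end, here is a minimal sketch. The contents of df1 and filter_v are reconstructed from the output printed above; the D values other than the 3 in row 3 are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data reconstructed from the printed output above;
# only D == 3 in row 3 is known, the other D values are made up.
df1 = pd.DataFrame({
    "A": [1, 0, 1, 1, np.nan],
    "B": [1, 1, 1, 0, 1],
    "C": ["right", "right", "wrong", "right", "right"],
    "D": [0, 1, 2, 3, 4],
})
filter_v = {"A": 1, "B": 0, "C": "right"}

cols = list(filter_v)  # the columns named by the dict keys
# Compare each selected column with its target value, then require
# every comparison in a row to be True.
mask = (df1[cols] == pd.Series(filter_v)).all(axis=1)
result = df1.loc[mask]
print(result)
```

Because both the column selection and the Series are built from the same dict, their labels line up in the same order, which the == comparison relies on.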

How to iterate and filter a dataframe using dictionary in python?

Try:

dct = {"a_w": 5, "c_y": 8}

df["col3"] = (df["col1"] + "_" + df["col2"]).map(dct).fillna("")
print(df)

Prints:

  col1 col2 col3
0    a    w  5.0
1    b    x
2    c    y  8.0
3    d    z
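
Since the question asks about filtering, the same mapped Series can also serve as a row filter. The sketch below rebuilds the frame from the printed result (the input values are assumptions) and keeps only the rows whose col1_col2 key appears in dct.

```python
import pandas as pd

# Hypothetical input matching the printed result above.
df = pd.DataFrame({"col1": ["a", "b", "c", "d"],
                   "col2": ["w", "x", "y", "z"]})
dct = {"a_w": 5, "c_y": 8}

# Build the "col1_col2" key per row and look it up in the dict;
# rows without a matching key become NaN.
mapped = (df["col1"] + "_" + df["col2"]).map(dct)

# Keep only the rows that had a match and attach the looked-up value.
matched = df[mapped.notna()].assign(col3=mapped)
print(matched)
```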

How to filter pandas dataframe rows based on dictionary keys and values?

Flatten the dictionary into a new dataframe of (Customer_ID, Category) pairs, then inner-merge df with it:

df.merge(pd.DataFrame([{'Customer_ID': k, 'Category': i} 
for k, v in d.items() for i in v]))


   Customer_ID  Category  Type  Delivery
0        40275      Book   Buy      True
1        40275  Software  Sell     False
2        39900      Book   Buy      True
3        35886  Software  Sell     False
4        40350  Software  Sell      True
5        28129  Software   Buy     False
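
The answer does not show df or d, so here is a small self-contained sketch of the same merge with made-up data. It assumes d maps each Customer_ID to a list of allowed categories.

```python
import pandas as pd

# Hypothetical data: the original df and d are not shown in the answer.
df = pd.DataFrame({
    "Customer_ID": [40275, 40275, 39900, 39900],
    "Category": ["Book", "Software", "Book", "Software"],
    "Type": ["Buy", "Sell", "Buy", "Sell"],
    "Delivery": [True, False, True, False],
})
d = {40275: ["Book", "Software"], 39900: ["Book"]}

# One row per allowed (Customer_ID, Category) pair.
allowed = pd.DataFrame(
    [{"Customer_ID": k, "Category": c} for k, v in d.items() for c in v]
)

# merge defaults to an inner join on the shared column names,
# so only rows of df whose pair appears in `allowed` survive.
result = df.merge(allowed)
print(result)
```

Note that merge returns a fresh 0..n-1 index, so the original row labels of df are not preserved.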

Filter a pandas dataframe columns and rows using values from a dict

Solution

We can use isin to create a boolean mask, but first make sure that every value in dict_filter is list-like (np.atleast_1d wraps scalars and leaves lists as they are):

d = {k: np.atleast_1d(v) for k, v in dict_filter.items()}
df[df[list(d)].isin(d).all(axis=1)]


   id     A       B       C
1   2  high  medium  bottom
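
A runnable sketch of the same idea follows. The frame and dict_filter are invented for illustration, mixing scalar and list values to show why the np.atleast_1d step matters.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the original df and dict_filter are not shown.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "A": ["low", "high", "high"],
    "B": ["medium", "medium", "high"],
    "C": ["top", "bottom", "bottom"],
})
dict_filter = {"A": "high", "B": ["medium", "low"], "C": "bottom"}

# np.atleast_1d turns a scalar into a 1-element array and a list into
# an array, so every value is list-like as DataFrame.isin expects.
d = {k: np.atleast_1d(v) for k, v in dict_filter.items()}

# One boolean column per key; keep rows where all of them are True.
result = df[df[list(d)].isin(d).all(axis=1)]
print(result)
```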

filter dataframe using dictionary with multiple values

Use a list comprehension to build one Series.isin mask per dictionary key (column name), then combine the masks with np.logical_and.reduce:

Note: when isin is used for every key, all values in the dict have to be lists.

df = df[np.logical_and.reduce([df[k].isin(v) for k, v in sidebars.items()])]
print (df)
source_number location category
0 11199 loc2 cat1
3 32345 loc1 cat3
4 12342 loc2 cat3
5 1232 loc2 cat3
7 123244 loc2 cat3

If the dict can contain both scalars and lists, use if-else in the list comprehension and test the scalars with Series.eq:

# let's say the created dictionary has the values below
sidebars = {"location":["loc1","loc2"],"category":"cat3"}

L = [df[k].isin(v) if isinstance(v, list) else df[k].eq(v) for k, v in sidebars.items()]
df = df[np.logical_and.reduce(L)]
print (df)
source_number location category
3 32345 loc1 cat3
4 12342 loc2 cat3
5 1232 loc2 cat3
7 123244 loc2 cat3
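
To try this without the original data, here is a minimal sketch with a made-up frame (the rows are assumptions, only the column names come from the output above).

```python
import numpy as np
import pandas as pd

# Hypothetical data; only part of the original frame is visible above.
df = pd.DataFrame({
    "source_number": [11199, 11328, 32345, 12342],
    "location": ["loc2", "loc3", "loc1", "loc2"],
    "category": ["cat1", "cat3", "cat3", "cat3"],
})
sidebars = {"location": ["loc1", "loc2"], "category": "cat3"}

# One boolean Series per key: isin for lists, eq for scalars.
masks = [
    df[k].isin(v) if isinstance(v, list) else df[k].eq(v)
    for k, v in sidebars.items()
]

# logical_and.reduce ANDs the masks element-wise into one boolean array.
combined = np.logical_and.reduce(masks)
result = df[combined]
print(result)
```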

EDIT: If some keys of the dict may not match any column, they can be skipped (the data is then simply not filtered by those keys):

# Variant 1: every value in the dict is a list
L = [df[k].isin(v) for k, v in sidebars.items() if k in df.columns]

# Variant 2: values may be lists or scalars
L = [df[k].isin(v) if isinstance(v, list)
     else df[k].eq(v)
     for k, v in sidebars.items() if k in df.columns]

df = df[np.logical_and.reduce(L)]

EDIT:

This is my first time writing Streamlit code, so there may be better solutions. The filter fails if an empty dictionary is passed.

So check for that with if bool(sidebars):

import numpy as np
import streamlit as st

# df is the DataFrame loaded earlier in the app
is_check = st.checkbox("Display Data")
if is_check:
    st.table(df)

columns = st.sidebar.multiselect("Enter the variables", df.columns)

sidebars = {}
for y in columns:
    ucolumns = list(df[y].unique())
    print(ucolumns)

    sidebars[y] = st.sidebar.multiselect('Filter ' + y, ucolumns)

# Only filter when at least one column was selected
if bool(sidebars):
    L = [df[k].isin(v) if isinstance(v, list)
         else df[k].eq(v)
         for k, v in sidebars.items() if k in df.columns]

    df1 = df[np.logical_and.reduce(L)]
    st.table(df1)
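
The empty-dictionary problem comes from np.logical_and.reduce([]) returning the scalar True instead of a boolean mask. One way to keep the guard in a single place is a small helper; filter_by_dict is a hypothetical name, and the sketch below leaves Streamlit out so it can be run on its own.

```python
import numpy as np
import pandas as pd

def filter_by_dict(df, filters):
    """Hypothetical helper: AND together one mask per usable key."""
    masks = [
        df[k].isin(v) if isinstance(v, list) else df[k].eq(v)
        for k, v in filters.items()
        if k in df.columns  # ignore keys with no matching column
    ]
    if not masks:  # empty dict, or no key matched a column
        return df
    return df[np.logical_and.reduce(masks)]

# Made-up data for illustration.
df = pd.DataFrame({"location": ["loc1", "loc2", "loc3"],
                   "category": ["cat3", "cat1", "cat3"]})

unfiltered = filter_by_dict(df, {})
filtered = filter_by_dict(df, {"category": "cat3", "missing": 1})
print(unfiltered)
print(filtered)
```

Returning df unchanged when no mask applies is a design choice made here; an app might instead prefer to show an empty table or a message.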

Filtering pandas dataframe using dictionary for column values

You can check with isin

df_x[df_x[['id','Rank']].astype(str).apply(tuple,1).isin(filter_dict.items())]
Out[182]:
id B Rank D
0 1 1 1 1
5 2 0 3 6
7 3 0 2 8
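
As an alternative to building tuples, the same mask can be sketched with Series.map: look up the required Rank for each id and compare it with the actual Rank. This is a different technique from the one above, and the data and the string-keyed filter_dict are assumptions, since the originals are not shown.

```python
import pandas as pd

# Hypothetical data; filter_dict is assumed to map id -> Rank as strings,
# which is what the astype(str) in the answer above suggests.
df_x = pd.DataFrame({
    "id": [1, 1, 2, 2, 3],
    "Rank": [1, 2, 3, 1, 2],
})
filter_dict = {"1": "1", "2": "3", "3": "2"}

# Rank required for each row's id (NaN if the id is not in the dict).
wanted = df_x["id"].astype(str).map(filter_dict)

# Keep rows whose actual Rank equals the required one.
mask = wanted.eq(df_x["Rank"].astype(str))
result = df_x[mask]
print(result)
```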

Filter pandas dataframe by dictionary key value pairs

List comprehension and boolean indexing with concat

df_new = pd.concat([df[(df['col1'] == k) & (df['col3'] > v)] for k,v in filter_dict.items()])

col1 col2 col3
1 1 2 6
2 1 2 7
4 2 2 9
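
Here is a runnable sketch of this approach. The frame and filter_dict are assumptions chosen to reproduce the output above, with each key read as a col1 value and each value as a lower bound on col3.

```python
import pandas as pd

# Hypothetical data chosen to reproduce the printed output above.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2],
    "col2": [2, 2, 2, 2, 2],
    "col3": [5, 6, 7, 8, 9],
})
filter_dict = {1: 5, 2: 8}  # col1 value -> col3 must be greater than this

# One filtered slice per (key, threshold) pair, stacked back together.
df_new = pd.concat(
    [df[(df["col1"] == k) & (df["col3"] > v)] for k, v in filter_dict.items()]
)
print(df_new)
```

pd.concat raises a ValueError when given an empty list, so an empty filter_dict needs to be handled separately.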

Filtering a dataframe using a dictionary's values

Use DataFrame.filter with the regex parameter, joining each list of values with | (regex "or"). For example, for key C this selects the columns whose names contain B, E or F:

d = {'A':['A'], 'B':['A', 'B', 'C'], 'C':['B', 'E', 'F']}

dfs = {k:df.filter(regex='|'.join(v)) for k, v in d.items()}
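
A short sketch with a made-up frame shows what each entry of dfs contains. The column names are assumptions; the second dict uses filter(items=...) as an exact-name variant, since the regex form matches any column name that merely contains one of the values.

```python
import pandas as pd

# Hypothetical frame with single-letter column names.
df = pd.DataFrame({c: [1, 2] for c in "ABCEF"})
d = {"A": ["A"], "B": ["A", "B", "C"], "C": ["B", "E", "F"]}

# "B|E|F" matches any column whose name contains B, E or F.
dfs = {k: df.filter(regex="|".join(v)) for k, v in d.items()}
print(list(dfs["C"].columns))

# For exact column names, filter(items=...) avoids substring matches.
exact = {k: df.filter(items=v) for k, v in d.items()}
print(list(exact["C"].columns))
```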

