Search for String in all Pandas DataFrame columns and filter
The Series.str.contains method expects a regex pattern (by default), not a literal string. Therefore str.contains("^") matches the beginning of any string. Since every string has a beginning, everything matches. Instead, use str.contains(r"\^") to match the literal ^ character.
To check every column, you could use for col in df to iterate through the column names, and then call str.contains on each column:
mask = np.column_stack([df[col].str.contains(r"\^", na=False) for col in df])
df.loc[mask.any(axis=1)]
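Alternatively, you could pass regex=False to str.contains to make the test a plain substring check (like the Python in operator), so the caret needs no escaping. Which of the two is faster can depend on the data and the pandas version, so time both if performance matters.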
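As a rough, self-contained sketch of both variants: the sample frame and its column names below are made up for illustration, and only row 1 contains a literal caret.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only row 1 contains a literal "^".
df = pd.DataFrame({"a": ["x", "y^z", "w"], "b": ["p", "q", "r"]})

# Regex variant: the caret is escaped so it is matched literally.
mask_regex = np.column_stack(
    [df[col].str.contains(r"\^", na=False) for col in df]
)

# Literal variant: regex=False, so no escaping is needed.
mask_literal = np.column_stack(
    [df[col].str.contains("^", regex=False, na=False) for col in df]
)

# Keep the rows where at least one column matched.
rows_regex = df.loc[mask_regex.any(axis=1)]
rows_literal = df.loc[mask_literal.any(axis=1)]

# The unescaped regex "^" matches every string, as described above.
matches_everything = df["a"].str.contains("^").all()
```

Both masks select the same row here; the unescaped pattern is included only to show why it matches everything.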
How to filter dataframe columns between two rows that contain specific string in column?
If both values are present (and each appears only once), you can temporarily set "String" as the index and slice by label:
df.set_index('String').loc['Start':'End'].reset_index()
output:
String Value
0 Start 65
1 Orange 33
2 Purple 65
3 Teal 34
4 Indigo 44
5 End 32
Alternatively, use isin with a cumulative sum; this keeps the original index, and the order of Start and End doesn't matter:
m = df['String'].isin(['Start', 'End']).cumsum().eq(1)
df[m|m.shift()]
output:
String Value
3 Start 65
4 Orange 33
5 Purple 65
6 Teal 34
7 Indigo 44
8 End 32
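To see why the cumulative-sum mask works, here is a small sketch. The rows outside the Start/End block (and their values) are invented for illustration, and shift is given fill_value=False so the shifted mask stays boolean; that argument is an addition, not part of the answer above.

```python
import pandas as pd

# Hypothetical data: the rows before "Start" and after "End" are made up.
df = pd.DataFrame({
    "String": ["Red", "Blue", "Green", "Start", "Orange",
               "Purple", "Teal", "Indigo", "End", "Pink"],
    "Value": [1, 2, 3, 65, 33, 65, 34, 44, 32, 9],
})

# Counts how many markers have been seen so far: 0 before the first marker,
# 1 from the first marker up to (not including) the second, 2 afterwards.
seen = df["String"].isin(["Start", "End"]).cumsum()

# True from the first marker up to the row before the second marker.
m = seen.eq(1)

# Shifting the mask down by one row pulls in the second marker as well.
between = df[m | m.shift(fill_value=False)]
```

Because the mask only counts markers, swapping the positions of Start and End in the data selects the same block of rows.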
Searching for string in all columns of dataframe in Python
Create a boolean DataFrame, check for at least one True per row with DataFrame.any, and filter by boolean indexing:
df = df[df.eq('a').any(axis=1)]
print (df)
A B
0 a b
2 e a
Detail:
print (df.eq('a'))
A B
0 True False
1 False False
2 False True
print(df.eq('a').any(axis=1))
0 True
1 False
2 True
dtype: bool
If you want to check for substrings, use str.contains to build the boolean DataFrame:
df = pd.DataFrame([['ad', 'b'], ['c', 'd'], ['e', 'asw']], columns=["A", "B"])
print (df)
A B
0 ad b
1 c d
2 e asw
df = df[df.apply(lambda x: x.str.contains('a')).any(axis=1)]
Or use applymap for element-wise checking with the in operator (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):
df = df[df.applymap(lambda x: 'a' in x).any(axis=1)]
print (df)
A B
0 ad b
2 e asw
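Both variants above assume every column holds strings. If the frame also has numeric columns, one possible workaround (an assumption here, not part of the original answer) is to cast everything to text first. The extra column N is hypothetical.

```python
import pandas as pd

# Same sample as above plus a hypothetical numeric column "N".
df = pd.DataFrame({
    "A": ["ad", "c", "e"],
    "B": ["b", "d", "asw"],
    "N": [1, 2, 3],
})

# Cast every column to str so the .str accessor works on all of them.
# Caveat: astype(str) also stringifies missing values (NaN becomes "nan"),
# so fill or drop them first if that could cause false matches.
as_text = df.astype(str)

# Literal substring test per column, then "any column matched" per row.
mask = as_text.apply(lambda col: col.str.contains("a", regex=False)).any(axis=1)

result = df[mask]
```

The filter is computed on the text copy but applied to the original frame, so the numeric column keeps its dtype in the result.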
Pandas filter dataframe columns through substring match
You can apply a function row by row (axis=1):
>>> df[df.apply(lambda x: x['Name'].lower() in x['Fname'].lower(), axis=1)]
Name Age Fname
1 Bob 12 Bob
2 Clarke 13 clarke
str.contains takes a constant as its first argument, pat, not a Series, so it cannot compare two columns row by row.
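If apply with axis=1 turns out to be slow on a larger frame, a plain list comprehension over the two columns is a common alternative. This is only a sketch; the first row of the sample data is invented, since the original input is not shown.

```python
import pandas as pd

# Hypothetical input; row 0 is made up, rows 1 and 2 mirror the output above.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Clarke"],
    "Age": [11, 12, 13],
    "Fname": ["Al", "Bob", "clarke"],
})

# Case-insensitive test: is Name a substring of Fname in the same row?
mask = [
    name.lower() in fname.lower()
    for name, fname in zip(df["Name"], df["Fname"])
]

# A list of booleans of the right length works as a row filter.
result = df[mask]
```

This assumes both columns contain only strings; missing values would need to be handled before calling lower().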
How to filter rows containing a string pattern from a Pandas dataframe
In [3]: df[df['ids'].str.contains("ball")]
Out[3]:
ids vals
0 aball 1
1 bball 2
3 fball 4
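If the column can contain missing values, the mask from str.contains may contain missing entries too, which boolean indexing does not accept. A minimal sketch, using made-up data with one missing id, passes na=False so missing values count as non-matches:

```python
import pandas as pd

# Hypothetical data; row 2 has a missing id.
df = pd.DataFrame({
    "ids": ["aball", "bball", None, "fball"],
    "vals": [1, 2, 3, 4],
})

# na=False treats missing values as "no match", so the mask is purely boolean.
mask = df["ids"].str.contains("ball", na=False)

result = df[mask]
```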
Efficient way to search string contains in multiple columns using pandas
You can do this with a lambda function that applies str.contains to each of the columns:
In [40]: df[['test_string_1', 'test_string_2']].apply(lambda x: x.str.contains('Rajini|God|Thalaivar',case=False)).any(axis=1).astype(int)
Out[40]:
0 1
1 1
2 0
3 1
4 0
5 1
dtype: int64
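When the keywords come from a list rather than a hand-written pattern, it may be safer to escape each one before joining them with |, so that characters such as . are matched literally. The data and the keyword list below are hypothetical.

```python
import re

import pandas as pd

# Hypothetical sample data and keyword list.
df = pd.DataFrame({
    "test_string_1": ["Rajini is great", "hello", "a.b", "axb"],
    "test_string_2": ["x", "god mode", "y", "z"],
})
keywords = ["Rajini", "God", "a.b"]

# Escape each keyword so regex metacharacters are matched literally,
# then join them into a single alternation pattern.
pattern = "|".join(re.escape(k) for k in keywords)

cols = ["test_string_1", "test_string_2"]
flags = (
    df[cols]
    .apply(lambda col: col.str.contains(pattern, case=False, na=False))
    .any(axis=1)
    .astype(int)
)
```

Without re.escape, the keyword a.b would also match axb, because an unescaped dot matches any character.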
filtering data in pandas where string is in multiple columns
new_df_1 = df[df.team_1 =='ENG'][['team_1', 'score_1']]
new_df_1 =new_df_1.rename(columns={"team_1":"team", "score_1":"score"})
# team score
# 0 ENG 1
new_df_2 = df[df.team_2 =='ENG'][['team_2', 'score_2']]
new_df_2 = new_df_2.rename(columns={"team_2":"team", "score_2":"score"})
# team score
# 1 ENG 2
Then concatenate the two DataFrames:
pd.concat([new_df_1, new_df_2])
The output is:
team score
0 ENG 1
1 ENG 2
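The same steps can be written as a loop over the column suffixes, which avoids repeating the filter and rename for each team column. This is a sketch on invented data; the values for the non-ENG teams are assumptions.

```python
import pandas as pd

# Hypothetical match data; only the ENG rows mirror the output above.
df = pd.DataFrame({
    "team_1": ["ENG", "AUS"],
    "score_1": [1, 3],
    "team_2": ["IND", "ENG"],
    "score_2": [0, 2],
})

parts = []
for i in (1, 2):
    team_col, score_col = f"team_{i}", f"score_{i}"
    # Rows where this team column is ENG, keeping only its own score column.
    part = df.loc[df[team_col] == "ENG", [team_col, score_col]]
    # Give both pieces the same column names so they stack cleanly.
    parts.append(part.rename(columns={team_col: "team", score_col: "score"}))

result = pd.concat(parts)
```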
Filter pandas dataframe if value of column is within a string
You can use .apply with the in operator:
s = "ZA1127B.48"
print(df[df.apply(lambda x: x.Part_Number in s, axis=1)])
Prints:
Part_Number
0 A1127
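Since only one column is involved, Series.map on that column may be a simpler option than a row-wise apply. In the sketch below, the rows other than A1127 are invented, and the isinstance check is an added guard against non-string values.

```python
import pandas as pd

# Hypothetical data; only the first row mirrors the output above.
df = pd.DataFrame({"Part_Number": ["A1127", "B4455", "C9"]})
s = "ZA1127B.48"

# True where the part number is a string that occurs inside s.
mask = df["Part_Number"].map(lambda p: isinstance(p, str) and p in s)

result = df[mask]
```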