Check if certain value is contained in a dataframe column in pandas
I think you need str.contains if you want the rows where the values of column date contain the string 07311954:
print(df[df['date'].astype(str).str.contains('07311954')])
Or, if the date column is already of string type:
print(df[df['date'].str.contains('07311954')])
If you want to check the last 4 digits for the string 1954 in column date:
print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
Sample:
print(df['date'])
0 8152007
1 9262007
2 7311954
3 2252011
4 2012011
5 2012011
6 2222011
7 2282011
Name: date, dtype: int64
print(df['date'].astype(str).str[-4:].str.contains('1954'))
0 False
1 False
2 True
3 False
4 False
5 False
6 False
7 False
Name: date, dtype: bool
print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
cmte_id trans_typ entity_typ state employer occupation date \
2 C00119040 24K CCM MD NaN NaN 7311954
amount fec_id cand_id
2 1000 C00140715 H2MD05155
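The filters above can be tried on a small frame. This is a minimal sketch with hypothetical data that mirrors the integer date column from the sample. An integer column does not keep a leading zero, so the value is stored as 7311954; the sketch assumes that zero-padding to 8 characters with str.zfill is acceptable when the search string is 07311954. It also uses str.endswith as an alternative for the last-4-digits check.

```python
import pandas as pd

# Hypothetical sample data mirroring the integer 'date' column above.
df = pd.DataFrame({'date': [8152007, 9262007, 7311954, 2252011]})

# The .str accessor needs string data, so convert first.
dates = df['date'].astype(str)

# Literal substring match (regex=False avoids regex interpretation).
contains_mask = dates.str.contains('7311954', regex=False)

# Restore the leading zero before searching for the 8-character form.
padded_mask = dates.str.zfill(8).str.contains('07311954', regex=False)

# Match only on the last four characters.
ends_mask = dates.str.endswith('1954')

print(df[contains_mask])
print(df[padded_mask])
print(df[ends_mask])
```

All three masks select only the row holding 7311954.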
Check if value is in Pandas dataframe column
You don't need an if statement or a loop. You can use Series.eq together with any to check whether any row has -1 in this column:
In [990]: df['PositionEMA25M50M'].eq(-1).any()
Out[990]: True
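As a self-contained sketch of the same check, assuming a hypothetical frame with a few made-up values in that column, the result of any() can be used directly in an if statement:

```python
import pandas as pd

# Hypothetical values; only the column name comes from the answer above.
df = pd.DataFrame({'PositionEMA25M50M': [0, 1, -1, 0]})

# eq(-1) builds a boolean Series; any() reduces it to a single truth value.
has_minus_one = df['PositionEMA25M50M'].eq(-1).any()

if has_minus_one:
    print('At least one row equals -1')
```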
Pandas check if splitted dataframe's field contains value
You can use .str.contains() to check whether a word occurs in a string:
df[df['Hobby'].str.contains('Sport')]
EDIT:
A plain substring match also hits values such as 'Extreme Sport', so we need to split the hobbies to get a new dataframe with one row per person per hobby.
Then we can safely filter on the exact hobby.
import pandas as pd

SEARCHED_HOBBY = 'Sport'
df = pd.DataFrame({'Name': ['Tom', 'Mark', 'John'], 'Hobby': ['Food,Sport,Art,Extreme Sport','Sport', 'Coding,Books']})
# Turn each comma-separated string into a list, then into one row per hobby.
df['Hobby'] = df['Hobby'].str.split(',')
splitted_hobbies = df.explode('Hobby')
# Keep only exact matches, then select the people who have one.
splitted_hobbies = splitted_hobbies[splitted_hobbies['Hobby'] == SEARCHED_HOBBY]
df = df[df['Name'].isin(splitted_hobbies['Name'])]
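For comparison, here is a sketch of an alternative that skips explode and tests list membership on each split value. It is not the approach from the answer, and the extra person 'Anna' is hypothetical, added only to show where the substring match and the exact match differ.

```python
import pandas as pd

SEARCHED_HOBBY = 'Sport'
df = pd.DataFrame({
    # 'Anna' is a hypothetical extra row whose only hobby is 'Extreme Sport'.
    'Name': ['Tom', 'Mark', 'John', 'Anna'],
    'Hobby': ['Food,Sport,Art,Extreme Sport', 'Sport', 'Coding,Books', 'Extreme Sport'],
})

# Substring match: also selects Anna because 'Extreme Sport' contains 'Sport'.
substring_match = df[df['Hobby'].str.contains(SEARCHED_HOBBY, regex=False)]

# Exact match against each comma-separated item.
exact_mask = df['Hobby'].str.split(',').apply(lambda hobbies: SEARCHED_HOBBY in hobbies)
exact_match = df[exact_mask]

print(substring_match['Name'].tolist())
print(exact_match['Name'].tolist())
```

The membership test keeps the original frame intact, while explode is handy when the per-hobby rows are needed for something else as well.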
How to check if a pandas column has anything other than specified values?
We can filter on L and R, then invert that filter with the ~ operator:
df[~(df['A'].isin(['L', 'R']))]
To get a boolean value indicating whether additional values are present in the Series, we can write:
len(df[~(df['A'].isin(['L', 'R']))]) > 0
We can make this shorter and quicker with the pandas.Series.any method, which also returns a boolean value. Note that the ~ has to be applied to the isin result before any is called:
(~df['A'].isin(['L', 'R'])).any()
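A short sketch of this check on hypothetical data, where 'X' plays the role of the unexpected value. It also lists which values fall outside the allowed set:

```python
import pandas as pd

# Hypothetical data; 'X' is the value outside the allowed set.
df = pd.DataFrame({'A': ['L', 'R', 'L', 'X']})
allowed = ['L', 'R']

# True for every row whose value is NOT one of the allowed values.
unexpected_mask = ~df['A'].isin(allowed)

# Single boolean: is there at least one unexpected value?
has_unexpected = unexpected_mask.any()

# The distinct offending values, useful for an error message.
unexpected_values = df.loc[unexpected_mask, 'A'].unique().tolist()

print(has_unexpected)
print(unexpected_values)
```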
Check if string is in a pandas dataframe
a['Names'].str.contains('Mel')
will return an indicator vector of boolean values with one entry per row of a.
Therefore, you can use:
mel_count = a['Names'].str.contains('Mel').sum()
if mel_count > 0:
    print("There are {m} Mels".format(m=mel_count))
Or use any() if you don't care how many records match your query:
if a['Names'].str.contains('Mel').any():
    print("Mel is there")
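One caveat worth a sketch: if the column has missing values, str.contains returns a missing result for those rows unless the na parameter is set. The frame below is hypothetical and assumes missing names should count as non-matches.

```python
import pandas as pd

# Hypothetical names, including one missing value.
a = pd.DataFrame({'Names': ['Mel', 'Melissa', 'Bob', None]})

# na=False makes missing names count as "no match" instead of NaN.
matches = a['Names'].str.contains('Mel', na=False)

mel_count = int(matches.sum())
if mel_count > 0:
    print("There are {m} Mels".format(m=mel_count))
```

Since 'Mel' is a substring test, 'Melissa' is counted as well.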
If a dataframe contains a value in a column, how to perform a calculation in another column? - Pandas/ Python
You could check whether the values in column "A" equal 123 and use mask on "C" to replace the values in those rows:
df['C'] = df['C'].mask(df['A']==123, df['B']*0.0008)
Output:
A B C
0 123 1500 1.2
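To make the behaviour of mask concrete, here is a minimal sketch on a hypothetical two-row frame. Only the first row satisfies the condition, so only its "C" value is replaced:

```python
import pandas as pd

# Hypothetical data: the second row does not satisfy A == 123.
df = pd.DataFrame({'A': [123, 456], 'B': [1500, 2000], 'C': [0.0, 5.0]})

# Where the condition is True, take the value from B * 0.0008;
# elsewhere keep the existing value of C.
df['C'] = df['C'].mask(df['A'] == 123, df['B'] * 0.0008)

print(df)
```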
checking for existence of a value in a Pandas dataframe column
Change in to isin, followed by any:
exists = df.make.isin(['EE']).any()
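The reason in does not work here is that, on a Series, it tests the index labels rather than the values. A small sketch with a hypothetical make column shows the difference:

```python
import pandas as pd

# Hypothetical data with the default integer index 0, 1, 2.
df = pd.DataFrame({'make': ['EE', 'AB', 'CD']})

# `in` on a Series looks at the index labels, not the values.
in_index = 'EE' in df['make']

# isin(...).any() looks at the values.
exists = df['make'].isin(['EE']).any()

# `in` does work on a plain list of the values.
in_values = 'EE' in df['make'].tolist()

print(in_index, exists, in_values)
```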