How to Determine Whether a Pandas Column Contains a Particular Value

Check if certain value is contained in a dataframe column in pandas

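Use str.contains to select the rows where the values of column date contain a given string. If the column is numeric, cast it to string first. Note that an integer column cannot store a leading zero, so search for 7311954 rather than 07311954:

print(df[df['date'].astype(str).str.contains('7311954')])

Or, if the date column already holds strings (where a leading zero is preserved):

print(df[df['date'].str.contains('07311954')])

If you want to check whether the last 4 digits of column date are 1954:

print(df[df['date'].astype(str).str[-4:].str.contains('1954')])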

Sample:

print(df['date'])
0 8152007
1 9262007
2 7311954
3 2252011
4 2012011
5 2012011
6 2222011
7 2282011
Name: date, dtype: int64

print(df['date'].astype(str).str[-4:].str.contains('1954'))
0 False
1 False
2 True
3 False
4 False
5 False
6 False
7 False
Name: date, dtype: bool

print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
cmte_id trans_typ entity_typ state employer occupation date \
2 C00119040 24K CCM MD NaN NaN 7311954

amount fec_id cand_id
2 1000 C00140715 H2MD05155
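The checks above can be tried on a small frame. This is only a sketch: the three-row frame below is made up for illustration (it reuses three dates from the sample), and str.endswith is shown as an alternative to slicing the last four characters.

```python
import pandas as pd

# Hypothetical frame reusing three dates from the sample above.
df = pd.DataFrame({"date": [8152007, 7311954, 2252011]})

# Cast the integers to strings so the .str accessor can be used.
as_text = df["date"].astype(str)

# Plain substring match anywhere in the value (regex=False: no pattern syntax).
has_full_date = as_text.str.contains("7311954", regex=False)

# Suffix match, an alternative to .str[-4:].str.contains('1954').
ends_in_1954 = as_text.str.endswith("1954")

print(df[has_full_date])
print(ends_in_1954.tolist())
```

Because the integers are stored without a leading zero, searching the cast column for '07311954' would match nothing here.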

Check if value is in Pandas dataframe column

You don't need an if statement or a loop. You can use Series.eq with any to check directly whether any row has -1 in this column:

In [990]: df['PositionEMA25M50M'].eq(-1).any()
Out[990]: True
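A self-contained version of the same check, as a sketch. The column name comes from the answer above; the four values in the frame are invented for illustration.

```python
import pandas as pd

# Hypothetical data; only the column name is taken from the answer above.
df = pd.DataFrame({"PositionEMA25M50M": [0, 1, -1, 0]})

# Element-wise comparison, then reduce to a single boolean.
has_minus_one = df["PositionEMA25M50M"].eq(-1).any()

# The == operator gives the same boolean Series as .eq().
same_result = (df["PositionEMA25M50M"] == -1).any()

# Summing the boolean Series counts the matching rows.
count = int(df["PositionEMA25M50M"].eq(-1).sum())

print(has_minus_one, same_result, count)
```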

Pandas check if splitted dataframe's field contains value

You can use .str.contains() to check the existence of a word in a string.

df[df['Hobby'].str.contains('Sport')]

EDIT:

Because str.contains also matches substrings such as 'Extreme Sport', we need to split the hobbies so that we get a new dataframe with one row per person per hobby.

Then we can safely filter on the hobby.

import pandas as pd

SEARCHED_HOBBY = 'Sport'

df = pd.DataFrame({'Name': ['Tom', 'Mark', 'John'], 'Hobby': ['Food,Sport,Art,Extreme Sport','Sport', 'Coding,Books']})

df['Hobby'] = df['Hobby'].str.split(',')

splitted_hobbies = df.explode('Hobby')
splitted_hobbies = splitted_hobbies[splitted_hobbies['Hobby'] == SEARCHED_HOBBY]

df = df[df['Name'].isin(splitted_hobbies['Name'])]
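As an alternative sketch that leaves the Hobby column untouched, the split lists can be tested for exact membership directly. The fourth row ('Anna') is a hypothetical addition, included only to show where a substring match and an exact match disagree.

```python
import pandas as pd

SEARCHED_HOBBY = "Sport"

# Same data as above plus one hypothetical row ("Anna") for contrast.
df = pd.DataFrame({
    "Name": ["Tom", "Mark", "John", "Anna"],
    "Hobby": ["Food,Sport,Art,Extreme Sport", "Sport", "Coding,Books", "Extreme Sport"],
})

# Substring match: also flags rows that only contain "Extreme Sport".
substring_match = df["Hobby"].str.contains(SEARCHED_HOBBY, regex=False)

# Exact match: split each cell into a list and test membership in that list.
exact_match = df["Hobby"].str.split(",").apply(lambda hobbies: SEARCHED_HOBBY in hobbies)

print(df.loc[substring_match, "Name"].tolist())
print(df.loc[exact_match, "Name"].tolist())
```

This assumes the hobbies are separated by a bare comma with no surrounding spaces, as in the sample data.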

How to check if a pandas column has anything other than specified values?

We can filter on L and R and then we get the opposite of that filter using the ~ operator like so:

df[~(df['A'].isin(['L', 'R']))]

To get a boolean value indicating whether any other values are present in the Series, we can write:

len(df[~(df['A'].isin(['L', 'R']))]) > 0

We can make this shorter and quicker by using the pandas.Series.any method, which also returns a boolean value. The ~ must be applied to the mask before .any() is called, so the parentheses go around the negated mask:

(~df['A'].isin(['L', 'R'])).any()
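A short sketch of this check on made-up data (the frame and the name `allowed` are assumptions for illustration). It also lists which unexpected values were found.

```python
import pandas as pd

allowed = ["L", "R"]

# Hypothetical column with one value outside the allowed set.
df = pd.DataFrame({"A": ["L", "R", "X", "L"]})

# True for every row whose value is NOT in the allowed set.
outside = ~df["A"].isin(allowed)

# Is there at least one value other than L or R?
has_other = outside.any()

# Equivalent question asked the other way round: are all values allowed?
all_allowed = df["A"].isin(allowed).all()

# The distinct offending values.
unexpected = df.loc[outside, "A"].unique().tolist()

print(has_other, all_allowed, unexpected)
```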

Check if string is in a pandas dataframe

a['Names'].str.contains('Mel') returns a boolean Series with one entry per row of a (that is, of length len(a)).

Therefore, you can use

mel_count = a['Names'].str.contains('Mel').sum()
if mel_count > 0:
    print("There are {m} Mels".format(m=mel_count))

Or any(), if you don't care how many records match your query:

if a['Names'].str.contains('Mel').any():
    print("Mel is there")
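One practical caveat, sketched on invented data: if the column has missing values, str.contains returns a missing value for those rows. Passing na=False treats them as non-matches, and regex=False makes the search a literal substring match.

```python
import pandas as pd

# Hypothetical names, including one missing entry.
a = pd.DataFrame({"Names": ["Mel", "Melanie", None, "Bob"]})

# na=False: missing names count as "no match"; regex=False: literal search.
matches = a["Names"].str.contains("Mel", na=False, regex=False)

mel_count = int(matches.sum())
if mel_count > 0:
    print("There are {m} Mels".format(m=mel_count))
```

Note that this counts 'Melanie' as well, since str.contains looks for a substring rather than an exact name.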

If a dataframe contains a value in a column, how to perform a calculation in another column? - Pandas/ Python

You could check if column "A" values are 123 or not and use mask on "C" to replace values there:

df['C'] = df['C'].mask(df['A']==123, df['B']*0.0008)

Output:

     A     B    C
0 123 1500 1.2
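A self-contained sketch of the mask approach. The second row and the starting values of C are assumptions added for illustration; a boolean .loc assignment is shown alongside as an equivalent way to write the same update.

```python
import pandas as pd

# First row mirrors the output above; the second row is hypothetical.
df = pd.DataFrame({"A": [123, 456], "B": [1500, 2000], "C": [0.0, 5.0]})
alt = df.copy()

is_target = df["A"] == 123

# mask replaces C where the condition is True and keeps it elsewhere.
df["C"] = df["C"].mask(is_target, df["B"] * 0.0008)

# Equivalent update using boolean indexing with .loc.
alt.loc[is_target, "C"] = alt.loc[is_target, "B"] * 0.0008

print(df)
```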

checking for existence of a value in a Pandas dataframe column

Replace the in operator with isin followed by any:

exists = df.make.isin(['EE']).any()
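The reason for the change, presumably, is that the in operator on a Series tests the index labels rather than the values. A small sketch on made-up data:

```python
import pandas as pd

# Hypothetical frame; only the column name "make" comes from the answer above.
df = pd.DataFrame({"make": ["AA", "EE", "ZZ"]})

# `in` on a Series looks at the index (0, 1, 2 here), not the values.
in_index = "EE" in df["make"]

# Checking against a plain Python list of the values works as expected.
in_values = "EE" in df["make"].tolist()

# The vectorised pandas way.
exists = df["make"].isin(["EE"]).any()

print(in_index, in_values, exists)
```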

