How to check if a value is in the list in selection from pandas data frame?
Use isin
df_new[df_new['l_ext'].isin([31, 22, 30, 25, 64])]
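A minimal, self-contained sketch of the same filter. The frame contents and the `name` column are made up for illustration; only the `l_ext` column name and the list of values come from the snippet above.

```python
import pandas as pd

# Hypothetical data; only the column name 'l_ext' comes from the answer above
df_new = pd.DataFrame({'l_ext': [31, 40, 22, 99, 64],
                       'name': ['a', 'b', 'c', 'd', 'e']})

wanted = [31, 22, 30, 25, 64]

# isin returns a boolean Series with one entry per row
mask = df_new['l_ext'].isin(wanted)

# Boolean indexing keeps only the rows where the mask is True
selected = df_new[mask]
print(selected)
```

Values in the list that never occur in the column (30 and 25 here) are simply ignored.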
How to check if Pandas column has value from list of string?
Use apply and lambda, like:
df['Names'].apply(lambda x: any([k in x for k in kw]))
0 True
1 True
2 True
3 True
4 False
Name: Names, dtype: bool
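The answer does not show how `df` and `kw` were built, so the sketch below uses made-up names and keywords. It also shows a vectorized alternative that joins the escaped keywords into one regular expression, which may be faster on large columns.

```python
import re
import pandas as pd

# Hypothetical sample data and keyword list (not from the original question)
df = pd.DataFrame({'Names': ['John Smith', 'Mary Jones', 'Bob Stone']})
kw = ['Smith', 'Stone']

# Row-wise check: True if any keyword is a substring of the name
by_apply = df['Names'].apply(lambda x: any(k in x for k in kw))

# Vectorized alternative: one regex built from the escaped keywords
pattern = '|'.join(re.escape(k) for k in kw)
by_regex = df['Names'].str.contains(pattern)

print(by_apply.tolist())
print(by_regex.tolist())
```

Escaping with `re.escape` matters when a keyword contains characters such as `.` or `+` that have a special meaning in a regex.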
Pandas Dataframe Check if column value is in column list
Use apply:
df['flag'] = df.apply(lambda x: int(x['id'] in x['idlist']), axis=1)
print (df)
id idlist flag
0 12 [1, 5, 7, 12, 112] 1
1 112 [5, 7, 12, 111, 113] 0
Similar:
df['flag'] = df.apply(lambda x: x['id'] in x['idlist'], axis=1).astype(int)
print (df)
id idlist flag
0 12 [1, 5, 7, 12, 112] 1
1 112 [5, 7, 12, 111, 113] 0
With a list comprehension:
df['flag'] = [int(x[0] in x[1]) for x in df[['id', 'idlist']].values.tolist()]
print (df)
id idlist flag
0 12 [1, 5, 7, 12, 112] 1
1 112 [5, 7, 12, 111, 113] 0
Solutions for filtering:
df = df[df.apply(lambda x: x['id'] in x['idlist'], axis=1)]
print (df)
id idlist
0 12 [1, 5, 7, 12, 112]
df = df[[x[0] in x[1] for x in df[['id', 'idlist']].values.tolist()]]
print (df)
id idlist
0 12 [1, 5, 7, 12, 112]
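For reference, here is a sketch that rebuilds the two-row frame from the printed output and computes both the flag and the filtered frame with `zip`. The variable names `mask` and `filtered` are only illustrative.

```python
import pandas as pd

# Frame reconstructed from the printed output above
df = pd.DataFrame({'id': [12, 112],
                   'idlist': [[1, 5, 7, 12, 112], [5, 7, 12, 111, 113]]})

# Pair each id with its own list; zip avoids the overhead of DataFrame.apply
mask = [i in lst for i, lst in zip(df['id'], df['idlist'])]

# Integer flag column, as in the answers above
df['flag'] = [int(m) for m in mask]

# Filtering with the same boolean list
filtered = df[mask]
print(filtered)
```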
Filter dataframe rows if value in column is in a set list of values
Use the isin method:
rpt[rpt['STK_ID'].isin(stk_list)]
Check if certain value is contained in a dataframe column in pandas
I think you need str.contains if you want the rows whose date column contains the string 07311954:
print(df[df['date'].astype(str).str.contains('07311954')])
Or, if the date column already holds strings:
print(df[df['date'].str.contains('07311954')])
To check whether the last 4 digits of the date column are 1954:
print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
Sample:
print(df['date'])
0 8152007
1 9262007
2 7311954
3 2252011
4 2012011
5 2012011
6 2222011
7 2282011
Name: date, dtype: int64
print(df['date'].astype(str).str[-4:].str.contains('1954'))
0 False
1 False
2 True
3 False
4 False
5 False
6 False
7 False
Name: date, dtype: bool
print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
cmte_id trans_typ entity_typ state employer occupation date \
2 C00119040 24K CCM MD NaN NaN 7311954
amount fec_id cand_id
2 1000 C00140715 H2MD05155
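One detail worth checking: when the date column is int64, as in the sample, the values carry no leading zero, so searching the converted strings for the 8-character text 07311954 finds nothing. The sketch below uses three of the sample dates and zero-pads them first. The padding step and the use of `endswith` are suggestions, not part of the original answer.

```python
import pandas as pd

# Three of the integer dates from the sample above
df = pd.DataFrame({'date': [8152007, 7311954, 2252011]})

as_text = df['date'].astype(str)

# Integers have no leading zero, so the 8-character needle is never found
no_match = as_text.str.contains('07311954', regex=False)

# Zero-padding to 8 characters restores the MMDDYYYY form
padded = as_text.str.zfill(8)
match = padded.str.contains('07311954', regex=False)

# For a fixed suffix, endswith is more direct than slicing plus contains
year_1954 = padded.str.endswith('1954')

print(df[year_1954])
```

Passing `regex=False` treats the needle as literal text, which avoids surprises if it ever contains regex metacharacters.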
Pandas dataframe select rows where a list-column contains any of a list of strings
If I understand correctly, re-creating the DataFrame from the list column and then using isin with any should be faster than apply:
df[pd.DataFrame(df.species.tolist()).isin(selection).any(axis=1).values]
Out[64]:
molecule species
0 a [dog]
2 c [cat, dog]
3 d [cat, horse, pig]
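The answer omits the input data. The sketch below assumes a four-row frame and a `selection` of `['cat', 'dog']`, chosen so that rows a, c and d are kept as in the output above; the contents of row b are a guess. It also shows a set-based check that gives the same mask.

```python
import pandas as pd

# Assumed input: row 'b' and the selection list are guesses consistent with the output above
df = pd.DataFrame({'molecule': ['a', 'b', 'c', 'd'],
                   'species': [['dog'], ['horse', 'pig'],
                               ['cat', 'dog'], ['cat', 'horse', 'pig']]})
selection = ['cat', 'dog']

# Expand the lists into a padded frame (None where a list is shorter)
expanded = pd.DataFrame(df['species'].tolist())

# A row is kept if any of its cells is one of the selected strings
mask = expanded.isin(selection).any(axis=1).values
result = df[mask]

# Equivalent check: a list matches if it shares an element with the selection
wanted = set(selection)
mask_sets = df['species'].map(lambda items: not wanted.isdisjoint(items))

print(result)
```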
Check if a string in a Pandas DataFrame column is in a list of strings
frame = pd.DataFrame({'a' : ['the cat is blue', 'the sky is green', 'the dog is black']})
frame
a
0 the cat is blue
1 the sky is green
2 the dog is black
The str.contains method accepts a regular expression pattern:
mylist = ['dog', 'cat', 'fish']
pattern = '|'.join(mylist)
pattern
'dog|cat|fish'
frame.a.str.contains(pattern)
0 True
1 False
2 True
Name: a, dtype: bool
Because regex patterns are supported, you can also embed flags:
frame = pd.DataFrame({'a' : ['Cat Mr. Nibbles is blue', 'the sky is green', 'the dog is black']})
frame
a
0 Cat Mr. Nibbles is blue
1 the sky is green
2 the dog is black
pattern = '(?i)' + '|'.join(mylist) # a global flag must come first; Python 3.11+ rejects it elsewhere
pattern
'(?i)dog|cat|fish'
frame.a.str.contains(pattern)
0 True # Because of the (?i) flag, 'Cat' is also matched to 'cat'
1 False
2 True
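As an alternative to an inline flag, `str.contains` takes a `case` argument. The sketch below reuses the frame and list from this answer and also escapes each word, which is an added precaution in case a search term contains regex metacharacters.

```python
import re
import pandas as pd

frame = pd.DataFrame({'a': ['Cat Mr. Nibbles is blue',
                            'the sky is green',
                            'the dog is black']})
mylist = ['dog', 'cat', 'fish']

# Escape each word so it is matched literally, then join with the regex OR
pattern = '|'.join(re.escape(word) for word in mylist)

# case=False makes the whole match case-insensitive
matches = frame['a'].str.contains(pattern, case=False)

print(matches)
```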
Use a list of values to select rows from a Pandas dataframe
You can use the isin method:
In [1]: df = pd.DataFrame({'A': [5,6,3,4], 'B': [1,2,3,5]})
In [2]: df
Out[2]:
A B
0 5 1
1 6 2
2 3 3
3 4 5
In [3]: df[df['A'].isin([3, 6])]
Out[3]:
A B
1 6 2
2 3 3
And to get the opposite, use ~:
In [4]: df[~df['A'].isin([3, 6])]
Out[4]:
A B
0 5 1
3 4 5
How to determine whether a Pandas Column contains a particular value
Using in on a Series checks whether the value is in the index, not in the values:
In [11]: s = pd.Series(list('abc'))
In [12]: s
Out[12]:
0 a
1 b
2 c
dtype: object
In [13]: 1 in s
Out[13]: True
In [14]: 'a' in s
Out[14]: False
One option is to see if it's in unique values:
In [21]: s.unique()
Out[21]: array(['a', 'b', 'c'], dtype=object)
In [22]: 'a' in s.unique()
Out[22]: True
or a python set:
In [23]: set(s)
Out[23]: {'a', 'b', 'c'}
In [24]: 'a' in set(s)
Out[24]: True
As pointed out by @DSM, it may be more efficient (especially if you're just doing this for one value) to just use in directly on the values:
In [31]: s.values
Out[31]: array(['a', 'b', 'c'], dtype=object)
In [32]: 'a' in s.values
Out[32]: True
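Two further options, sketched here as a script rather than an IPython session, are an element-wise comparison followed by `any()`, and `isin` when several candidate values are tested at once. The variable names are only illustrative.

```python
import pandas as pd

s = pd.Series(list('abc'))

# 'in' on the Series looks at the index labels (0, 1, 2)
in_index = 1 in s

# 'in' on the underlying array looks at the values
in_values = 'a' in s.values

# Element-wise comparison, then reduce with any()
any_equal = (s == 'a').any()

# isin handles several candidate values in one call
any_of_several = s.isin(['a', 'z']).any()

print(in_index, in_values, any_equal, any_of_several)
```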