Python Pandas: Check If String in One Column Is Contained in String of Another Column in the Same Row

Use apply with the in operator, evaluated row by row (axis=1):

df['C'] = df.apply(lambda x: x.A in x.B, axis=1)
print(df)
   RecID  A    B      C
0      1  a  abc   True
1      2  b  cba   True
2      3  c  bca   True
3      4  d  bac  False
4      5  e  abc  False

A list comprehension is faster, but it only works if neither column contains NaN:

df['C'] = [a in b for a, b in zip(df['A'], df['B'])]
print(df)
   RecID  A    B      C
0      1  a  abc   True
1      2  b  cba   True
2      3  c  bca   True
3      4  d  bac  False
4      5  e  abc  False
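
If NaNs can occur, one option is to guard the check so that anything that is not a string simply counts as "not contained". This is a minimal sketch, not part of the original answer; the helper name contains and the extra sixth row are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the frame from above plus one row with a missing value.
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5, 6],
    "A": ["a", "b", "c", "d", "e", "f"],
    "B": ["abc", "cba", "bca", "bac", "abc", np.nan],
})


def contains(needle, haystack):
    """Return True only when both values are strings and needle occurs in haystack."""
    return isinstance(needle, str) and isinstance(haystack, str) and needle in haystack


# Same list comprehension as before, but NaN no longer raises a TypeError.
df["C"] = [contains(a, b) for a, b in zip(df["A"], df["B"])]
result = df["C"].tolist()
print(df)
```

Treating a missing value as False is a design choice; if you would rather keep it as missing, return None from the helper instead.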

Pandas: check if string value in one column is part of string of another column in same row of dataframe - current script returning all Yes

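In your case, run the check on each row. This assumes DEPTH holds strings; the in operator raises a TypeError if its left operand is a number.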

df['Compare'] = df.apply(lambda x: 'Yes' if x['DEPTH'] in x['Description'] else 'No', axis=1)
df
Out[133]:
    WKEY                                   Description  DEPTH  Compare
0  50030       36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015   3159      Yes
1  50030       36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015   3994       No
2  50030       36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015   5401       No
3  50030  26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401   3159       No
4  50030  26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401   3994      Yes
5  50030  26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401   5401      Yes
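
If DEPTH is stored as a number rather than as text, converting it with str() before the check is probably the simplest fix. The sketch below assumes integer depths and uses three rows taken from the table above; it is an illustration, not the original poster's script.

```python
import pandas as pd

# Hypothetical excerpt of the frame above, with DEPTH stored as integers.
df = pd.DataFrame({
    "Description": [
        "36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015",
        "36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015",
        "26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401",
    ],
    "DEPTH": [3159, 3994, 5401],
})

# str() makes the left operand of `in` a string, which the operator requires.
df["Compare"] = [
    "Yes" if str(depth) in text else "No"
    for depth, text in zip(df["DEPTH"], df["Description"])
]
compare = df["Compare"].tolist()
print(df)
```

Note that this is still a plain substring test, so a depth of 315 would also match "3159".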

Pandas: Determine if a string in one column is a substring of a string in another column

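Let us try NumPy's string functions (numpy.core.defchararray, exposed publicly as np.char), which operate element-wise on whole arrays. find returns -1 where the substring does not occur:

import numpy as np
np.char.find(df['1'].values.astype(str), df['0'].values.astype(str)) != -1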
Out[740]: array([False, True, True, False])
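
For a self-contained version, here is a small sketch with made-up data. It assumes column '0' holds the substring and column '1' the text to search, as in the call above; the values and the is_substring column name are hypothetical. Keep in mind that astype(str) turns missing values into the text 'nan'.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column '0' holds the substring, column '1' the text to search.
df = pd.DataFrame({
    "0": ["cat", "og", "ray", "zz"],
    "1": ["dog", "dog", "stray", "abc"],
})

haystack = df["1"].to_numpy().astype(str)
needle = df["0"].to_numpy().astype(str)

# np.char.find returns the lowest index of the match, or -1 when there is none.
df["is_substring"] = np.char.find(haystack, needle) != -1
flags = df["is_substring"].tolist()
print(df)
```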

Check if a column contains data from another column in python pandas

Split the name into substrings, and use a list comprehension with any to get True if any of the substrings occurs in the URL:

df['result'] = [any(s in url for s in lst)
                for lst, url in zip(df['name'].str.split(), df['url'])]

The (slower) equivalent with apply would be:

df['result'] = df.apply(lambda x: any(s in x['url']
                                      for s in x['name'].split()), axis=1)

Output:

       name                url  result
0  pau lola    www.paulola.com    True
1  pou gine  www.cheeseham.com   False
2  pete raj    www.pataraj.com    True

Pandas dataframe: Check if regex contained in a column matches a string in another column in the same row

There is no built-in pandas method for this, because str.contains takes one pattern for the whole column. Apply re.search to each row instead:

import re

mask = df.apply(lambda r: bool(re.search(r['patterns'], r['strings'])), axis=1)
df2 = df[mask]

Or use a (faster) list comprehension:

mask = [bool(re.search(p, s)) for p, s in zip(df['patterns'], df['strings'])]

Output:

   strings  patterns  group
0    apple       \ba      1
3    train       n\b      2
4      tan       n\b      2
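
To try this end to end, here is a runnable sketch. Rows 0, 3 and 4 mirror the output above; rows 1 and 2 are invented non-matching rows, since the original input frame is not shown.

```python
import re

import pandas as pd

# Rows 0, 3 and 4 mirror the output above; rows 1 and 2 are invented non-matches.
df = pd.DataFrame({
    "strings": ["apple", "banana", "tree", "train", "tan"],
    "patterns": [r"\ba", r"\ba", r"n\b", r"n\b", r"n\b"],
    "group": [1, 1, 2, 2, 2],
})

# One re.search call per row, pairing each pattern with the string in the same row.
mask = [bool(re.search(p, s)) for p, s in zip(df["patterns"], df["strings"])]
df2 = df[mask]
kept = df2.index.tolist()
print(df2)
```

If the patterns column is meant to hold literal text rather than regular expressions, wrapping each pattern in re.escape before searching avoids surprises with characters such as "." or "(".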

Check if string in one column is contained in string of another column in the same row and add new column with matching column name

>>> df
   to_find  col1  col2
0        a    ab    ac
1        b    aa    ba
2        c    bc    ee
>>> df['found_in'] = df.apply(lambda x: ' '.join(x.iloc[1:][x.iloc[1:].str.contains(str(x['to_find']))].index), axis=1)
>>> df
   to_find  col1  col2   found_in
0        a    ab    ac  col1 col2
1        b    aa    ba       col2
2        c    bc    ee       col1

For better readability, move the column lookup into a helper function:

>>> def get_columns(x):
...     y = x.iloc[1:]
...     return y.index[y.str.contains(str(x['to_find']))]
...
>>> df['found_in'] = df.apply(lambda x: ' '.join(get_columns(x)), axis=1)
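
One caveat: str.contains interprets its argument as a regular expression by default, so a to_find value such as "." or "(" would not behave like plain text. A possible variant, sketched here as a script rather than an interpreter session, passes regex=False and selects the other columns by name instead of by position. The data is the frame shown above; the rest is an assumption about what you want.

```python
import pandas as pd

df = pd.DataFrame({
    "to_find": ["a", "b", "c"],
    "col1": ["ab", "aa", "bc"],
    "col2": ["ac", "ba", "ee"],
})


def get_columns(row):
    """Return the names of the columns whose value contains row['to_find'] literally."""
    candidates = row.drop("to_find")
    # regex=False makes this a plain substring test instead of a regex search.
    mask = candidates.str.contains(str(row["to_find"]), regex=False)
    return list(candidates[mask].index)


df["found_in"] = df.apply(lambda row: " ".join(get_columns(row)), axis=1)
found_in = df["found_in"].tolist()
print(df)
```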

Returning yes in one column if it contains a string in another column

MaterialsTracking_df['Valid Site'] = "Y" if ...

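assigns a single value to all rows, because the condition is evaluated only once.

Use pandas.DataFrame.apply with axis=1 instead, which evaluates the condition for each row.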
https://pandas.pydata.org/pandas-docs/version/0.24.2/reference/api/pandas.DataFrame.apply.html

Example (I added a dummy row where the condition is not met):

import pandas as pd
from io import StringIO

Materials_Tracking_df = pd.read_csv(StringIO("""
EXAM;Scoring Site DBN;Exams for this Site
MXRC;04M435;MXRC, MXRK, MXRN
MXRC;04M435;MXRC, MXRK, MXRN
SXRK;03M076;SXRK, SXRU
MXRC;04M435;MXRC, MXRK, MXRN
SXRK;04____;MXRC, MXRK, MXRN
"""), sep=';')

Materials_Tracking_df['Valid Site'] = Materials_Tracking_df.apply(
    lambda r: 'T' if r['EXAM'] in r['Exams for this Site'] else 'N',
    axis=1)

   EXAM  Scoring Site DBN  Exams for this Site  Valid Site
0  MXRC            04M435     MXRC, MXRK, MXRN           T
1  MXRC            04M435     MXRC, MXRK, MXRN           T
2  SXRK            03M076           SXRK, SXRU           T
3  MXRC            04M435     MXRC, MXRK, MXRN           T
4  SXRK            04____     MXRC, MXRK, MXRN           N
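
The in operator here is a substring test, so an exam code that is only part of a listed code would also count as valid. If whole codes should match, one option is to split the list first. This is a sketch under that assumption; the helper name is_listed and the sample rows (including the shortened code "MXR") are made up.

```python
import pandas as pd

# Hypothetical rows: the second EXAM code is only a prefix of a listed exam.
df = pd.DataFrame({
    "EXAM": ["MXRC", "MXR", "SXRK"],
    "Exams for this Site": [
        "MXRC, MXRK, MXRN",
        "MXRC, MXRK, MXRN",
        "MXRC, MXRK, MXRN",
    ],
})


def is_listed(exam, exams):
    """Compare against whole comma-separated entries instead of raw substrings."""
    return exam in [item.strip() for item in exams.split(",")]


df["Valid Site"] = [
    "T" if is_listed(exam, exams) else "N"
    for exam, exams in zip(df["EXAM"], df["Exams for this Site"])
]
valid = df["Valid Site"].tolist()
print(df)
```

With the plain substring check, the second row would have been marked "T" because "MXR" occurs inside "MXRC".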

python pandas - Check if partial string in column exists in other column

This is an expensive operation, because every value in col2 is checked against every value in col1. You can try:

df['col2'].apply(lambda x: 'Yes' if df['col1'].str.contains(x).any() else 'No')

Output:

0     No
1    Yes
2    Yes
Name: col2, dtype: object
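
A variant that avoids regular-expression matching is to pull col1 into a plain list once and use the in operator. It still compares every pair of values, so treat it as a sketch rather than a guaranteed speed-up. The data below is hypothetical, chosen only to reproduce the No / Yes / Yes pattern above, and the found column name is an assumption.

```python
import pandas as pd

# Hypothetical data chosen to reproduce the No / Yes / Yes pattern above.
df = pd.DataFrame({
    "col1": ["apple pie", "banana split", "cherry tart"],
    "col2": ["kiwi", "banana", "tart"],
})

# Extract the searchable texts once, dropping missing values.
col1_values = df["col1"].dropna().tolist()

df["found"] = [
    "Yes" if any(x in text for text in col1_values) else "No"
    for x in df["col2"]
]
found = df["found"].tolist()
print(df["found"])
```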

Most efficient way of checking whether a string is present in another column's values in Pandas

Try using issubset() with str.split():

df["check1_output"] = df.apply(lambda x: set(x["items_check1"].split(",")).issubset(x["all_items"].split(",")), axis=1)
df["check2_output"] = df.apply(lambda x: set(x["items_check2"].split(",")).issubset(x["all_items"].split(",")), axis=1)
>>> df
     id              all_items  ...  check1_output  check2_output
0  1239  foobar,foo,foofoo,bar  ...           True           True
1  3298             foobar,foo  ...           True          False
2  9384                foo,bar  ...          False           True
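
Since both lines repeat the same logic, it can be wrapped in a small helper and run in a loop. In this sketch the all_items values come from the output above, but the items_check1 and items_check2 values are invented (the original ones are hidden behind "..."), picked so that the results match the table. The helper name is_subset is also an assumption.

```python
import pandas as pd

# all_items is taken from the output above; the two items_check columns are invented.
df = pd.DataFrame({
    "id": [1239, 3298, 9384],
    "all_items": ["foobar,foo,foofoo,bar", "foobar,foo", "foo,bar"],
    "items_check1": ["foo,bar", "foo", "foobar"],
    "items_check2": ["foofoo", "bar", "bar,foo"],
})


def is_subset(items, all_items, sep=","):
    """True if every separated entry of items also appears in all_items."""
    return set(items.split(sep)).issubset(all_items.split(sep))


for col in ["items_check1", "items_check2"]:
    out = col.replace("items_", "") + "_output"  # e.g. "check1_output"
    df[out] = [is_subset(i, a) for i, a in zip(df[col], df["all_items"])]

check1 = df["check1_output"].tolist()
check2 = df["check2_output"].tolist()
print(df)
```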

