How to Search for a String in One Column in Other Columns of a Data Frame

R - How to search for a string in one column in other columns of a data frame (ignoring spaces)

grepl with an empty pattern matches every string, so just use nzchar to check that your pattern has characters:

transform(df, word_exists=mapply(grepl, pattern=word, x=keywords) & nzchar(word))
#    word          keywords word_exists
# 1 Hello hello goodbye nyc       FALSE
# 2       hello goodbye nyc       FALSE
# 3   nyc hello goodbye nyc        TRUE
# 4       hello goodbye nyc       FALSE

Search for string in one column using strings from another column in another dataframe in R

One approach would be to form a regex alternation of the terms in the first dataframe. Then use grepl and sub to generate the output columns.

regex <- paste0("\\b(", paste(df1$SName, collapse="|"), ")\\b")
df2$match <- ifelse(grepl(regex, df2$Description), "Yes", "No")
df2$String <- ifelse(grepl(regex, df2$Description),
                     sub(paste0(".*", regex, ".*"), "\\1", df2$Description),
                     "")
df2

            Description match String
1     - ls svc368 -@#@#    No       
2   mkdir test svc #*-/    No       
3 mkdir df2 svc123 #*-/   Yes svc123
...

String matching from one data frame column to another data frame column

I'm going to assume that the values in the DataFrames are strings, since we typically don't use DataFrames to carry variables. With that assumption, I created a sample from your DataFrame values.

import numpy as np
import pandas as pd

data_a = {"Value": ["valid username", "valid username", "Password", "Password", "Login", "LOG IN"],
          "Filed": ["username", "input_txtuserid", "input_txtpassword", "txtPassword", "input_submit_log_in", "SIGNIN"]}

data_b = {"Value": ["input_txtuserid", "input_txtpassword", "input_submit_log_in", "Password", "City", "PLACE"],
          "Filed": ["JOHN", "78945", "Sucessfully", "txtPassword", "London", "4-A avenue Street"]}

A = pd.DataFrame(data_a)
B = pd.DataFrame(data_b)

Below is the code to create C:

# Left-join A to B on A's Filed column and B's Value column.
# The indicator column Exist records where each row was found.
C = pd.merge(A, B, left_on='Filed', right_on='Value', how='left', indicator='Exist')

# True if the row exists in both DataFrames, otherwise False
C['Exist'] = np.where(C['Exist'] == 'both', True, False)
# Keep only the rows that exist in both DataFrames
C = C[C['Exist']]

# Keep B's columns and restore their original names
C = C[['Value_y', 'Filed_y']].rename(columns={"Value_y": "Value",
                                              "Filed_y": "Filed"})

Hope that helps! Please Mark this as answer if it does :)
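
A shorter route to what should be the same rows, sketched here under the assumption that only the rows of B whose Value appears in A's Filed column are wanted, is a boolean mask built with isin. One difference to be aware of: the merge repeats a row of B for every duplicate match in A, while the mask keeps each row of B at most once.

```python
import pandas as pd

# Sample data copied from the answer above
A = pd.DataFrame({
    "Value": ["valid username", "valid username", "Password", "Password", "Login", "LOG IN"],
    "Filed": ["username", "input_txtuserid", "input_txtpassword", "txtPassword", "input_submit_log_in", "SIGNIN"],
})
B = pd.DataFrame({
    "Value": ["input_txtuserid", "input_txtpassword", "input_submit_log_in", "Password", "City", "PLACE"],
    "Filed": ["JOHN", "78945", "Sucessfully", "txtPassword", "London", "4-A avenue Street"],
})

# Keep the rows of B whose Value appears anywhere in A's Filed column
C = B[B["Value"].isin(A["Filed"])].reset_index(drop=True)
print(C)
```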

How to search a string in one pandas dataframe column as a substring in another dataframe column

The idea is to create sets by splitting on , and to match with issubset:

d = {k: set(v.split(',')) for k, v in df2.set_index('A')['B'].items()}
df1['B'] = [next(iter([k for k, v in d.items() if set(x.split(',')).issubset(v)]), '') 
                      for x in df1['A']]
print (df1)
                        A          B
0      9.female.ceo.,ceo,           
1      9.female.ned.,ned,           
2    9.female.ned.,chair,           
3       2.female.ed.,ned,      ,ned,
4       2.female.ned.,ed,           
5    9.female.chair.,ceo,  ,ceo,ned,
6  2.female.chair.,chair,

A solution that tests with in:

d = df2.set_index('A')['B']
df1['B'] = [next(iter([k for k, v in d.items() if x in v]), '')  for x in df1['A']]
print (df1)
                        A          B
0      9.female.ceo.,ceo,           
1      9.female.ned.,ned,           
2    9.female.ned.,chair,           
3       2.female.ed.,ned,      ,ned,
4       2.female.ned.,ed,           
5    9.female.chair.,ceo,  ,ceo,ned,
6  2.female.chair.,chair,
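
The lookup above builds a full list of matches before taking its first element. A generator passed to next stops at the first hit instead. The sketch below illustrates this with hypothetical df1 and df2 frames, since the original data is not shown in the answer.

```python
import pandas as pd

# Hypothetical frames; the original df1 and df2 are not shown in the answer
df1 = pd.DataFrame({"A": ["apple", "kiwi", "pear"]})
df2 = pd.DataFrame({
    "A": ["fruit-1", "fruit-2"],
    "B": ["green apple pie", "pear tart"],
})

# Series mapping each label in df2["A"] to the text in df2["B"]
lookup = df2.set_index("A")["B"]

# For every value in df1["A"], take the label of the first df2 row whose
# text contains it; fall back to "" when nothing matches
df1["B"] = [
    next((label for label, text in lookup.items() if x in text), "")
    for x in df1["A"]
]
print(df1)
```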

Another solution: cross join with merge, then test for substrings with in:

df3 = df1.assign(tmp=1).merge(df2.assign(tmp=1), on='tmp', suffixes=('','_'))
df3 = df3.loc[[a in b for a, b in zip(df3['A'], df3['B_'])], ['A','A_']]

df = df1[['A']].merge(df3.rename(columns={'A_':'B'}), on='A', how='left')
print (df)
                        A          B
0      9.female.ceo.,ceo,        NaN
1      9.female.ned.,ned,        NaN
2    9.female.ned.,chair,        NaN
3       2.female.ed.,ned,      ,ned,
4       2.female.ned.,ed,        NaN
5    9.female.chair.,ceo,  ,ceo,ned,
6  2.female.chair.,chair,        NaN
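
On pandas 1.2 or later, the temporary tmp key is not needed, because merge accepts how="cross". A minimal sketch of the same cross-join idea follows. The frames and the match column name are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical frames; the original df1 and df2 are not shown in the answer
df1 = pd.DataFrame({"A": ["apple", "kiwi", "pear"]})
df2 = pd.DataFrame({
    "A": ["fruit-1", "fruit-2"],
    "B": ["green apple pie", "pear tart"],
})

# Pair every row of df1 with every row of df2; df2's A column becomes A_
pairs = df1.merge(df2, how="cross", suffixes=("", "_"))

# Keep the pairs where df1's value is a substring of df2's text
is_hit = [a in b for a, b in zip(pairs["A"], pairs["B"])]
hits = pairs.loc[is_hit, ["A", "A_"]].rename(columns={"A_": "match"})

# Left-join back so that unmatched df1 rows are kept with NaN
result = df1.merge(hits, on="A", how="left")
print(result)
```

Note that a cross join creates len(df1) * len(df2) rows, so this is likely to be practical only for small or medium-sized frames.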

Python Pandas: Check if string in one column is contained in string of another column in the same row

You need apply with the in operator:

df['C'] = df.apply(lambda x: x.A in x.B, axis=1)
print (df)
   RecID  A    B      C
0      1  a  abc   True
1      2  b  cba   True
2      3  c  bca   True
3      4  d  bac  False
4      5  e  abc  False

Another solution with a list comprehension is faster, but it requires that there are no NaNs:

df['C'] = [x[0] in x[1] for x in zip(df['A'], df['B'])]
print (df)
   RecID  A    B      C
0      1  a  abc   True
1      2  b  cba   True
2      3  c  bca   True
3      4  d  bac  False
4      5  e  abc  False
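
If the columns may contain NaNs, the in test raises a TypeError because a float is not iterable. One possible workaround, sketched here with made-up data, is to test only when both values are strings and to treat everything else as no match.

```python
import numpy as np
import pandas as pd

# Made-up data with missing values in both columns
df = pd.DataFrame({
    "A": ["a", "b", np.nan, "d"],
    "B": ["abc", np.nan, "bca", "bac"],
})

# Run the substring test only when both cells hold strings;
# rows with a missing value count as False
df["C"] = [
    isinstance(a, str) and isinstance(b, str) and a in b
    for a, b in zip(df["A"], df["B"])
]
print(df)
```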

Extract column value based on another column in Pandas

You could use loc to get the Series of values satisfying your condition, and then iloc to get its first element:

In [2]: df
Out[2]:
    A  B
0  p1  1
1  p1  2
2  p3  3
3  p2  4

In [3]: df.loc[df['B'] == 3, 'A']
Out[3]:
2    p3
Name: A, dtype: object

In [4]: df.loc[df['B'] == 3, 'A'].iloc[0]
Out[4]: 'p3'
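
Note that .iloc[0] raises an IndexError when no row satisfies the condition. A small helper can return a default instead. The sketch below uses a hypothetical first_match function, which is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"A": ["p1", "p1", "p3", "p2"], "B": [1, 2, 3, 4]})


def first_match(frame, value, default=None):
    """Return the first A where B equals value, or default if none does.

    first_match is a hypothetical helper written for this example.
    """
    matches = frame.loc[frame["B"] == value, "A"]
    return matches.iloc[0] if not matches.empty else default


print(first_match(df, 3))
print(first_match(df, 99))
```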

Search columns for a specific set of text and if the text is found enter a new string of text in a new column pandas

There's definitely a more optimized solution, but hopefully this puts you on the right path. It loops through each row, checks the row's combined column text against each potential fuel string, and decides which abbreviation to use:

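# Map each fuel keyword to its abbreviation
d = {'diesel': 'DSL', 'gasoline': 'GAS', 'ev': 'ELEC'}
# Concatenate all columns of each row into one searchable string
df['all'] = df.apply(''.join, axis=1)
for i, row in df.iterrows():
    # Use the first keyword found in the row's combined text
    df.at[i, 'FUEL'] = d[[key for key in d if key in row['all'].lower()][0]]

# Remove the helper column
del df['all']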

output:

                  SUMN              SOUN      MATN  FUEL
0   Light duty vehicle  Diesel Tire wear    Rubber   DSL
1    Heavy duty diesel      Non-catalyst    Diesel   DSL
2     Light duty truck          catalyst  Gasoline   GAS
3  Medium duty vehicle     EV brake wear    brakes  ELEC

This assumes that only one of the fuel types occurs in each row.

EDIT: inspired by the other solution:

import re
d={'diesel':'DSL','gasoline':'GAS','ev':'ELEC'}
df['FUEL'] = df.apply(lambda x: d[re.search('gasoline|diesel|ev',''.join(x).lower()).group()], axis=1)

same output :)
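
A vectorized variant of the same idea is sketched below, using str.extract and map on data copied from the output above. Joining the columns with a space and adding \b word boundaries are assumptions added here so that a short keyword such as ev cannot match inside a longer word or across two columns. Rows without any keyword would get NaN rather than raising an error.

```python
import re

import pandas as pd

# Data copied from the output shown above
df = pd.DataFrame({
    "SUMN": ["Light duty vehicle", "Heavy duty diesel", "Light duty truck", "Medium duty vehicle"],
    "SOUN": ["Diesel Tire wear", "Non-catalyst", "catalyst", "EV brake wear"],
    "MATN": ["Rubber", "Diesel", "Gasoline", "brakes"],
})
d = {"diesel": "DSL", "gasoline": "GAS", "ev": "ELEC"}

# One alternation pattern with a single capture group: \b(diesel|gasoline|ev)\b
pattern = r"\b(" + "|".join(map(re.escape, d)) + r")\b"

# Join the text columns of each row with spaces and lowercase the result
text = df[["SUMN", "SOUN", "MATN"]].apply(" ".join, axis=1).str.lower()

# Extract the first keyword found in each row and map it to its abbreviation
df["FUEL"] = text.str.extract(pattern, expand=False).map(d)
print(df)
```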

Check if string is in a pandas dataframe

a['Names'].str.contains('Mel') returns a boolean Series (an indicator vector) with one value per row of a.

Therefore, you can use

mel_count = a['Names'].str.contains('Mel').sum()
if mel_count > 0:
    print("There are {m} Mels".format(m=mel_count))

Or use any() if you don't care how many records match your query:

if a['Names'].str.contains('Mel').any():
    print("Mel is there")
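
Two options of str.contains are often useful here: na=False treats missing names as non-matches, and case=False ignores case. The sketch below uses made-up names to show the effect of each.

```python
import pandas as pd

# Made-up names, including one missing value
a = pd.DataFrame({"Names": ["Mel", "Melanie", "bob", None, "carmel"]})

# Case-sensitive match; missing names count as False
mask = a["Names"].str.contains("Mel", na=False)

# Case-insensitive match; also finds the "mel" in "carmel"
mask_ci = a["Names"].str.contains("mel", case=False, na=False)

print(mask.sum(), mask_ci.sum())
```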

Substring of an entire column in pandas dataframe

Use the str accessor with square brackets:

df['col'] = df['col'].str[:9]

Or str.slice:

df['col'] = df['col'].str.slice(0, 9)
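
As a quick check on made-up data, both forms should give the same result: strings shorter than nine characters are returned unchanged and missing values stay missing.

```python
import pandas as pd

# Made-up column with a long string, a short string and a missing value
df = pd.DataFrame({"col": ["abcdefghijklmnop", "short", None]})

by_index = df["col"].str[:9]          # square-bracket slicing
by_slice = df["col"].str.slice(0, 9)  # equivalent method call

print(by_index)
print(by_index.equals(by_slice))
```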