Create New Column Based on String

How to create a new DataFrame column based on a string-contains condition

You can do it with pd.Series.str.contains by joining the list l into a single regex alternation (an "OR" pattern):

import re
import pandas as pd

df = pd.DataFrame({'Date':['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'], 
                    'Phrases':['I have a cool family', 'I like avocados', 'I would like to go to school', 'I enjoy Harry Potter']})

l=['cool','avocado','lord of the rings']

df['new_column']=df['Phrases'].str.contains('|'.join(l))

df['matched strings']=df['Phrases'].apply(lambda x: ','.join(re.findall('|'.join(l),x)))


df
Out[18]: 
        Date                       Phrases  new_column matched strings
0  10/2/2011          I have a cool family        True            cool
1  11/2/2011               I like avocados        True         avocado
2  12/2/2011  I would like to go to school       False                
3  13/2/2011          I enjoy Harry Potter       False
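
One caveat with '|'.join(l) is that each keyword is treated as a regular expression, so keywords containing characters such as + or . may not match literally. A minimal sketch of one way to handle that, assuming a hypothetical keyword list (keywords) and sample phrases that are not part of the original question:

```python
import re
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({'Phrases': ['I have a cool family',
                               'Is C++ hard?',
                               'I enjoy Harry Potter']})

# Hypothetical keyword list; 'c++' contains regex metacharacters
keywords = ['cool', 'c++']

# re.escape makes every keyword match literally inside the alternation
pattern = '|'.join(re.escape(k) for k in keywords)

# Case-insensitive boolean flag
df['new_column'] = df['Phrases'].str.contains(pattern, case=False)

# All matches per row, joined with commas (empty string when nothing matches)
df['matched strings'] = (
    df['Phrases'].str.findall(pattern, flags=re.IGNORECASE).str.join(',')
)

print(df)
```

Using str.findall keeps the whole operation on the .str accessor instead of calling re.findall inside apply.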

Add new column to pandas data frame based on string + value from another column in the data frame

Use:

df['axis'] = 'up to ' + df['end'].astype(str)
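
As a quick illustration, here is the same line on a small frame. The values in the 'end' column are made up, since the question's data is not shown:

```python
import pandas as pd

# Hypothetical data; only the column name 'end' comes from the answer above
df = pd.DataFrame({'end': [10, 25, 40]})

# astype(str) converts the numbers to strings so they can be concatenated
df['axis'] = 'up to ' + df['end'].astype(str)

print(df)
```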

Creating a new column using string match and based on if-else condition

The root problem here is that your code compares a single string (row['url_text']) to a dataframe (df[df...]).

Instead of referencing df inside your function, just use methods that are defined on the row itself. You can also implement this as a lambda function to be closer to the canonical examples.

df['blocked'] = df.apply(
    lambda row: 1 if 'blocked you' in row['url_text'] else 0,
    axis=1
)
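
If a row-wise apply is not required, a vectorized version is also possible. This is a sketch rather than the asker's exact setup: the sample rows are hypothetical, and it assumes a plain substring match is what is wanted.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'url_text' comes from the question
df = pd.DataFrame({'url_text': ['user blocked you', 'hello there', None]})

# regex=False does a literal substring test; na=False treats missing text as "no match"
df['blocked'] = (
    df['url_text']
      .str.contains('blocked you', regex=False, na=False)
      .astype(int)
)

print(df)
```

The na=False argument matters when the column has missing values, because the `in` test in the lambda raises a TypeError on None.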

Creating new column based on string values from another column

library(stringr)
dom$label = str_extract(dom$Banner, "Watermelon|Vanilla")
dom$label[is.na(dom$label)] <- "Default"
dom
#      Site                              Banner      label
# 1   alpha  testing_Watermelon -DPI_300x250 v2 Watermelon
# 2    beta notest_Vanilla Latte-DPI_300x250 v2    Vanilla
# 3 charlie                         bottle :15s    Default
# 4   delta aaaa vvvv cccc Build_Mobile_320x480    Default

Fill new column based on conditions defined in a string

Here is a solution that converts your condition string into a Python function and then applies it to the rows of your DataFrame:

import re

condition_string =  "colA='yes' & colB='yes' & (colC='yes' | colD='yes'): 'Yes', colA='no' & colB='no' & (colC='no' | colD='no'): 'No', ELSE : 'UNKNOWN'"

# formatting string as python function apply_cond
for col in df.columns:
    condition_string = re.sub(rf"(\W|^){col}(\W|$)", rf"\1row['{col}']\2", condition_string)
    condition_string = re.sub(rf"row\['{col}'\]\s*=(?!=)", f"row['{col}']==", condition_string)

cond_form = re.sub(r'(:[^[(]+), (?!ELSE)', r'\1\n\telif ', condition_string) \
            .replace(": ", ":\n\t\treturn ") \
            .replace("&", "and") \
            .replace('|', 'or')
cond_form = re.sub(r", ELSE\s*:", "\n\telse:", cond_form)
function_def = "def apply_cond(row):\n\tif " + cond_form
#print(function_def) # uncomment to see how the function is defined

# executing the function definition of apply_cond
exec(function_def)

# applying the function to each row
df["result"] = df.apply(apply_cond, axis=1)

print(df)

Output:

     ID colA colB colC colD   result
0  AB01  yes  NaN  yes  yes  UNKNOWN
1  AB02  yes  yes  yes   no      Yes
2  AB03  yes  yes  yes  yes      Yes
3  AB03   no   no   no   no       No
4  AB04   no   no   no  NaN       No
5  AB05  yes  NaN  NaN   no  UNKNOWN
6  AB06  NaN  yes  NaN  NaN  UNKNOWN

You might want to adapt string formatting depending on condition_string (I did it quickly, there might be some unsupported combinations) but if you get those strings automatically it will save you from defining them all over again.
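
For comparison, this is a hand-written function that follows the same rules as condition_string. It is only a sketch of what the generated apply_cond is meant to do, not its exact text, and the DataFrame is reconstructed from the output table above:

```python
import pandas as pd

# Sample data reconstructed from the output table (None stands for NaN)
df = pd.DataFrame({
    'ID':   ['AB01', 'AB02', 'AB03', 'AB03', 'AB04', 'AB05', 'AB06'],
    'colA': ['yes', 'yes', 'yes', 'no', 'no', 'yes', None],
    'colB': [None, 'yes', 'yes', 'no', 'no', None, 'yes'],
    'colC': ['yes', 'yes', 'yes', 'no', 'no', None, None],
    'colD': ['yes', 'no', 'yes', 'no', None, 'no', None],
})

def apply_cond(row):
    # Hand-written equivalent of the rules in condition_string
    if row['colA'] == 'yes' and row['colB'] == 'yes' and (row['colC'] == 'yes' or row['colD'] == 'yes'):
        return 'Yes'
    elif row['colA'] == 'no' and row['colB'] == 'no' and (row['colC'] == 'no' or row['colD'] == 'no'):
        return 'No'
    else:
        return 'UNKNOWN'

df['result'] = df.apply(apply_cond, axis=1)

print(df)
```

Comparing the printed function_def against a version like this is a simple way to check that the string formatting step produced what you expected.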

How can I build a function to create a new column based on other columns containing a certain string?

If you want to create a new column with one of two values (A if the condition is met, else B), you could do something like this:

#create a column 'new' with value 'Brasil' if 'Nationality' value contains 'Bra', else put 'NA'
df['new'] = df['Nationality'].apply(lambda x: 'Brasil' if 'Bra' in x else 'NA')

Otherwise, if you want to create a column and apply multiple rules to it, you could do something like this:

#create a column 'new' and insert value 'ARG' whenever 'Nationality' contains 'Arg', 
df.loc[df['Nationality'].str.contains('Arg'), 'new'] = 'ARG'
#and 'BRA' whenever Nationality contains 'Brazil', without overriding any other values
df.loc[df['Nationality'].str.contains('Brazil'), 'new'] = 'BRA'
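
When there are several rules, numpy.select is another option: it takes a list of conditions and a matching list of values, and the first condition that is true wins. A minimal sketch, with hypothetical sample data and 'NA' chosen here as the fallback value:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({'Nationality': ['Argentina', 'Brazil', 'Chile', None]})

nat = df['Nationality']

# One boolean mask per rule; na=False treats missing values as "no match"
conditions = [
    nat.str.contains('Arg', na=False),
    nat.str.contains('Bra', na=False),
]
choices = ['ARG', 'BRA']

# Rows matching no condition receive the default
df['new'] = np.select(conditions, choices, default='NA')

print(df)
```

Unlike repeated df.loc assignments, this fills every row in one step, so rows that match nothing get an explicit default instead of NaN.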

Adding values in a new column based on a string contained in another column

Use str.extract to get the matching substring from the string-based column, then map it through the dictionary:

d = {'apple': 'A001', 'ball': 'B099', 'fan': 'F009'}

df['category'] = (
    df.descriptions
      .str.lower()
      .str.extract('(' + '|'.join(d.keys()) + ')')
      .squeeze().map(d)
)
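
Here is the same idea on a small frame, as a sketch: the descriptions are made up, and expand=False is swapped in for .squeeze() so that the result is a Series even when the frame has only one row.

```python
import re
import pandas as pd

d = {'apple': 'A001', 'ball': 'B099', 'fan': 'F009'}

# Hypothetical sample data; only the column name 'descriptions' comes from the answer
df = pd.DataFrame({'descriptions': ['Green Apple', 'tennis ball',
                                    'ceiling FAN', 'table']})

# One capture group containing every key, escaped so it matches literally
pattern = '(' + '|'.join(map(re.escape, d)) + ')'

df['category'] = (
    df['descriptions']
      .str.lower()
      .str.extract(pattern, expand=False)  # Series of the first matching key, NaN if none
      .map(d)                              # look up the code for each key
)

print(df)
```

Rows with no matching key end up as NaN in 'category'.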