How to Populate New Column Based on Values in Other Columns

pandas create new column based on values from other columns / apply a function of multiple columns, row-wise

There are two steps. First, write a function that does the translation you want. Here is an example based on your pseudo-code:

def label_race(row):
    if row['eri_hispanic'] == 1:
        return 'Hispanic'
    if row['eri_afr_amer'] + row['eri_asian'] + row['eri_hawaiian'] + row['eri_nat_amer'] + row['eri_white'] > 1:
        return 'Two Or More'
    if row['eri_nat_amer'] == 1:
        return 'A/I AK Native'
    if row['eri_asian'] == 1:
        return 'Asian'
    if row['eri_afr_amer'] == 1:
        return 'Black/AA'
    if row['eri_hawaiian'] == 1:
        return 'Haw/Pac Isl.'
    if row['eri_white'] == 1:
        return 'White'
    return 'Other'

You may want to check the logic, but it seems to do the trick. Note that the parameter passed to the function is a Series object, named "row" here.

Second, use the pandas apply method to call the function on every row, e.g.

df.apply(label_race, axis=1)

Note the axis=1 argument: it means the function is applied to each row rather than to each column. The results are here:

0           White
1        Hispanic
2           White
3           White
4           Other
5           White
6     Two Or More
7           White
8    Haw/Pac Isl.
9           White

If you're happy with those results, then run it again, saving the results into a new column in your original dataframe.

df['race_label'] = df.apply(label_race, axis=1)

The resultant dataframe looks like this (scroll to the right to see the new column):

      lname   fname rno_cd  eri_afr_amer  eri_asian  eri_hawaiian   eri_hispanic  eri_nat_amer  eri_white rno_defined    race_label
0      MOST    JEFF      E             0          0             0              0             0          1       White         White
1    CRUISE     TOM      E             0          0             0              1             0          0       White      Hispanic
2      DEPP  JOHNNY    NaN             0          0             0              0             0          1     Unknown         White
3     DICAP     LEO    NaN             0          0             0              0             0          1     Unknown         White
4    BRANDO  MARLON      E             0          0             0              0             0          0       White         Other
5     HANKS     TOM    NaN             0          0             0              0             0          1     Unknown         White
6    DENIRO  ROBERT      E             0          1             0              0             0          1       White   Two Or More
7    PACINO      AL      E             0          0             0              0             0          1       White         White
8  WILLIAMS   ROBIN      E             0          0             1              0             0          0       White  Haw/Pac Isl.
9  EASTWOOD   CLINT      E             0          0             0              0             0          1       White         White
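
To make the role of axis=1 in the steps above concrete, here is a minimal sketch. The frame `demo` and the helper `describe` are hypothetical and exist only for illustration; just two of the flag column names are borrowed from the answer.

```python
import pandas as pd

# Hypothetical three-row frame; only the column names come from the answer above
demo = pd.DataFrame({"eri_hispanic": [0, 1, 0], "eri_white": [1, 0, 0]})

def describe(row):
    # With axis=1, each call receives one row as a Series indexed by column name
    return f"{type(row).__name__}: {list(row.index)}"

# axis=1: the function is called once per row, giving one result per row
row_wise = demo.apply(describe, axis=1)

# axis=0 (the default): the function is called once per column instead
col_wise = demo.apply(lambda col: col.sum(), axis=0)

print(row_wise)
print(col_wise)
```

Because each row arrives as a Series keyed by column name, a function such as label_race can look up values with row['eri_white'] and similar expressions.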

A new column in pandas which value depends on other columns

To improve on the other answer, I would use pandas apply to iterate over the rows and calculate the new column.

def calc_new_col(row):
    # These are scalar comparisons, so use `and`; `&` binds tighter than `<=`
    if row['col2'] <= 50 and row['col3'] <= 50:
        return row['col1']
    else:
        return max(row['col1'], row['col2'], row['col3'])

df["state"] = df.apply(calc_new_col, axis=1)
# axis=1 makes sure that function is applied to each row

print(df)
            datetime  col1  col2  col3  state
2021-04-10  01:00:00    25    50    50     25
2021-04-10  02:00:00    25    50    50     25
2021-04-10  03:00:00    25   100    50    100
2021-04-10  04:00:00    50    50   100    100
2021-04-10  05:00:00   100   100   100    100

Using apply keeps the code cleaner and makes the function reusable.
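
As a self-contained check, the sketch below rebuilds the frame from the printed table (the datetime column is omitted) and runs the row-wise function. It also shows a column-wise variant with numpy.where, which is a swapped-in technique rather than part of the answer; the names `both_small`, `row_wise` and `vectorized` are illustrative only.

```python
import numpy as np
import pandas as pd

# Values copied from the printed table above (datetime column omitted)
df = pd.DataFrame({
    "col1": [25, 25, 25, 50, 100],
    "col2": [50, 50, 100, 50, 100],
    "col3": [50, 50, 50, 100, 100],
})

def calc_new_col(row):
    # Inside a row-wise function the comparisons are scalars, so `and` applies
    if row["col2"] <= 50 and row["col3"] <= 50:
        return row["col1"]
    return max(row["col1"], row["col2"], row["col3"])

row_wise = df.apply(calc_new_col, axis=1)

# Column-wise variant: `&` combines boolean Series element by element.
# Each comparison needs its own parentheses because `&` binds tighter than `<=`.
both_small = (df["col2"] <= 50) & (df["col3"] <= 50)
vectorized = np.where(
    both_small,
    df["col1"],
    df[["col1", "col2", "col3"]].max(axis=1),
)

print(row_wise.tolist())    # [25, 25, 100, 100, 100]
print(vectorized.tolist())  # [25, 25, 100, 100, 100]
```

Both forms give the same "state" values as the table. The column-wise form avoids a Python-level call per row, which may matter on large frames.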

Create new column based on values from three other columns in R

In addition to the solution by @r2evans in the comment section:

We could use coalesce from the dplyr package:

library(dplyr)

df %>%
  mutate(d = coalesce(a, b, c))

   a  b  c  d
1  1 NA NA  1
2 NA NA  5  5
3  3 NA NA  3
4 NA  4 NA  4
5 NA 50 NA 50

We could use unite from the tidyr package with the na.rm argument:

library(tidyr)
library(dplyr)

df %>% 
  unite(d, a:c, na.rm = TRUE, remove = FALSE)

   d  a  b  c
1  1  1 NA NA
2  5 NA NA  5
3  3  3 NA NA
4  4 NA  4 NA
5 50 NA 50 NA
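
For readers working in pandas rather than R, a rough counterpart of coalesce can be sketched by chaining Series.combine_first. This is an analogy, not a translation of the dplyr code; the frame below re-creates the values from the printed output, with NA represented as NaN.

```python
import numpy as np
import pandas as pd

# Same values as the R example above; NA becomes NaN, so the columns are floats
df = pd.DataFrame({
    "a": [1, np.nan, 3, np.nan, np.nan],
    "b": [np.nan, np.nan, np.nan, 4, 50],
    "c": [np.nan, 5, np.nan, np.nan, np.nan],
})

# Take a where it is present, otherwise b, otherwise c
df["d"] = df["a"].combine_first(df["b"]).combine_first(df["c"])

print(df["d"].tolist())  # [1.0, 5.0, 3.0, 4.0, 50.0]
```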

Creating a new column based on conditions for other columns

Use DataFrame.isna to test which values in the selected columns are missing, then DataFrame.all along axis 1 to test whether every value in a row is True:

import numpy as np

# If necessary, convert string placeholders to real missing values first
df = df.replace(['Nan', 'NaN'], np.nan)

df['col4'] = np.where(df[['col1','col2','col3']].isna().all(axis=1), 'original', 'referenced')

Your solution with Series.isna:

df['col4'] = np.where(df['col1'].isna() & df['col2'].isna() & df['col3'].isna(), 
                     'original', 'referenced')
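
The two variants can be compared on a small frame. The data below is hypothetical, chosen so that one row is entirely missing once the string placeholder has been replaced; `col4_alt` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is entirely missing once "NaN" strings are replaced
df = pd.DataFrame({
    "col1": [np.nan, "x", np.nan],
    "col2": ["NaN", "z", np.nan],
    "col3": [np.nan, np.nan, "y"],
})

# Turn string placeholders into real missing values
df = df.replace(["Nan", "NaN"], np.nan)

# DataFrame variant: True only where all three columns are missing in a row
all_missing = df[["col1", "col2", "col3"]].isna().all(axis=1)
df["col4"] = np.where(all_missing, "original", "referenced")

# Series variant: the same test written out column by column
col4_alt = np.where(
    df["col1"].isna() & df["col2"].isna() & df["col3"].isna(),
    "original",
    "referenced",
)

print(df["col4"].tolist())  # ['original', 'referenced', 'referenced']
```

The DataFrame variant scales more easily when the list of columns grows, since only the column list changes.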

Create new column based on other columns values with conditions

You can do it with pandas.DataFrame.apply:

def get_prc(x):
    individual_rate = x["individual_rate"]
    if individual_rate >= 4:
        return x["review_contents"] + " " + str(individual_rate)
    return "Not positive"
 
df["positive_review_contents"] = df[["individual_rate", "review_contents"]].apply(get_prc, axis=1)

The code above applies the function get_prc row-wise.
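
A runnable sketch of the same idea follows. The ratings and review texts are made up for illustration; only the column names and the function come from the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above
df = pd.DataFrame({
    "individual_rate": [5, 2, 4],
    "review_contents": ["Great stay", "Too noisy", "Clean room"],
})

def get_prc(x):
    # x is one row: keep the review text plus its rating when the rating is 4 or more
    individual_rate = x["individual_rate"]
    if individual_rate >= 4:
        return x["review_contents"] + " " + str(individual_rate)
    return "Not positive"

# Selecting just the two needed columns keeps each row Series small
df["positive_review_contents"] = df[["individual_rate", "review_contents"]].apply(get_prc, axis=1)

print(df["positive_review_contents"].tolist())
# ['Great stay 5', 'Not positive', 'Clean room 4']
```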