Populating Pandas Columns Based on Values in Other Columns

Populate a pandas DataFrame column based on another column and a dictionary value

You can explode the "DIAGNOSES" column, take the first character of each string with .str[0], map it through the diagnoses dictionary to get the types, then group by the original index and aggregate back into a list:

df['DIAGNOSES_TYPE'] = df['DIAGNOSES'].explode().str[0].map(diagnoses).groupby(level=0).apply(list)

Output:

              DIAGNOSES                         DIAGNOSES_TYPE
0                 [A03]                            [Arbitrary]
1            [A03, B23]                    [Arbitrary, Brutal]
2  [A30, B54, D65, C60]  [Arbitrary, Brutal, Dropped, Cluster]
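
For reference, here is a small self-contained sketch of the same chain. The sample rows and the contents of the diagnoses dictionary are assumptions reconstructed from the output above, not data from the original question.

```python
import pandas as pd

# Hypothetical sample data, chosen to reproduce the output shown above.
df = pd.DataFrame({
    'DIAGNOSES': [['A03'], ['A03', 'B23'], ['A30', 'B54', 'D65', 'C60']]
})
# Assumed mapping from the first letter of a code to its type.
diagnoses = {'A': 'Arbitrary', 'B': 'Brutal', 'C': 'Cluster', 'D': 'Dropped'}

df['DIAGNOSES_TYPE'] = (
    df['DIAGNOSES']
    .explode()            # one row per code; the original index is repeated
    .str[0]               # first character of each code
    .map(diagnoses)       # look up the type for that character
    .groupby(level=0)     # regroup by the original row index
    .agg(list)            # collect the types back into one list per row
)
print(df)
```

Because explode keeps the original index, grouping on level=0 puts each list back on the row it came from.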

Populate Pandas Dataframe column from other columns based on a condition and previous row value

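import numpy as np
df['Hlv'] = np.nan
df.loc[df.Close > df.SMA_High, 'Hlv'] = 1
df.loc[df.Close < df.SMA_Low, 'Hlv'] = -1
# Carry the previous row's value forward where neither condition holds.
# Only 'Hlv' is filled, so gaps in other columns are left untouched.
df['Hlv'] = df['Hlv'].ffill()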
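
As an illustration, the same idea can be written with np.select followed by a forward fill. This is only a sketch: the price values below are made up, and np.select is swapped in for the two .loc assignments.

```python
import numpy as np
import pandas as pd

# Made-up prices and bands, for illustration only.
df = pd.DataFrame({
    'Close':    [10, 12, 11, 8, 9],
    'SMA_High': [11, 11, 11, 11, 11],
    'SMA_Low':  [9, 9, 9, 9, 9],
})

# 1 above the upper band, -1 below the lower band, NaN otherwise.
df['Hlv'] = np.select(
    [df['Close'] > df['SMA_High'], df['Close'] < df['SMA_Low']],
    [1.0, -1.0],
    default=np.nan,
)
# Rows inside the band inherit the previous row's value.
df['Hlv'] = df['Hlv'].ffill()
print(df)
```

The first row stays NaN because there is no earlier value to carry forward.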

Populating pandas columns based on values in other columns

First normalize the column names, then use groupby with first:

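import numpy as np

df = df.replace('', np.nan)  # first() skips NaN, so turn empty strings into NaN

# strip digits, then keep only the part after the last '-'
df.columns = df.columns.str.replace(r'\d+', '', regex=True)
df.columns = df.columns.str.split('-').str[-1]
# group columns that now share a name and take the first non-null value;
# groupby(axis=1) is deprecated in recent pandas, df.T.groupby(level=0).first().T is equivalent
newdf = df.groupby(level=0, axis=1).first()
newdf.loc[df.iloc[:, 1].isnull(), :] = df.groupby(level=0, axis=1).last()
newdf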
Output:

       Address      City  ID State
0   6th street      Mpls   1    MN
1      15th St     Flint   2    MI
2    Essexb St  New York   3    NY
3  7 street SE      Mpls   4    MN
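
The original input frame is not shown, so the following is only a sketch with invented column names ('Home-City1', 'Work-City2') that demonstrates the rename-then-coalesce idea. It uses the transpose form of the group-by so it does not rely on the axis=1 argument.

```python
import numpy as np
import pandas as pd

# Invented input: two differently named columns that hold the same field.
df = pd.DataFrame({
    'ID': [1, 2],
    'Home-City1': ['Mpls', ''],
    'Work-City2': ['', 'Flint'],
})

df = df.replace('', np.nan)  # empty strings become NaN so first() skips them
# 'Home-City1' -> 'Home-City' -> 'City'
df.columns = (
    df.columns.str.replace(r'\d+', '', regex=True).str.split('-').str[-1]
)
# Transpose, group rows sharing a name, take the first non-null, transpose back.
newdf = df.T.groupby(level=0).first().T
print(newdf)
```

Note that the round trip through a transpose leaves the columns with object dtype, so a numeric column may need converting back afterwards.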

How to populate a column based on values from multiple columns in python?

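Here is one way of doing it with a regex; it is probably not the most efficient. It assumes every row contains at least one cell matching Sxx (otherwise last_col ends up shorter than the DataFrame and the final assignment fails). Assuming your DataFrame is data_df: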

import pandas as pd
import re

last_col = list()
for index, row in data_df.iterrows():
    for cell in row.to_list():
        # keep the first cell in the row that starts with S followed by digits
        if re.match('S[0-9]+', cell):
            last_col.append(cell)
            break

data_df['last_col'] = last_col
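
A variant that tolerates rows with no match could look like the sketch below. The sample frame and the helper name first_code are hypothetical; the regex is the one used above.

```python
import re

import pandas as pd

# Hypothetical data: the last row has no Sxx cell at all.
data_df = pd.DataFrame({
    'c1': ['x', 'S12', 'y'],
    'c2': ['S01', 'z', 'w'],
    'c3': ['q', 'S99', 'v'],
})

pattern = re.compile(r'S[0-9]+')

def first_code(row):
    """Return the first cell in the row that starts with S + digits, else None."""
    return next(
        (cell for cell in row if isinstance(cell, str) and pattern.match(cell)),
        None,
    )

data_df['last_col'] = data_df.apply(first_code, axis=1)
print(data_df)
```

Using next with a default keeps the result the same length as the frame, and the isinstance check avoids a TypeError on non-string cells.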

Populate two columns based on different values of other two columns

This should do what your question asks:

import pandas as pd
import numpy as np
df = pd.DataFrame({'ID':[1,1,2,2,2,2,3,4,4], 'CURRENT':list('ABCDEFGHI')})
print(df)

from collections import defaultdict
valById = defaultdict(list)
df.apply(lambda x: valById[x['ID']].append(x['CURRENT']), axis = 1)
df = pd.DataFrame([{'ID':k, 'PREVIOUS': v[i-1] if i else np.nan, 'CURRENT': v[i], 'NEXT': v[i+1] if i+1 < len(v) else np.nan} for k, v in valById.items() for i in range(len(v))])
print(df)

Output:

   ID CURRENT
0   1       A
1   1       B
2   2       C
3   2       D
4   2       E
5   2       F
6   3       G
7   4       H
8   4       I

   ID PREVIOUS CURRENT NEXT
0   1      NaN       A    B
1   1        A       B  NaN
2   2      NaN       C    D
3   2        C       D    E
4   2        D       E    F
5   2        E       F  NaN
6   3      NaN       G  NaN
7   4      NaN       H    I
8   4        H       I  NaN
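
The same PREVIOUS and NEXT columns can also be produced with a grouped shift. This is an alternative sketch, not the approach used in the answer above; it reuses the same sample data.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2, 2, 2, 3, 4, 4],
                   'CURRENT': list('ABCDEFGHI')})

grouped = df.groupby('ID')['CURRENT']
df['PREVIOUS'] = grouped.shift(1)    # value from the previous row of the same ID
df['NEXT'] = grouped.shift(-1)       # value from the next row of the same ID
df = df[['ID', 'PREVIOUS', 'CURRENT', 'NEXT']]
print(df)
```

Shifting within each group leaves NaN at the group boundaries, which matches the table above.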

How to populate a dataframe column based on condition met in another column

Your solution works once the parentheses are fixed:

df['B'] = np.where(df['A'] > 2, 'A1',
          np.where(df['A'].between(0, 2), 'A2',
          np.where(df['A'].between(-2, 0), 'A3',
          np.where(df['A'] < -2, 'A4', ''))))

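Alternative with cut (the bins are right-closed, so values exactly equal to 0 or -2 fall into the lower bin, unlike in the between version):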

df['B1'] = pd.cut(df['A'], bins=(-np.inf,-2,0,2,np.inf), labels=('A4','A3','A2','A1'))
print(df)

Output:

     A   B  B1
0 -4.0  A4  A4
1 -3.5  A4  A4
2 -2.5  A4  A4
3 -1.0  A3  A3
4  1.0  A2  A2
5  1.5  A2  A2
6  2.0  A2  A2
7  2.5  A1  A1
8  3.5  A1  A1
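
If the nested np.where calls are hard to read, np.select expresses the same conditions as a flat list. This is a sketch of an alternative, using the values of column A shown above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [-4.0, -3.5, -2.5, -1.0, 1.0, 1.5, 2.0, 2.5, 3.5]})

# Conditions are checked in order; the first one that is True wins.
conditions = [
    df['A'] > 2,
    df['A'].between(0, 2),
    df['A'].between(-2, 0),
    df['A'] < -2,
]
labels = ['A1', 'A2', 'A3', 'A4']

df['B'] = np.select(conditions, labels, default='')
print(df)
```

Because the first matching condition wins, a value of exactly 0 gets 'A2' here, the same as in the nested np.where version.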

How to populate a dataframe column based on a lookup of other columns?

Here's one way:

df['Parent_age'] = df.Parent.map(dict(df[['Child' , 'Age']].values))

# when Parent is not in Child column, then apply get_parent_age
cond = df['Parent_age'].isnull()
df.loc[cond, 'Parent_age'] = df.loc[cond, 'Parent'].map(get_parent_age)
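
To see the two-step lookup end to end, here is a small sketch. The names, ages, and the body of get_parent_age are hypothetical stand-ins; in the original question get_parent_age is a function supplied by the asker.

```python
import numpy as np
import pandas as pd

# Hypothetical family data: 'Dee' is a parent but never appears as a child.
df = pd.DataFrame({
    'Child':  ['Ann', 'Bob', 'Cy'],
    'Parent': ['Bob', 'Dee', 'Ann'],
    'Age':    [10, 40, 35],
})

def get_parent_age(name):
    """Stand-in for the asker's fallback lookup (assumed, not from the source)."""
    return {'Dee': 70}.get(name, np.nan)

# Step 1: look each parent up among the children.
df['Parent_age'] = df['Parent'].map(dict(zip(df['Child'], df['Age'])))

# Step 2: fall back to get_parent_age where step 1 found nothing.
cond = df['Parent_age'].isnull()
df.loc[cond, 'Parent_age'] = df.loc[cond, 'Parent'].map(get_parent_age)
print(df)
```

Doing the dictionary lookup first means the slower fallback only runs for the rows that still need it.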

Populate a column using a loop based on the value in row index 0

You can try to use pd.date_range:

# set your date column as index
df.set_index('date', inplace=True)

# generate one date per row, starting at the first date and stepping back 7 days each time
df.index = pd.date_range(start=df.index[0], freq='-7D', periods=df.shape[0])

The same works without setting the date column as the index:

df['date'] = pd.date_range(start=df.iloc[0]['date'], freq='-7D', periods=df.shape[0])
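
A minimal runnable sketch of the column version, with made-up dates and values, shows what the result looks like:

```python
import pandas as pd

# Made-up frame: only the first date matters, the rest get overwritten.
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-01-29', '2021-03-01', '2021-05-05']),
    'value': [1, 2, 3],
})

# One date per row, starting from row 0 and stepping back 7 days each time.
df['date'] = pd.date_range(start=df['date'].iloc[0], freq='-7D', periods=len(df))
print(df)
```

The negative frequency makes the range run backwards, so no explicit loop over the rows is needed.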
