Add a New Column to a Dataframe Using Matching Values of Another Dataframe

Add a new column to a dataframe using matching values of another dataframe

merge(table1, table2[, c("pid", "val2")], by="pid")

Add in the all.x=TRUE argument in order to keep all of the pids in table1 that don't have matches in table2...

You were on the right track. Here's a way using match...

table1$val2 <- table2$val2[match(table1$pid, table2$pid)]

Create new column in dataframe with match values from other dataframe

Need map by Series first for new column:

df1['Total2'] = df1['Name'].map(df2.set_index('Name')['Total2'])
print (df1)
Name Total Total2
0 Accounting 3 120
1 Reporting 1 100
2 Finance 1 400
3 Audit 1 500
4 Template 2 1200

And then set_index with DataFrame.plot.bar:

df1.set_index('Name').plot.bar()

Append a new column on a dataframe based on other dataframe with matching rows and fill the non-matching ones with value from the existing column

Use numpy.where :

df1['new'] = np.where(df1['UserID'].isin(df2['UserID']), 'test-Red', 
np.where(df1['group'] == 'test','test-NonRed',df1['group']))
print (df1)
UserID group day sp PU new
0 213 test 12/11/14 3 311 test-Red
1 314 control 13/11/14 4 345 control
2 354 test 13/08/14 5 376 test-NonRed

Or numpy.select:

m1 = df1['UserID'].isin(df2['UserID'])
m2 = df1['group'] == 'test'
df1['new'] = np.select([m1,m2], ['test-Red', 'test-NonRed'],default=df1['group'])
print (df1)
UserID group day sp PU new
0 213 test 12/11/14 3 311 test-Red
1 314 control 13/11/14 4 345 control
2 354 test 13/08/14 5 376 test-NonRed

More general solution:

print (df1)
UserID group day sp PU
0 213 test 12/11/14 3 311
1 314 control 13/11/14 4 345
2 354 test 13/08/14 5 376
3 2131 test1 12/11/14 3 311
4 314 control1 13/11/14 4 345
5 354 test1 13/08/14 5 376

df2 = pd.DataFrame({'UserID':[213, 2131]})


m1 = df1['UserID'].isin(df2['UserID'])
m2 = df1['group'].isin(df1.loc[m1, 'group'])

df1['new'] = np.select([m1,m2],
[df1['group'] + '-Red', df1['group'] + '-NonRed'],
default=df1['group'])
print (df1)
UserID group day sp PU new
0 213 test 12/11/14 3 311 test-Red
1 314 control 13/11/14 4 345 control
2 354 test 13/08/14 5 376 test-NonRed
3 2131 test1 12/11/14 3 311 test1-Red
4 314 control1 13/11/14 4 345 control1
5 354 test1 13/08/14 5 376 test1-NonRed

New column based on matching values from another dataframe pandas

Check with stack df1's list columns after re-create with DataFrame then map the value from df2


Also since you asking not using for loop I am using sum , and sum for this case is much slower than *for loop* or itertools


s=pd.DataFrame(df1.column2.tolist()).stack()
df1['New']=s.map(df2.set_index('column3').column4).sum(level=0).apply(set)
df1
Out[36]:
column1 column2 New
0 a1 [A, B] {2, 4, 3, 1}
1 a2 [A, B, C] {3, 5, 4, 2, 1}
2 a3 [B, C] {4, 3, 1, 5}

As I mentioned and most of us suggested , also you can check with For loops with pandas - When should I care?

import itertools
d=dict(zip(df2.column3,df2.column4))


l=[set(itertools.chain(*[d[y] for y in x ])) for x in df1.column2.tolist()]
df1['New']=l

How to add new column from another dataframe based on values in column of first dataframe?

This is done via a join operation which in pandas can be done with .merge().

Kindly try using the following:

df = df.merge(population,how='left',on='Province')

Also please consider reading the following answer for a detailed guide on joins and merges

Pandas populate new dataframe column based on matching columns in another dataframe

Consider the following dataframes df and df2

df = pd.DataFrame(dict(
AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
title= list('zyxwvutsrqponml')
))

df2 = pd.DataFrame(dict(
AUTHOR_NAME=list('AABCCEGG'),
title =list('zwvtrpml'),
CATEGORY =list('11223344')
))

option 1

merge

df.merge(df2, how='left')

option 2

join

cols = ['AUTHOR_NAME', 'title']
df.join(df2.set_index(cols), on=cols)

both options yield

Sample Image

R: How do I add a column to a dataframe based on paired values in another dataframe with different length?

Using left_join from dplyr package:

library(dplyr)

data <- left_join(df1,df2, by="id")

Adding value to new column based on if two other columns match

I tried with this little dataset and it's working, let me know if you have some problems

df1 = pd.DataFrame({"title" :   ["Biocryst ($BCRX) continues to remain undervalued", "AAPL is good, buy it"], 'score' : [120,420]   , 'Stock_name' : ["",""]    })
df2 = pd.DataFrame({'Company name' : ['BioCryst Pharmaceuticals, Inc.','GameStop','Apple Inc.'], 'Symbol' : ["BCRX","GME","AAPL"]})

df1
title score Stock_name
0 Biocryst ($BCRX) continues to remain undervalued 120
1 AAPL is good, buy it 420


df2
Company name Symbol
0 BioCryst Pharmaceuticals, Inc. BCRX
1 GameStop GME
2 Apple Inc. AAPL

for j in range(0,len(df2)):
for i in range(0,len(df1)):
if df2['Symbol'][j] in df1['title'][i]:
df1['Stock_name'][i] = df2['Symbol'][j]


df1

title score Stock_name
0 Biocryst ($BCRX) continues to remain undervalued 120 BCRX
1 AAPL is good, buy it 420 AAPL


Related Topics



Leave a reply



Submit