Transfer Values from One Dataframe to Another

Using data.table:

require(data.table)
dt1 <- data.table(df1, key="id")
dt2 <- data.table(df2)

# join on the key; .() makes data.table treat the ids as key values, not row numbers
dt1[.(dt2$id), .(id, value)]

# id value
# 1: 1 1.000000
# 2: 2 6.210526
# 3: 3 11.421053
# 4: 4 16.631579
# 5: 5 21.842105
# 6: 21 NA
# 7: 22 NA
# 8: 23 NA

Or, using base merge, as @TheodoreLytras mentioned in the comments:

# you don't need to have `v2` column in df2
merge(df2, df1, by="id", all.x=TRUE, sort=FALSE)

# id v2 value
# 1 1 NA 1.000000
# 2 2 NA 6.210526
# 3 3 NA 11.421053
# 4 4 NA 16.631579
# 5 5 NA 21.842105
# 6 21 NA NA
# 7 22 NA NA
# 8 23 NA NA

Copying a column from one DataFrame to another gives NaN values?

The culprit is unalignable indexes

Your DataFrames' indexes are different (and correspondingly, so are the indexes of each column). When you assign a column of one DataFrame to another, pandas tries to align the indexes and, where it cannot, inserts NaNs.

Consider the following examples to understand what this means:

# Setup
import pandas as pd
A = pd.DataFrame(index=['a', 'b', 'c'])
B = pd.DataFrame(index=['b', 'c', 'd', 'f'])
C = pd.DataFrame(index=[1, 2, 3])
# Example of alignable indexes - A & B (complete or partial overlap of indexes)
# A.index    B.index
#    a
#    b          b      (overlap)
#    c          c      (overlap)
#               d
#               f
# Example of unalignable indexes - A & C (no overlap at all)
# A.index    C.index
#    a
#    b
#    c
#               1
#               2
#               3

When there are no overlaps, pandas cannot match even a single value between the two DataFrames to put in the result of the assignment, so the output is a column full of NaNs.
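To see the alignment behaviour on real data, here is a minimal sketch. The frames `left`, `right` and `partial` and their values are made up for illustration; only the index labels matter.

```python
import pandas as pd

# Hypothetical frames: string labels on one side, integer labels on the other.
left = pd.DataFrame({"x": [10, 20, 30]}, index=["a", "b", "c"])
right = pd.DataFrame({"y": [1.0, 2.0, 3.0]}, index=[1, 2, 3])
partial = pd.DataFrame({"y": [1.0, 2.0]}, index=["b", "z"])

# No shared labels: every row of the new column is NaN.
no_overlap = left.copy()
no_overlap["y"] = right["y"]

# One shared label ("b"): only that row receives a value.
some_overlap = left.copy()
some_overlap["y"] = partial["y"]

print(no_overlap)
print(some_overlap)
```

The assignment never raises an error in either case, which is why the NaNs tend to be noticed only later.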

If you're working in an IPython notebook, you can confirm that this is the root cause with:

df1.index.equals(df2.index)
# False
df1.index.intersection(df2.index).empty
# True

You can use any of the following solutions to fix this issue.

Solution 1: Reset both DataFrames' indexes

You may prefer this option if you didn't mean to have different indices in the first place, or if you don't particularly care about preserving the index.

# Optional, if you want a RangeIndex => [0, 1, 2, ...]
# df1.index = pd.RangeIndex(len(df1))
# Homogenize the index values,
df2.index = df1.index
# Assign the columns.
df2[['date', 'hour']] = df1[['date', 'hour']]

If you want to keep the existing index, but as a column, you may use reset_index() instead.
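A rough sketch of the reset_index() route, assuming both frames have the same number of rows in the same order. The sample frames and the `_flat` names are invented for the example.

```python
import pandas as pd

# Hypothetical data with unrelated index labels.
df1 = pd.DataFrame({"date": ["2020-01-01", "2020-01-02"], "hour": [9, 17]},
                   index=["r1", "r2"])
df2 = pd.DataFrame({"price": [1.5, 2.5]}, index=["k1", "k2"])

# reset_index() moves the old labels into a column called "index"
# and gives each frame a fresh RangeIndex (0, 1, ...).
df1_flat = df1.reset_index()
df2_flat = df2.reset_index()

# Both frames now share the same index, so alignment succeeds.
df2_flat[["date", "hour"]] = df1_flat[["date", "hour"]]
print(df2_flat)
```

The old labels of df2 survive in the "index" column, so nothing is lost if you need them later.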



Solution 2: Assign NumPy arrays (bypass index alignment)

This solution will only work if the lengths of the two DataFrames match.

# pandas >= 0.24
df2['date'] = df1['date'].to_numpy()
# pandas < 0.24
df2['date'] = df1['date'].values

To assign multiple columns easily, use,

df2[['date', 'hour']] = df1[['date', 'hour']].to_numpy()
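The length requirement is easy to check for yourself. In this sketch (frame names and values are placeholders), the positional copy works when the row counts match and raises a ValueError when they don't.

```python
import pandas as pd

# Hypothetical frames; df1's index deliberately shares nothing with the others.
df1 = pd.DataFrame({"date": ["d1", "d2", "d3"]}, index=[10, 11, 12])
same_len = pd.DataFrame({"v": [1, 2, 3]})
shorter = pd.DataFrame({"v": [1, 2]})

# Same length: values are copied by position, the index is ignored.
same_len["date"] = df1["date"].to_numpy()

# Different length: pandas refuses the assignment.
try:
    shorter["date"] = df1["date"].to_numpy()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(same_len)
print(error_message)
```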

R: How can I transfer values from one data frame to another data frame depending on certain circumstances?

You need to join the two tables. There are many methods and packages for this, but I am a fan of the tidyverse, so in this case I would use a dplyr join.

Without seeing your table specifics it will look something like this.

df_joined <- left_join(df1, df2, by = c("Country of Birth" = "Country", "Year of Birth" = "Year"))

How to move values from one dataframe to another in pandas?

You really don't need df2 here. You can compute the result directly from df using a few simple reshaping functions: set_index, unstack, and reindex. You just need the list of symbols.

# assumes: import numpy as np
(df.assign(Shares=np.where(df.Order == 'BUY', df.Shares, -df.Shares))
   .drop(columns='Order')  # the positional axis argument was removed in pandas 2.0
   .set_index('Symbol', append=True)['Shares']
   .unstack(1)
   .reindex(df2.columns, axis=1))  # you can replace df2.columns with a list

            GOOG   AAPL  XOM    IBM  Cash
Date
2009-01-14   NaN  150.0  NaN    NaN   NaN
2009-01-21   NaN -150.0  NaN  400.0   NaN
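If you want to try the pipeline end to end, here is a self-contained sketch. The input frame is a guess at what the asker's data might look like (three orders indexed by Date), and `symbols` stands in for df2.columns.

```python
import numpy as np
import pandas as pd

# Assumed input: one row per order, indexed by date.
df = pd.DataFrame(
    {"Symbol": ["AAPL", "AAPL", "IBM"],
     "Order": ["BUY", "SELL", "BUY"],
     "Shares": [150, 150, 400]},
    index=pd.Index(["2009-01-14", "2009-01-21", "2009-01-21"], name="Date"),
)
symbols = ["GOOG", "AAPL", "XOM", "IBM", "Cash"]  # stand-in for df2.columns

# Make SELL quantities negative, then pivot symbols into columns.
signed = df.assign(Shares=np.where(df["Order"] == "BUY", df["Shares"], -df["Shares"]))
wide = (signed.drop(columns="Order")
              .set_index("Symbol", append=True)["Shares"]
              .unstack("Symbol")
              .reindex(symbols, axis=1))
print(wide)
```

Symbols that never appear in the orders (GOOG, XOM, Cash here) come out as all-NaN columns because of the final reindex.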

Copy value from one dataframe to another based on multiple column index

You can use DataFrame.merge: select the two key columns from df1 and omit the on parameter, so that the merge uses the intersection of the column names:

df = df1[['item','shop']].merge(df2)

This works the same as:

df = df1[['item','shop']].merge(df2, on=['item','shop'])
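A quick way to convince yourself that the two calls are equivalent, using made-up item/shop data (the price column is an assumption, the question's real columns may differ):

```python
import pandas as pd

# Hypothetical frames sharing the key columns "item" and "shop".
df1 = pd.DataFrame({"item": ["pen", "pen", "ink"],
                    "shop": ["A", "B", "A"],
                    "price": [1.0, 1.0, 1.0]})
df2 = pd.DataFrame({"item": ["pen", "ink"],
                    "shop": ["B", "A"],
                    "price": [2.5, 7.0]})

keys = df1[["item", "shop"]]

# Without `on`, merge joins on every column name the two frames share.
implicit = keys.merge(df2)
explicit = keys.merge(df2, on=["item", "shop"])

print(implicit)
print(implicit.equals(explicit))
```

Note that the default is an inner join, so rows of df1 with no match in df2 are dropped.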

To make your own solution work, use DataFrame.set_index on both columns to build a MultiIndex before calling update:

df11 = df1.set_index(['item','shop'])
df11.update(df2.set_index(['item','shop']))
df = df11.reset_index()
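Here is the update approach on a small invented dataset, as a sketch. It assumes the (item, shop) pairs are unique in df2; the price column is hypothetical.

```python
import pandas as pd

# Hypothetical frames; df2 carries new prices for two of the three rows.
df1 = pd.DataFrame({"item": ["pen", "pen", "ink"],
                    "shop": ["A", "B", "A"],
                    "price": [1.0, 1.0, 1.0]})
df2 = pd.DataFrame({"item": ["pen", "ink"],
                    "shop": ["B", "A"],
                    "price": [2.5, 7.0]})

# Index both frames by the two key columns so update() can align on them.
indexed = df1.set_index(["item", "shop"])
indexed.update(df2.set_index(["item", "shop"]))  # modifies `indexed` in place
updated = indexed.reset_index()

print(updated)
```

Unlike the merge, this keeps every row of df1 and only overwrites the values that have a match in df2.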

Copy contents from one Dataframe to another based on column values in Pandas

Building off of Rabinzel's answer:

output = df2.merge(df1, how='left', on='First Name', suffixes=[None, '_old'])
df3 = output[['First Name', 'Age', 'Gender', 'Weight', 'Height']]

cols = df1.columns[1:-1]
modval = pd.DataFrame()
for col in cols:
    modval = pd.concat([modval, output[['First Name', col + '_old']][output[col] != output[col + '_old']].dropna()])
    modval.rename(columns={col + '_old': col}, inplace=True)

newentries = df2[~df2['First Name'].isin(df1['First Name'])]
deletedentries = df1[~df1['First Name'].isin(df2['First Name'])]

print(df3, newentries, deletedentries, modval, sep='\n\n')

Output:

  First Name  Age  Gender  Weight Height
0 James 25 Male 155 5'10
1 John 27 Male 175 5'9
2 Patricia 23 Female 135 5'3
3 Mary 22 Female 125 5'4
4 Martin 30 Male 185 NaN
5 Margaret 29 Female 141 NaN
6 Kevin 22 Male 198 6'2

First Name Age Gender Weight
4 Martin 30 Male 185
5 Margaret 29 Female 141

First Name Age Gender Weight Height
2 Matthew 29 Male 183 6'0
5 Rachel 29 Male 123 5'3
6 Jose 20 Male 175 5'11

First Name Age Gender Weight
0 James NaN NaN 165.0
6 Kevin NaN NaN 192.0

How to transfer values from one dataframe to another?

Do you mean, join once on ID and X_A to get X_B, and afterwards ID and Y_A to get Y_B? Note that row 10 is different:

df2 %>%
  left_join(select(df1, ID, X_A, X_B),
            by = c("ID", "X_A")) %>%
  left_join(select(df1, ID, Y_A, Y_B),
            by = c("ID", "Y_A"))

# ID X_A Y_A X_B Y_B
# 1 A 1 1 1 1
# 2 A 2 2 2 2
# 3 A 3 3 3 NA
# 4 A 4 4 4 NA
# 5 A 5 5 5 NA
# 6 A 6 6 NA NA
# 7 A 7 7 NA NA
# 8 A 8 8 NA NA
# 9 A 9 9 NA NA
# 10 A 10 10 NA 10
# 11 B 1 1 NA NA
# 12 B 2 2 NA NA
# 13 B 3 3 NA NA
# 14 B 4 4 NA NA
# 15 B 5 5 NA NA
# 16 B 6 6 NA NA
# 17 B 7 7 NA NA
# 18 B 8 8 8 8
# 19 B 9 9 9 9
# 20 B 10 10 10 10

Base R:

want <- merge(df2, subset(df1, select = c(ID, X_A, X_B)), by = c("ID", "X_A"), all.x = TRUE)
(want <- merge(want, subset(df1, select = c(ID, Y_A, Y_B)), by = c("ID", "Y_A"), all.x = TRUE))

how to pass value from one dataframe to another dataframe?

Store the SQL result in a variable using mkString, then use that variable in your WHERE clause.

Example:

val df=Seq((1,"a"),(2,"b")).toDF("CID","n")
df.createOrReplaceTempView("AAA")

val df1=Seq((1,"a"),(2,"b")).toDF("C_ID","j")
df1.createOrReplaceTempView("NST")

val a=spark.sql("select max(CID) from AAA").collect()(0).mkString
spark.sql(s"select * from NST where C_ID=${a}").show()

// +----+---+
// |C_ID|  j|
// +----+---+
// |   2|  b|
// +----+---+

python - how do I transfer values from one df to another

import pandas as pd

df = pd.DataFrame({
    'Rating': ['A', 'AAA', 'AA', 'BBB', 'BB', 'B'],
    'val': [4560.0, 64.0, 456.0, 34.0, 534.0, 54.0]
})
df
###
Rating val
0 A 4560.0
1 AAA 64.0
2 AA 456.0
3 BBB 34.0
4 BB 534.0
5 B 54.0

Keep df1 as you have it, but don't call set_index() on it yet.

df1 = pd.DataFrame(['AA','AA','AA','AA','A','A'],columns=['Rating'])
df1
###
Rating
0 AA
1 AA
2 AA
3 AA
4 A
5 A

Doing the merge()

df1 = df1.merge(df,left_on='Rating', right_on='Rating')
df1
###
Rating val
0 AA 456.0
1 AA 456.0
2 AA 456.0
3 AA 456.0
4 A 4560.0
5 A 4560.0

Then set_index()

df1.set_index('Rating', inplace=True)
df1
###
val
Rating
AA 456.0
AA 456.0
AA 456.0
AA 456.0
A 4560.0
A 4560.0


With different df1

df1 = pd.DataFrame(['AA', 'A', 'A', 'A', 'AA', 'AA'], columns=['Rating'])
df1
###
Rating
0 AA
1 A
2 A
3 A
4 AA
5 AA

Doing the merge()

df1 = df1.merge(df,left_on='Rating', right_on='Rating', how='left')
df1
###
Rating val
0 AA 456.0
1 A 4560.0
2 A 4560.0
3 A 4560.0
4 AA 456.0
5 AA 456.0
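As a small check of the how='left' variant, the sketch below reuses the lookup table from above and adds a rating ('CCC') that is not in it. That extra row is my own addition to show what happens to unmatched keys.

```python
import pandas as pd

# Lookup table from the answer above.
lookup = pd.DataFrame({
    "Rating": ["A", "AAA", "AA", "BBB", "BB", "B"],
    "val": [4560.0, 64.0, 456.0, 34.0, 534.0, 54.0],
})

# Ratings to fill in; "CCC" is a made-up value with no match in the lookup.
ratings = pd.DataFrame({"Rating": ["AA", "A", "A", "CCC", "AA"]})

# A left merge keeps every row of `ratings` in its original order.
merged = ratings.merge(lookup, on="Rating", how="left")
print(merged)
```

Rows with no match keep their place and get NaN in val, so you can spot them afterwards with merged['val'].isna().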

