Merge Two Data Frames While Keeping the Original Row Order

Merge two data frames while keeping the original row order

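Check out the join function in the plyr package. It works like merge, but it preserves the row order of its first argument (x) whatever the join type. It is often faster than merge, though less featureful: for example, it cannot join on columns that are named differently in the two data frames.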

Using your example data, we would use join like this:

> join(df.2,df.1)
Joining by: class
  object class prob
1      A     2  0.7
2      B     1  0.5
3      D     2  0.7
4      F     3  0.3
5      C     1  0.5

Here are a couple of links describing fixes to the merge function for keeping the row order:

http://www.r-statistics.com/2012/01/merging-two-data-frame-objects-while-preserving-the-rows-order/

http://r.789695.n4.nabble.com/patching-merge-to-allow-the-user-to-keep-the-order-of-one-of-the-two-data-frame-objects-merged-td4296561.html

In R, merge 2 dataframes while maintaining the row order of the first dataframe

You could use join from plyr:

library(plyr)
join(df1, df2, by = 'global.player.id')

Unlike merge, join does not sort the result by the key: the rows stay in the order of df1.

Join/merge dataframes and preserve the row-order

One quick way is:

df_2 = df_2.set_index(['A', 'B'])
temp = df_1.set_index(['A', 'B'])
df_2.update(temp)
df_2.reset_index(inplace=True)

As discussed with @jezrael, and unless I am missing something: if you only need a single column C holding the matching values, rather than both C columns from the original dataframes, then .update() is the quickest way, since there are no unneeded columns to drop afterwards.
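A minimal, self-contained sketch of the .update() approach. The data and the values in column C are hypothetical, made up only for illustration. It assumes the (A, B) pairs are unique within each frame. Rows of df_2 with no match in df_1 keep their existing value.

```python
import pandas as pd

# Hypothetical data: df_2 defines the row order we want to keep.
df_1 = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"], "C": [10.0, 20.0, 30.0]})
df_2 = pd.DataFrame({"A": [3, 1, 4], "B": ["z", "x", "w"], "C": [0.0, 0.0, 0.0]})

df_2 = df_2.set_index(["A", "B"])
temp = df_1.set_index(["A", "B"])

# update() aligns on the index and overwrites matching cells in place;
# the rows of df_2 are never reordered, and unmatched rows are left alone.
df_2.update(temp)
df_2 = df_2.reset_index()

print(df_2)
```

The key (4, "w") has no counterpart in df_1, so its C value stays at 0.0, while the other two rows pick up the values from df_1 in df_2's original order.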

Merge two DataFrames on two columns and keep the order of the original indexes in the result

When constructing the merged dataframe, carry the index values of each dataframe along as columns. Calling reset_index() on both inputs does this; after the merge they appear as index_x and index_y.

merged_df = pd.merge(df1.reset_index(), df2.reset_index(), how="outer", on=['key1', 'key2'])

Use combine_first to combine index_x and index_y:

merged_df['combined_index'] = merged_df.index_x.combine_first(merged_df.index_y)

Sort by combined_index and index_x, drop the helper columns, and reset the index:

output = merged_df.sort_values(
    ['combined_index', 'index_x']
).drop(
    ['index_x', 'index_y', 'combined_index'], axis=1
).reset_index(drop=True)

This results in the following output:

  key1 key2  Value1  Value2
0    K   a5   apple     NaN
1    K   a9     NaN   apple
2    K   a4   guava     NaN
3   A1   a7    kiwi    kiwi
4   A3   a9     NaN   grape
5   A2   a9   grape     NaN
6   B1   b2  banana  banana
7   C2   c7     NaN   guava
8   B9   b8   peach     NaN
9   C3   c1   berry  orange
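For reference, here is a small end-to-end sketch of the same steps on hypothetical data (three rows per frame, not the data from the question). It assumes the index_x and index_y columns come from calling reset_index() on each frame before merging, and that both frames have a default integer index.

```python
import pandas as pd

# Hypothetical inputs; only the (A1, a7) key is shared.
df1 = pd.DataFrame({"key1": ["K", "A1", "B1"],
                    "key2": ["a5", "a7", "b2"],
                    "Value1": ["apple", "kiwi", "banana"]})
df2 = pd.DataFrame({"key1": ["K", "A1", "C2"],
                    "key2": ["a9", "a7", "c7"],
                    "Value2": ["apple", "kiwi", "guava"]})

# reset_index() turns each original index into a column named "index";
# the merge suffixes them to index_x (from df1) and index_y (from df2).
merged_df = pd.merge(df1.reset_index(), df2.reset_index(),
                     how="outer", on=["key1", "key2"])

# Original position of each row, taken from df1 when available, else from df2.
merged_df["combined_index"] = merged_df["index_x"].combine_first(merged_df["index_y"])

# Ties on combined_index are broken by index_x; NaN sorts last,
# so a df1 row comes before a df2-only row at the same position.
output = (
    merged_df.sort_values(["combined_index", "index_x"])
    .drop(["index_x", "index_y", "combined_index"], axis=1)
    .reset_index(drop=True)
)

print(output)
```

Rows that exist only in df2 are interleaved at their original position in df2, directly after the df1 row with the same position.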

How can I merge and maintain the row order of one input?

You can do this with match and subsetting key by the result:

bottles <- key[match(samp, key$num),]
# rownames are odd because they must be unique, clean them up
rownames(bottles) <- seq(NROW(bottles))

Merge data.tables while keeping original order in R

Solution using dplyr (left_join keeps the row order of dt):

library(data.table)
library(dplyr)

set.seed(100)

dt <- data.table(g1=c("A", "B", "C", "D", "E", "F", "L", "O", "P", "J"), 
                 g2=c("G", "D", "C", "H", "K", "J", "L", "U", "I", "R"),
                 value= rnorm(10))

ids <- data.table(labels=c("A", "B", "C", "D", "E", "F", "L", "O", 
                           "P", "J", "G", "H", "K", "U", "I", "R"),
                  ids=c(1:16))

dt %>% 
  left_join(ids, by= c("g1"="labels")) %>% 
  mutate(label_match = g1 == g2)

Which returns:

    g1 g2      value ids label_match
1   A  G -0.50219235   1       FALSE
2   B  D  0.13153117   2       FALSE
3   C  C -0.07891709   3        TRUE
4   D  H  0.88678481   4       FALSE
5   E  K  0.11697127   5       FALSE
6   F  J  0.31863009   6       FALSE
7   L  L -0.58179068   7        TRUE
8   O  U  0.71453271   8       FALSE
9   P  I -0.82525943   9       FALSE
10  J  R -0.35986213  10       FALSE

Merge two data frames while keeping a certain row

UPDATE:

In [139]: df[df.ColumnA.isin(df1.ColumnB)].append(df.loc['row_to_keep'])
Out[139]:
                   ColumnA  Stats
0                     Cake    872
1              Cheese Cake    912
3            Raspberry Jam     91
4                    Bacon    123
row_to_keep            NaN    999

Old answer:

Here is one solution:

In [126]: df.merge(df1, left_on="ColumnA", right_on="ColumnB").append(df.loc['row_to_keep'])
Out[126]:
                   ColumnA  Stats        ColumnB
0                     Cake    872           Cake
1              Cheese Cake    912    Cheese Cake
2            Raspberry Jam     91  Raspberry Jam
3                    Bacon    123          Bacon
row_to_keep            NaN    999            NaN

Explanation:

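df.loc['row_to_keep'] selects one row by its index label ('row_to_keep'), and DataFrame.append(row) appends it to the merged DataFrame. Note that DataFrame.append was removed in pandas 2.0; on current versions, pd.concat does the same job.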

I must admit though, there might be less ugly solutions...
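On pandas 2.0 and later, where DataFrame.append no longer exists, the isin-based variant can be sketched with pd.concat. The frames below are hypothetical stand-ins for df and df1, not the data from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df and df1.
df = pd.DataFrame(
    {"ColumnA": ["Cake", "Cheese Cake", "Muffin", None],
     "Stats": [872, 912, 55, 999]},
    index=[0, 1, 2, "row_to_keep"],
)
df1 = pd.DataFrame({"ColumnB": ["Cake", "Cheese Cake"]})

# Rows of df whose ColumnA appears in df1.ColumnB, in df's original order.
matched = df[df["ColumnA"].isin(df1["ColumnB"])]

# Selecting with a list, df.loc[["row_to_keep"]], returns a one-row DataFrame
# rather than a Series, so concat stacks it as a row with the same columns.
result = pd.concat([matched, df.loc[["row_to_keep"]]])

print(result)
```

Because isin only filters df, the matched rows keep both their original order and their original index labels.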

Merge data frames while keeping length of one and values of other in R

We can use match to find, for each row name of X, its position among the row names of Y. The values of Y are put into a vector with a 0 appended at the end. Y holds three values here, so the 0 sits at position 4, and nomatch=4L makes every unmatched row name pick up that 0. This returns Z as a vector:

Z <- c(unlist(Y, use.names=FALSE), 0)[match(row.names(X), row.names(Y), nomatch=4L)]
Z
[1]  0  0  0 20  0 30  0 40  0  0

To get a data.frame

Z <- data.frame(Z)

Match 2 data frames based on common rows, and preserving the order of rownames

With data.table, you can do this:

library(data.table)
setDT(df2)[setDT(df1),,on="b"][is.na(a), a:=0][]

Output:

    a   b
1:  5 Ccd
2:  9 Kkl
3: 13 Sop
4:  0 Mnn
5:  5 Msg
6:  0 Xxy
7:  0 Zxz
8:  5 Ccd
9:  5 Msg

Or with dplyr:

library(dplyr)
left_join(df1,df2, by="b") %>% mutate(a=if_else(is.na(a),0,as.double(a)))

Output:

     b  a
1: Ccd  5
2: Kkl  9
3: Sop 13
4: Mnn  0
5: Msg  5
6: Xxy  0
7: Zxz  0
8: Ccd  5
9: Msg  5

Input:

df1 <- structure(list(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy", 
"Zxz", "Ccd", "Msg")), row.names = c(NA, -9L), class = "data.frame")

df2 <- structure(list(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L), b = c("Ab", 
"Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj")), row.names = c(NA, 
-7L), class = "data.frame")

Merge nth elements from two columns while keeping the original row order in R

We could do it with ifelse, checking whether the row number is even or odd with the modulo operator %%. Even rows take their value from col2, odd rows from col1:

library(dplyr)
df %>% 
  mutate(col3 = ifelse((row_number() %% 2) == 0, col2, col1))

  col1 col2 col3
1    A    2    A
2    B    1    1
3    D    2    D
4    F    3    3
5    C    1    C
