Interleave columns of two data frames
You got the first step right, which is cbind-ing. Let's say your data frames are d1 with n columns A1, A2, ..., An and d2 with n columns B1, B2, ..., Bn. Then d <- cbind(d1, d2) will give you the data frame containing all the information you wanted, and you just need to re-order its columns. This amounts to generating a vector (1, (n+1), 2, (n+2), ..., n, 2n) to index the data frame columns. You can do this as s <- rep(1:n, each = 2) + (0:1) * n. So finally, your desired data frame is d[s].
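The index arithmetic is independent of R. As a rough illustration only, here is a zero-based Python sketch of the same ordering; the column names are hypothetical placeholders and no data frame library is involved:

```python
# Zero-based analogue of rep(1:n, each = 2) + (0:1) * n.
# The column names below are hypothetical placeholders.
n = 3
combined = ["A1", "A2", "A3", "B1", "B2", "B3"]  # names after column-binding

# For each position i, take column i from the first half,
# then column i + n from the second half.
order = [i + k * n for i in range(n) for k in (0, 1)]
interleaved = [combined[j] for j in order]

print(order)        # [0, 3, 1, 4, 2, 5]
print(interleaved)  # ['A1', 'B1', 'A2', 'B2', 'A3', 'B3']
```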
Interleave 2 Dataframes on certain columns
If the column names are the same, and the only difference is some new column names in one of the DataFrames, it is possible to use:
from toolz import interleave
df3 = pd.DataFrame(interleave([df1.values, df2.values]), columns=df1.columns)
print (df3)
StartLocation StartDevice StartPort EndLocation EndDevice EndPort LinkType \
0 DD1 Switch1 P1 AD1 Switch2 P2 MTP
1 AB11 RU15 P1 AJ11 RU25 P2 None
2 DD2 Switch2 P3 AD2 Switch3 P2 MTP
3 AB12 RU18 P2 AB11 RU35 P2 None
4 DD3 Switch3 P5 AD3 Switch4 P6 MTP
5 AB13 RU19 P3 AB11 RU40 P4 None
Speed
0 1000.0
1 NaN
2 1000.0
3 NaN
4 1000.0
5 NaN
A more general solution that works for any column names is to call DataFrame.align first, so that the columns of both DataFrames are aligned correctly:
df1, df2 = df1.align(df2, join='outer', axis=1)
print (df1)
EndDevice EndLocation EndPort LinkType Speed StartDevice StartLocation \
0 Switch2 AD1 P2 MTP 1000 Switch1 DD1
1 Switch3 AD2 P2 MTP 1000 Switch2 DD2
2 Switch4 AD3 P6 MTP 1000 Switch3 DD3
StartPort
0 P1
1 P3
2 P5
print (df2)
EndDevice EndLocation EndPort LinkType Speed StartDevice StartLocation \
0 RU25 AJ11 P2 NaN NaN RU15 AB11
1 RU35 AB11 P2 NaN NaN RU18 AB12
2 RU40 AB11 P4 NaN NaN RU19 AB13
StartPort
0 P1
1 P2
2 P3
df3 = pd.DataFrame(interleave([df1.values, df2.values]), columns=df1.columns)
print (df3)
EndDevice EndLocation EndPort LinkType Speed StartDevice StartLocation \
0 Switch2 AD1 P2 MTP 1000.0 Switch1 DD1
1 RU25 AJ11 P2 NaN NaN RU15 AB11
2 Switch3 AD2 P2 MTP 1000.0 Switch2 DD2
3 RU35 AB11 P2 NaN NaN RU18 AB12
4 Switch4 AD3 P6 MTP 1000.0 Switch3 DD3
5 RU40 AB11 P4 NaN NaN RU19 AB13
StartPort
0 P1
1 P1
2 P3
3 P2
4 P5
5 P3
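A self-contained sketch of the align-then-interleave idea follows. It uses two small hypothetical frames, assumes both have the same number of rows, and swaps in plain NumPy slicing for toolz.interleave to avoid the extra dependency:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frames; df2 lacks the LinkType and Speed columns.
df1 = pd.DataFrame({"StartDevice": ["Switch1", "Switch2"],
                    "LinkType": ["MTP", "MTP"],
                    "Speed": [1000, 1000]})
df2 = pd.DataFrame({"StartDevice": ["RU15", "RU18"]})

# Outer-align on the column axis so both frames share the same columns
# in the same order; columns missing from one frame are filled with NaN.
a1, a2 = df1.align(df2, join="outer", axis=1)

# Alternate the rows: row 0 of a1, row 0 of a2, row 1 of a1, ...
# (assumes len(a1) == len(a2)).
values = np.empty((len(a1) + len(a2), a1.shape[1]), dtype=object)
values[0::2] = a1.to_numpy(dtype=object)
values[1::2] = a2.to_numpy(dtype=object)

df3 = pd.DataFrame(values, columns=a1.columns)
print(df3)
```

Because the values pass through an object array, the resulting columns have object dtype; converting them back (for example with DataFrame.infer_objects) may be needed in practice.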
Another idea with Index.union and DataFrame.reindex:
cols = df1.columns.union(df2.columns, sort=False)
df1 = df1.reindex(cols, axis=1)
df2 = df2.reindex(cols, axis=1)
print (df1)
StartLocation StartDevice StartPort EndLocation EndDevice EndPort LinkType \
0 DD1 Switch1 P1 AD1 Switch2 P2 MTP
1 DD2 Switch2 P3 AD2 Switch3 P2 MTP
2 DD3 Switch3 P5 AD3 Switch4 P6 MTP
Speed
0 1000
1 1000
2 1000
print (df2)
StartLocation StartDevice StartPort EndLocation EndDevice EndPort LinkType \
0 AB11 RU15 P1 AJ11 RU25 P2 NaN
1 AB12 RU18 P2 AB11 RU35 P2 NaN
2 AB13 RU19 P3 AB11 RU40 P4 NaN
Speed
0 NaN
1 NaN
2 NaN
df3 = pd.DataFrame(interleave([df1.values, df2.values]), columns=cols)
print (df3)
StartLocation StartDevice StartPort EndLocation EndDevice EndPort LinkType \
0 DD1 Switch1 P1 AD1 Switch2 P2 MTP
1 AB11 RU15 P1 AJ11 RU25 P2 NaN
2 DD2 Switch2 P3 AD2 Switch3 P2 MTP
3 AB12 RU18 P2 AB11 RU35 P2 NaN
4 DD3 Switch3 P5 AD3 Switch4 P6 MTP
5 AB13 RU19 P3 AB11 RU40 P4 NaN
Speed
0 1000.0
1 NaN
2 1000.0
3 NaN
4 1000.0
5 NaN
R interleave two data frames with same column names
An option is to cbind the tables, use setcolorder on the order of the concatenated column names, and then use make.unique if the intention is to identify the before/after on the duplicate column names:
library(data.table)
out <- setcolorder(cbind(dt1, dt2), order(c(names(dt1), names(dt2))))[]
setnames(out, make.unique(names(out)))[]
out[, setdiff(names(dt1), names(dt2)) := NULL][]
# a.before a.after c.before c.after d
#1: 1 a 1 a 1
#2: 2 b 2 b 2
#3: 3 c 3 c 3
If we need to specifically use the before/after suffixes:
out <- setcolorder(cbind(dt1, dt2), order(c(names(dt1), names(dt2))))[]
out[, setdiff(names(dt1), names(dt2)) := NULL][]
i1 <- duplicated(names(out), fromLast = TRUE)
i2 <- duplicated(names(out))
names(out)[i1] <- paste0(names(out)[i1], ".before")
names(out)[i2] <- paste0(names(out)[i2], ".after")
out
# a.before a.after c.before c.after d
#1: 1 a 1 a 1
#2: 2 b 2 b 2
#3: 3 c 3 c 3
How can I interleave rows from 2 data frames together?
Assign row numbers to each data frame independently, then bind the rows and sort/arrange by row number and data frame id. In this example, row numbers are trivial since the ids are sequential and act as row number. But in the general case, row numbers should be used.
Here's an example using dplyr:
df1 %>%
mutate(row_number = row_number()) %>%
bind_rows(df2 %>% mutate(row_number = row_number())) %>%
arrange(row_number, df)
Output:
df id chr row_number
(dbl) (int) (chr) (int)
1 1 1 puppies 1
2 2 1 kitties 1
3 1 2 puppies 2
4 2 2 kitties 2
5 1 3 puppies 3
6 2 3 kitties 3
7 1 4 puppies 4
8 2 4 kitties 4
9 1 5 puppies 5
10 2 5 kitties 5
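For comparison, the same "number the rows, bind, then sort" recipe can be sketched in pandas. The frames below are hypothetical stand-ins for the ones in the example, and the column names are reused from its output:

```python
import pandas as pd

# Hypothetical inputs mirroring the example: a frame id `df`, an `id`, and a label.
df1 = pd.DataFrame({"df": [1, 1, 1], "id": [1, 2, 3], "chr": ["puppies"] * 3})
df2 = pd.DataFrame({"df": [2, 2, 2], "id": [1, 2, 3], "chr": ["kitties"] * 3})

# Number the rows of each frame independently, starting at 1.
numbered = [f.assign(row_number=list(range(1, len(f) + 1))) for f in (df1, df2)]

# Bind the rows, then order by row number first and frame id second.
out = (pd.concat(numbered, ignore_index=True)
         .sort_values(["row_number", "df"])
         .reset_index(drop=True))
print(out)
```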
Interleaving two data.frames of data.frames in R
We need to use Map for interleaving the corresponding elements of both lists:
library(gdata)
out_lst <- Map(interleave, new_list, second_list)
Or another option is map2 from purrr:
library(purrr)
out_lst <- map2(new_list, second_list, interleave)
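The pairwise pattern (apply an interleaving function to the i-th element of each list) looks much the same in pandas. In this sketch, interleave_rows is a hypothetical helper written for illustration, not a library function, and the list contents are made up:

```python
import pandas as pd

def interleave_rows(a, b):
    """Hypothetical helper: alternate the rows of two frames by position."""
    a = a.reset_index(drop=True)
    b = b.reset_index(drop=True)
    # mergesort is stable, so for equal index labels the row from `a` stays first.
    return (pd.concat([a, b])
              .sort_index(kind="mergesort")
              .reset_index(drop=True))

# Two hypothetical lists of frames, paired element by element.
first_list = [pd.DataFrame({"x": [1, 2]}), pd.DataFrame({"x": [10, 20]})]
second_list = [pd.DataFrame({"x": [3, 4]}), pd.DataFrame({"x": [30, 40]})]

# zip plays the role of Map / map2 here.
out_lst = [interleave_rows(a, b) for a, b in zip(first_list, second_list)]
print(out_lst[0])
```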
Interweave two dataframes
Use pd.concat to combine the DataFrames, and toolz.interleave to reorder the columns:
from toolz import interleave
pd.concat([d1, d2], axis=1)[list(interleave([d1, d2]))]
The resulting output is as expected:
0 3 1 4 2
a 1 0 1 0 1
b 1 0 1 0 1
c 1 0 1 0 1
Combining two dataframes with alternating column position
We can use the matrix route to bind the column names into a dim structure and then concatenate (c):
library(dplyr)
bind_cols(df1, df2) %>%
dplyr::select(all_of(c(matrix(names(.), ncol = 3, byrow = TRUE))))
-output
# A tibble: 4 × 6
b b_B a a_A c c_C
<int> <int> <int> <int> <int> <int>
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
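To see why the matrix trick yields an alternating order, here is a small language-neutral sketch in plain Python. The names are taken from the output above; the grid is an illustration of the idea, not the R implementation:

```python
# Column names after binding: df1's columns first, then df2's.
names = ["b", "a", "c", "b_B", "a_A", "c_C"]
ncol = 3

# Fill a 2 x 3 grid row by row (the byrow = TRUE step).
rows = [names[i:i + ncol] for i in range(0, len(names), ncol)]

# Read the grid column by column (what c() does to a matrix).
order = [row[j] for j in range(ncol) for row in rows]
print(order)  # ['b', 'b_B', 'a', 'a_A', 'c', 'c_C']
```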
Interleave two columns of a data.frame
First let's create some data:
dd = data.frame(x = 1:10, y = LETTERS[1:10])
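Next, we need to make sure the y column is a character and not a factor (otherwise, it will be converted to a numeric). In R 4.0.0 and later, data.frame() no longer converts strings to factors by default, so this step is only needed on older versions: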
dd$y = as.character(dd$y)
Then we transpose the data frame and convert to a vector:
as.vector(t(dd))
However, a more pertinent question is why you would want to do this.
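The transpose-and-flatten step amounts to zipping the two columns and flattening the pairs. A rough Python analogue is shown below, using plain lists rather than a data frame; the explicit str() call is an assumption standing in for the coercion to a single type that a mixed-type matrix forces:

```python
# Plain-list stand-ins for the x and y columns.
x = list(range(1, 11))
y = list("ABCDEFGHIJ")

# Pair up corresponding elements, then flatten the pairs into one sequence.
# Everything is converted to str so the result holds a single type.
interleaved = [str(v) for pair in zip(x, y) for v in pair]
print(interleaved[:6])  # ['1', 'A', '2', 'B', '3', 'C']
```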
Pandas - Interleave / Zip two DataFrames by row
You can sort the index after concatenating and then reset the index, i.e.:
import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])
concat_df = pd.concat([df1,df2]).sort_index().reset_index(drop=True)
Output :
0 1 2
0 a b c
1 A B C
2 d e f
3 D E F
EDIT (OmerB): To interleave by row position regardless of the index values, while keeping the original index labels:
import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']]).reset_index()
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']]).reset_index()
concat_df = pd.concat([df1,df2]).sort_index().set_index('index')
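One caveat worth sketching: sort_index defaults to kind='quicksort', which pandas does not document as stable, so the relative order of rows that share an index label is not guaranteed in general. A hedged variant that requests a stable sort explicitly:

```python
import pandas as pd

df1 = pd.DataFrame([["a", "b", "c"], ["d", "e", "f"]])
df2 = pd.DataFrame([["A", "B", "C"], ["D", "E", "F"]])

# Both frames carry index labels 0 and 1, so sorting the concatenated
# index pairs the rows up. A stable sort (mergesort) keeps df1's row
# ahead of df2's row whenever the labels tie.
zipped = (pd.concat([df1, df2])
            .sort_index(kind="mergesort")
            .reset_index(drop=True))
print(zipped)
```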