cbind a dataframe with an empty dataframe - cbind.fill?
Here's a cbind fill:
cbind.fill <- function(...) {
  # Coerce every input to a matrix, then pad the shorter ones with NA rows
  nm <- list(...)
  nm <- lapply(nm, as.matrix)
  n <- max(sapply(nm, nrow))
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}
Let's try it:
x <- matrix(1:10, 5, 2)
y <- matrix(1:16, 4, 4)
z <- matrix(1:12, 2, 6)
cbind.fill(x, y)
cbind.fill(x, y, z)
cbind.fill(mtcars, mtcars[1:10, ])
I think I stole this from somewhere.
EDIT STOLE FROM HERE: LINK
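One thing to keep in mind is that as.matrix coerces a data frame with mixed column types to a single type, so the result above is a matrix rather than a data frame. Below is a minimal sketch of a variant that keeps data frames as data frames by padding through row indexing. The name cbind_fill_df and the sample data are assumptions made for illustration and are not part of any package.

```r
# Hypothetical helper: pad each data frame with NA rows, then column-bind.
cbind_fill_df <- function(...) {
  dfs <- list(...)
  n <- max(vapply(dfs, nrow, integer(1)))
  padded <- lapply(dfs, function(d) {
    # Indexing past the last row of a data frame yields rows of NA
    d <- d[seq_len(n), , drop = FALSE]
    rownames(d) <- NULL
    d
  })
  do.call(cbind, padded)
}

# Illustrative data (not from the original answer)
a <- data.frame(id = 1:3, name = c("x", "y", "z"))
b <- data.frame(score = c(0.5, 0.7))
res <- cbind_fill_df(a, b)
print(res)
```

Column types are preserved here because nothing is converted to a matrix; the shorter input simply gains NA rows at the bottom.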
cbind error when merging list containing empty data frames
If there are empty list elements, we can remove them:
i1 <- (sapply(b, nrow) > 0) & (sapply(a, nrow) > 0)
Map(cbind, b[i1], a[i1])
Or put the condition inside Map itself:
out <- Map(function(x, y) if (nrow(x) > 0 && nrow(y) > 0) cbind(x, y), b, a)
This assumes that the remaining pairs of list elements have the same number of rows. Pairs with an empty element come back as NULL.
If we need to keep the original dataset when one of the pair is empty, we can do:
Map(function(x, y) {
  if (is.null(x) || NROW(x) == 0) {
    y
  } else if (is.null(y) || NROW(y) == 0) {
    x
  } else {
    cbind(x, y)
  }
}, b, a)
data
a <- list(head(mtcars), data.frame(col1 = numeric(0)), head(iris))
b <- list(data.frame(col1 = numeric(0)), head(iris), head(mtcars))
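To see what the last approach returns on this data, here is a small self-contained run. The helper name pick_or_bind is an assumption introduced only for readability; the logic is the same as the Map call above.

```r
a <- list(head(mtcars), data.frame(col1 = numeric(0)), head(iris))
b <- list(data.frame(col1 = numeric(0)), head(iris), head(mtcars))

# Hypothetical helper: keep the non-empty element, or bind when both have rows
pick_or_bind <- function(x, y) {
  if (is.null(x) || NROW(x) == 0) {
    y
  } else if (is.null(y) || NROW(y) == 0) {
    x
  } else {
    cbind(x, y)
  }
}

out <- Map(pick_or_bind, b, a)

# Column counts per list element: mtcars alone, iris alone, then both bound
col_counts <- sapply(out, ncol)
print(col_counts)
```

The first two elements fall back to the non-empty data frame, and only the third pair is actually bound.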
cbind dataframe in R with placeholders
Something like this maybe?
Assuming ls()
# [1] "data.frame1" "data.frame2" "data.frame3"
as.data.frame(Reduce("cbind", sapply(ls(), function(i) get(i))))
Based on @akrun's comment, this can be simplified to
as.data.frame(Reduce("cbind", mget(ls())))
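Since ls() returns every object in the workspace, it may be safer to select the data frames by name first. A minimal sketch, assuming the objects follow the data.frame1, data.frame2, ... naming shown above; the column contents and the extra other_object are made up for illustration.

```r
# Illustrative objects (contents are assumptions)
data.frame1 <- data.frame(a = 1:3)
data.frame2 <- data.frame(b = 4:6)
data.frame3 <- data.frame(c = 7:9)
other_object <- "not a data frame"

# Look up names in the current environment and keep only the matching ones
here <- environment()
nms <- ls(here, pattern = "^data\\.frame[0-9]+$")
combined <- Reduce(cbind, mget(nms, envir = here))
print(combined)
```

The pattern argument keeps unrelated objects such as other_object out of the cbind call.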
Replacement of plyr::cbind.fill in dplyr?
Here's a way with some purrr and dplyr functions. Create column names to represent each data frame. Since each has only one column, this is easy with setNames, but with more columns you could use dplyr::rename. Do a full join across the whole list based on the original row names, and fill NAs with 0.
library(dplyr)
library(purrr)
l1 %>%
  imap(~setNames(.x, .y)) %>%
  map(tibble::rownames_to_column) %>%
  reduce(full_join, by = "rowname") %>%
  mutate_all(tidyr::replace_na, 0)
#>   rowname df1 df2 df3 df4
#> 1       A   1   0   9   4
#> 2       B   2   2   0   0
#> 3       C   3   0   3   0
#> 4       D   0   6   6   0
#> 5       E   0   0   0  12
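For comparison, the same idea can be sketched in base R with Reduce and merge instead of purrr and dplyr. The list l1 is not shown in the answer, so the data below is an assumption reconstructed to be consistent with the printed output; the column name x is likewise hypothetical.

```r
# Assumed input: a named list of one-column data frames with row names
l1 <- list(
  df1 = data.frame(x = c(1, 2, 3), row.names = c("A", "B", "C")),
  df2 = data.frame(x = c(2, 6),    row.names = c("B", "D")),
  df3 = data.frame(x = c(9, 3, 6), row.names = c("A", "C", "D")),
  df4 = data.frame(x = c(4, 12),   row.names = c("A", "E"))
)

# Move row names into a key column and name the value column after the list element
keyed <- Map(function(d, nm) {
  out <- data.frame(rowname = rownames(d), value = d[[1]],
                    stringsAsFactors = FALSE)
  names(out)[2] <- nm
  out
}, l1, names(l1))

# Full outer join across the whole list, then replace NA with 0
joined <- Reduce(function(x, y) merge(x, y, by = "rowname", all = TRUE), keyed)
joined[is.na(joined)] <- 0
print(joined)
```

merge sorts by the key column by default, which is why the rows come out in A to E order.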
cbind 2 dataframes with different number of rows
I think you should instead use merge:
merge(df1, df2, by = "year", all = TRUE)
For your data:
df1 = data.frame(matrix(0, 7, 4))
names(df1) = c("year", "avg", "hr", "sal")
df1$year = 2010:2016
df1$avg = c(.3, .29, .275, .280, .295, .33, .315)
df1$hr = c(31, 30, 14, 24, 18, 26, 40)
df1$sal = c(2000, 4000, 600, 800, 1000, 7000, 9000)
df2 = data.frame(matrix(0, 5, 3))
names(df2) = c("year", "pos", "fld")
df2$year = c(2010, 2011, 2013, 2014, 2015)
df2$pos = c('A', 'B', 'C', 'B', 'D')
df2$fld = c(.99,.995,.97,.98,.99)
cbind is meant to column-bind two data frames that are compatible in every sense. What you are aiming for is an actual merge, where no elements from either data frame are discarded and missing values are filled with NA instead.
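A short self-contained run of that merge call, using the same values as the data above but built in a more compact way, shows where the NA values end up:

```r
df1 <- data.frame(
  year = 2010:2016,
  avg  = c(.3, .29, .275, .280, .295, .33, .315),
  hr   = c(31, 30, 14, 24, 18, 26, 40),
  sal  = c(2000, 4000, 600, 800, 1000, 7000, 9000)
)
df2 <- data.frame(
  year = c(2010, 2011, 2013, 2014, 2015),
  pos  = c("A", "B", "C", "B", "D"),
  fld  = c(.99, .995, .97, .98, .99)
)

# all = TRUE keeps every year from both data frames (full outer join)
merged <- merge(df1, df2, by = "year", all = TRUE)
print(merged)

# Years present only in df1 have NA in the df2 columns
missing_years <- merged$year[is.na(merged$fld)]
print(missing_years)
```

With all = FALSE (the default) the rows for the unmatched years would be dropped instead of filled.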
Binding dataframes of different length (no cbind, no merge)
Edit: In case there are multiple dfs, do this:
- Create a list of all the dfs except one, say the first one.
- Use purrr::reduce to join them all together, passing the first df in the .init argument.
library(dplyr)
library(purrr)
df2 <- data.frame(m = 7:10, n = sample(LETTERS[6:9], 4))
df <- data.frame(v = 1:5, x = sample(LETTERS[1:5], 5))
df3 <- data.frame(bb = 101:110, cc = sample(letters, 10))
reduce(list(df2, df3),
       .init = df %>% mutate(id = row_number()),
       ~ full_join(.x, .y %>% mutate(id = row_number()), by = "id")) %>%
  select(-id)
v x m n bb cc
1 1 A 10 I 101 u
2 2 C 9 H 102 v
3 3 D 8 G 103 n
4 4 E 7 F 104 w
5 5 B NA <NA> 105 s
6 NA <NA> NA <NA> 106 y
7 NA <NA> NA <NA> 107 g
8 NA <NA> NA <NA> 108 i
9 NA <NA> NA <NA> 109 p
10 NA <NA> NA <NA> 110 h
Earlier answer: create a dummy column id in both dfs and use full_join:
full_join(df %>% mutate(id = row_number()), df2 %>% mutate(id = row_number()), by = "id") %>%
  select(-id)
v x m n
1 1 A 10 I
2 2 C 9 H
3 3 D 8 G
4 4 E 7 F
5 5 B NA <NA>
The results differ from the expected output because of a different random number seed.
Or in base R:
merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = TRUE)
id v x m n
1 1 1 A 10 I
2 2 2 C 9 H
3 3 3 D 8 G
4 4 4 E 7 F
5 5 5 B NA <NA>
Remove the extra id column by subsetting with [-1]:
merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = TRUE)[-1]
v x m n
1 1 A 10 I
2 2 C 9 H
3 3 D 8 G
4 4 E 7 F
5 5 B NA <NA>
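The base R version can be extended to more than two data frames in the same spirit as the purrr::reduce edit. This is a sketch only: the helper names add_id and full_by_id are assumptions, and the data uses fixed letters instead of sample() so the result is reproducible.

```r
# Deterministic stand-ins for the data frames above
df  <- data.frame(v = 1:5, x = LETTERS[1:5])
df2 <- data.frame(m = 7:10, n = LETTERS[6:9])
df3 <- data.frame(bb = 101:110, cc = letters[1:10])

# Hypothetical helpers: add a row-index key, and full-join on that key
add_id <- function(d) transform(d, id = seq_len(nrow(d)))
full_by_id <- function(x, y) merge(x, y, by = "id", all = TRUE)

# Join every data frame on its row index, then drop the id column
out <- Reduce(full_by_id, lapply(list(df, df2, df3), add_id))[-1]
print(out)
```

Each data frame is aligned by row position, and shorter ones are padded with NA up to the length of the longest.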