Repeating rows of data.frame in dplyr
This is rife with peril if the data.frame has other columns, because they are dropped (there, I said it!), but a do() block will let you generate a derived data.frame within a dplyr pipe (though, ceci n'est pas un pipe):
library(dplyr)
df <- data.frame(column = letters[1:4], stringsAsFactors = FALSE)
df %>%
do( data.frame(column = rep(.$column, each = 4), stringsAsFactors = FALSE) )
# column
# 1 a
# 2 a
# 3 a
# 4 a
# 5 b
# 6 b
# 7 b
# 8 b
# 9 c
# 10 c
# 11 c
# 12 c
# 13 d
# 14 d
# 15 d
# 16 d
As @Frank suggested, a much better alternative could be
df %>% slice(rep(1:n(), each=4))
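A minimal sketch of why the row-index approach is safer when other columns exist. The data frame df2 and its other column are hypothetical, made up for illustration; only slice(), n() and rep() come from the answer above.

```r
library(dplyr)

# Hypothetical two-column data frame (not part of the original answer)
df2 <- data.frame(column = letters[1:2], other = c(10, 20),
                  stringsAsFactors = FALSE)

# Repeat every row twice by repeating its row index;
# all columns travel along with the selected rows
out <- df2 %>% slice(rep(seq_len(n()), each = 2))
out
```

Because slice() selects whole rows, the other column is carried along, which the do() version above does not do.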
R dplyr repeat dataframe rows by group
Give only one value to the times argument of rep. Since you want to do this by group, you can use any value from the ntimes column.
library(dplyr)
df %>% group_by(my.group) %>% slice(rep(1:n(), first(ntimes)))
#Similar other variations could be
#df %>% group_by(my.group) %>% slice(rep(seq_len(n()), first(ntimes)))
#df %>% group_by(my.group) %>% slice(rep(seq_along(ntimes), first(ntimes)))
# my.group vals ntimes
# <fct> <dbl> <int>
# 1 a 0.110 3
# 2 a 0.273 3
# 3 a 0.491 3
# 4 a 0.110 3
# 5 a 0.273 3
# 6 a 0.491 3
# 7 a 0.110 3
# 8 a 0.273 3
# 9 a 0.491 3
#10 b 0.318 1
#11 b 0.559 1
#12 b 0.263 1
#13 z 0.202 2
#14 z 0.388 2
#15 z 0.888 2
#16 z 0.202 2
#17 z 0.388 2
#18 z 0.888 2
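The difference between a single times value and a vector of times can be seen on a plain index vector. This is a small base R sketch; idx is a hypothetical stand-in for the row indices of one group.

```r
# Hypothetical row indices of one group with three rows
idx <- 1:3

# A single value repeats the whole vector as a block
block_wise <- rep(idx, times = 2)
block_wise

# A vector of the same length repeats each element in place
row_wise <- rep(idx, times = c(2, 2, 2))
row_wise
```

With first(ntimes) each group comes back as repeated blocks, as in the output above; passing the whole ntimes column would presumably repeat each row in place instead.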
Doing this in base R is surprisingly convoluted, or maybe there is a simpler way that I can't figure out:
df[unlist(Map(rep, split(1:nrow(df), df$my.group),
tapply(df$ntimes, df$my.group, head, 1))), ]
data
df <- structure(list(my.group = structure(c(1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L), .Label = c("a", "b", "z"), class = "factor"), vals = c(0.110453,
0.2732849, 0.4905132, 0.318404, 0.5591728, 0.2625931, 0.2018752,
0.3875257, 0.8878698), ntimes = c(3L, 3L, 3L, 1L, 1L, 1L, 2L,
2L, 2L)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9"))
R: Repeating row of dataframe with respect to multiple count columns
Here is a tidyverse option. We can use uncount from tidyr to duplicate the rows according to the count in value (i.e., from the var columns) after pivoting to long format.
library(tidyverse)
df %>%
pivot_longer(starts_with("var"), names_to = "class") %>%
filter(value != 0) %>%
uncount(value) %>%
mutate(class = str_extract(class, "\\d+"))
Output
f1 f2 class
<chr> <chr> <chr>
1 a c 1
2 a c 3
3 a c 3
4 a c 3
5 b d 1
6 b d 2
7 b d 2
Another slight variation is to use expandRows from splitstackshape in conjunction with tidyverse.
library(splitstackshape)
df %>%
pivot_longer(starts_with("var"), names_to = "class") %>%
filter(value != 0) %>%
expandRows("value") %>%
mutate(class = str_extract(class, "\\d+"))
R: Create duplicate rows based on a variable (dplyr preferred)
A nice tidyr function for this is uncount():
df %>%
uncount(sales) %>%
rename(salesTime = time)
salesTime
1 0
2 1
3 2
3.1 2
4 3
5 4
6 5
6.1 5
6.2 5
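As a self-contained illustration of what uncount() does, here is a sketch on made-up data. The data frame counts and the column names id, n and copy are hypothetical; the .id argument is optional and only used here to label the copies.

```r
library(tidyr)

# Hypothetical data: n says how many copies of each row are wanted
counts <- data.frame(id = c("a", "b", "c"), n = c(2, 0, 1),
                     stringsAsFactors = FALSE)

# Expand rows by the count column; .id numbers the copies of each row
out <- uncount(counts, n, .id = "copy")
out
```

Two behaviours are worth noting: the weight column is removed from the result by default, and a row with a count of 0 produces no output rows.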
Repeat/duplicate specific row of data frame and append
You could select the row that you want to duplicate and append it to the original data frame:
library(dplyr)
var1_variable <- 'A'
df %>%
filter(var1 == var1_variable) %>%
slice_max(var2, n = 1) %>%
#For dplyr < 1.0.0
#slice(which.max(var2)) %>%
bind_rows(df, .)
# var1 var2 val
#1 A 1 21
#2 A 2 31
#3 A 3 54
#4 B 4 65
#5 B 5 76
#6 A 3 54
In base R, that can be done as:
df1 <- subset(df, var1 == var1_variable)
rbind(df, df1[which.max(df1$var2), ])
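For a runnable version of the base R variant, the sketch below rebuilds an input data frame from the first five rows of the output shown above; that reconstruction is an assumption, since the original question's data is not included here.

```r
# Input reconstructed from the printed output above (an assumption)
df <- data.frame(var1 = c("A", "A", "A", "B", "B"),
                 var2 = 1:5,
                 val = c(21, 31, 54, 65, 76),
                 stringsAsFactors = FALSE)
var1_variable <- "A"

# Keep only the rows of the chosen group, then pick the one with the largest var2
df1 <- subset(df, var1 == var1_variable)
out <- rbind(df, df1[which.max(df1$var2), ])

# The appended row keeps a row name derived from its source row;
# reset the row names if sequential numbering is wanted
rownames(out) <- NULL
out
```

The reset of the row names is optional and only cosmetic; the dplyr version returns sequential row numbers without it.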
Following another post, we can store the intermediate result in a temporary variable and then bind rows, so that the chain is not broken and the rows are bound to that intermediate result rather than to the original data frame df.
df %>%
#Previous list of commands
{
{. -> temp} %>%
filter(var1 == var1_variable) %>%
slice_max(var2, n = 1) %>%
bind_rows(temp)
}
Add column but duplicate all other row values in dataframe in R
You can repeat each row index 24 times and then assign a new hour column running from 1 to 24, relying on vector recycling.
newdata <- mydata[rep(seq_len(nrow(mydata)), each = 24),]
newdata$hour <- 1:24
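The recycling step is easier to follow on a small case. The sketch below uses a hypothetical mydata with two days and three "hours" instead of 24; the column names Day and temp are made up for illustration.

```r
# Hypothetical input: one row per day
mydata <- data.frame(Day = c("Mon", "Tue"), temp = c(10, 12),
                     stringsAsFactors = FALSE)
hours <- 3  # stands in for 24 to keep the output short

# Repeat each row index 'hours' times, so the copies of a day stay together
newdata <- mydata[rep(seq_len(nrow(mydata)), each = hours), ]

# The shorter vector is recycled because nrow(newdata) is a multiple of its length
newdata$hour <- seq_len(hours)
rownames(newdata) <- NULL
newdata
```

The assignment only recycles cleanly because each = hours guarantees that the number of rows is an exact multiple of the length of the hour vector.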
A couple of tidyverse options:
library(dplyr)
mydata %>% tidyr::uncount(24) %>% group_by(Day) %>% mutate(hour = 1:24)
and
mydata %>% group_by(Day) %>% slice(rep(row_number(), 24)) %>% mutate(hour = 1:24)
Repeat rows with specific value
You can create a new column specifying the number of times each row should be repeated and then use uncount to repeat them.
library(dplyr)
library(tidyr)
df %>%
mutate(repeat_row = ifelse(name1 %in% c('x', 'y'), 2, 1)) %>%
uncount(repeat_row)
# name1 name2
#1 x 0
#2 x 0
#3 y 1
#4 y 1
#5 z 2
Repeat rows of a data.frame N times
EDIT: updated to a better modern R answer.
You can use replicate(), then rbind the result back together. The rownames are automatically altered to run from 1:nrows.
d <- data.frame(a = c(1,2,3),b = c(1,2,3))
n <- 3
do.call("rbind", replicate(n, d, simplify = FALSE))
A more traditional way is to use indexing, but here the rowname altering is not quite so neat (but more informative):
d[rep(seq_len(nrow(d)), n), ]
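The row name difference between the two approaches can be checked directly. This is a small base R sketch on a hypothetical two-row data frame; the names stacked and indexed are introduced only for the comparison.

```r
# Hypothetical two-row data frame
d <- data.frame(a = 1:2, b = c("x", "y"), stringsAsFactors = FALSE)
n <- 2

# Stack n copies of the whole data frame
stacked <- do.call("rbind", replicate(n, d, simplify = FALSE))

# Repeat the row indices n times instead
indexed <- d[rep(seq_len(nrow(d)), n), ]

rownames(stacked)  # sequential row names
rownames(indexed)  # repeated rows get suffixes such as "1.1"
```

Apart from the row names, both results contain the same rows in the same order, so the choice mostly depends on whether the origin of each copy should stay visible.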
Here are improvements on the above, the first two using purrr functional programming. Idiomatic purrr:
purrr::map_dfr(seq_len(3), ~d)
and less idiomatic purrr (identical result, though more awkward):
purrr::map_dfr(seq_len(3), function(x) d)
and finally via indexing rather than list apply, using dplyr:
d %>% slice(rep(row_number(), 3))
Repeat rows of a data.frame
df <- data.frame(a = 1:2, b = letters[1:2])
df[rep(seq_len(nrow(df)), each = 2), ]