iteratively constructed dataframe in R
Pre-allocate! And use a matrix if the data are all the same type. It will be much faster than a data.frame.
For example:
> n <- 1000 # Number of rows
> row <- 1:20*1 # one row
>
> # Adding row, one-by-one
> Data <- data.frame()
> system.time(for(i in 1:n) Data <- rbind(Data,row))
user system elapsed
2.18 0.00 2.18
>
> # Pre-allocated data.frame (Data already holds n rows from the loop above)
> Data <- as.data.frame(Data)
> system.time(for(i in 1:n) Data[i,] <- row)
user system elapsed
0.94 0.00 0.93
>
> # Pre-allocated matrix (fast!)
> Data <- as.matrix(Data)
> system.time({ for(i in 1:n) Data[i,] <- row; Data <- as.data.frame(Data) })
user system elapsed
0 0 0
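In the transcript above, the pre-allocation is implicit: Data is reused from the previous step, so it already has the right dimensions. A minimal sketch of the same idea with the allocation written out explicitly might look like this (the NA_real_ fill value and the final dim() check are illustrative choices, not part of the original answer):

```r
n <- 1000              # number of rows
row <- 1:20 * 1        # one numeric row of length 20

# Allocate the full matrix up front, then fill it row by row.
Data <- matrix(NA_real_, nrow = n, ncol = length(row))
for (i in seq_len(n)) Data[i, ] <- row

# Convert to a data.frame once, after the loop has finished.
Data <- as.data.frame(Data)
dim(Data)
```

Timings are left out on purpose, since they depend on the machine and the R version.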
R iteratively build a dataframe in a loop
You can use replicate, then set the length of each element to pad it to a constant size, and finally convert to a data frame:
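# sample(1:4, 1) draws a single random size; simplify = FALSE always returns a list
myvector <- replicate(10, sample(1:10, sample(1:4, 1)), simplify = FALSE)
myvector <- lapply(myvector, function(x) {
  length(x) <- 4  # pad with NA up to length 4
  x
})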
res <- t(as.data.frame(myvector, row.names = 1:4))
rownames(res) <- 1:10
res
1 2 3 4
1 1 NA NA NA
2 8 3 NA NA
3 3 NA NA NA
4 6 NA NA NA
5 3 10 8 NA
6 7 NA NA NA
7 9 6 NA NA
8 2 NA NA NA
9 8 NA NA NA
10 3 4 NA NA
Create data frame by iteratively adding rows
Using expand.grid:
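Restaurant <- c("B1", "B2", "B3", "B4")  # not defined in the snippet; values inferred from the output below
cmp.lngth <- 2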
StartDate <- as.Date("2017-07-14")
set.seed(1)
df1 <- data.frame(expand.grid(Restaurant, seq(cmp.lngth) + StartDate))
colnames(df1) <- c("Restaurant", "Date")
df1$Billings <- rnorm(nrow(df1), mean = 100, sd = 10)
df1 <- df1[ order(df1$Restaurant, df1$Date), ]
df1
# Restaurant Date Billings
# 1 B1 2017-07-15 93.73546
# 5 B1 2017-07-16 103.29508
# 2 B2 2017-07-15 101.83643
# 6 B2 2017-07-16 91.79532
# 3 B3 2017-07-15 91.64371
# 7 B3 2017-07-16 104.87429
# 4 B4 2017-07-15 115.95281
# 8 B4 2017-07-16 107.38325
How to return iteratively updated data frame from function in R?
To expand somewhat on MrFlick's comments:
The issue here is that functions in R are pass-by-value: df inside df_func is a copy of df (the data.frame with empty columns iter and x) passed to the function. This local copy is never modified, because <<- does not assign to the local variable; it assigns to df in an enclosing environment (here, the global one). So in each iteration of the while loop,
df <<- rbind(df, new_df)
which is equivalent to
df <<- rbind(data.frame(iter=integer(), x=integer()), new_df)
overwrites df in the global environment, resulting in
> df
iter x
1 10 100
after 10 iterations.
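One way around this is to use ordinary local assignment inside the function and return the result. The original df_func is not shown here, so the body below is a hypothetical reconstruction: the loop counter and the x = iter * 10 rule are assumptions chosen only to match the output above.

```r
# Hypothetical reconstruction of df_func; only the <- versus <<- point comes from the answer.
df_func <- function(df, n_iter = 10L) {
  iter <- 0L
  while (iter < n_iter) {
    iter <- iter + 1L
    new_df <- data.frame(iter = iter, x = iter * 10L)
    df <- rbind(df, new_df)  # local assignment: updates the function's own copy
  }
  df  # return the updated copy to the caller
}

df <- data.frame(iter = integer(), x = integer())
df <- df_func(df)  # the caller reassigns the returned value
nrow(df)
```

With this version, all ten rows accumulate, and the caller decides where the result is stored.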
R - loop iteratively through lists and create a dataframe
Using base R:
s <- aggregate(. ~ ind, stack(setNames(l3, seq_along(l3))), identity)
ind values.1 values.2 values.3
1 1 002e2b45555652749339ab9c34359fb6 2 xx
2 2 002e2b433226527493jsab9c34353fb6 4 zz
If you just need the ids:
s$values[,1]
[1] "002e2b45555652749339ab9c34359fb6" "002e2b433226527493jsab9c34353fb6"
How to iteratively add to a data frame in R using a for-loop?
You were very close with your base R approach. You just needed to:
- remove the assignment by the if. This is because in this base R approach, you're just creating a column and not returning the dataframe.
- refer to data[["x"]] (or data$x) rather than just x in the i=1 case.
Here's a complete working example:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)
data <- data.frame(x = 10:15)
for(i in 1:3) {
  x_curnt <- str_c("x_", i)
  x_prior <- str_c("x_", i-1)
  if(i==1){
    data[[x_curnt]] <- data[["x"]] + 1
  } else {
    data[[x_curnt]] <- data[[x_prior]] + 1
  }
}
data
#> x x_1 x_2 x_3
#> 1 10 11 12 13
#> 2 11 12 13 14
#> 3 12 13 14 15
#> 4 13 14 15 16
#> 5 14 15 16 17
#> 6 15 16 17 18
Created on 2022-09-16 by the reprex package (v2.0.1)
How to create new dataframes for each iteration in R for loop?
You can try Map:
m1_m2_list = data.frame(m1_list, m2_list)
result <- Map(filter_data_table, m1_m2_list$m1_list, m1_m2_list$m2_list)
result will be a list of data frames whose length equals nrow(m1_m2_list). Each data frame is the output for one pair of m1_list and m2_list values.
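The original filter_data_table, m1_list and m2_list are not shown, so here is a small self-contained sketch with hypothetical stand-ins (dat, its columns, and the filter logic are all assumptions made for illustration):

```r
# Hypothetical data and filter function; only the Map() pattern comes from the answer.
dat <- data.frame(m1 = c("a", "a", "b", "b"),
                  m2 = c("x", "y", "x", "y"),
                  value = 1:4,
                  stringsAsFactors = FALSE)

filter_data_table <- function(m1_value, m2_value) {
  dat[dat$m1 == m1_value & dat$m2 == m2_value, , drop = FALSE]
}

m1_list <- c("a", "b")
m2_list <- c("y", "x")
m1_m2_list <- data.frame(m1_list, m2_list, stringsAsFactors = FALSE)

# Map() calls filter_data_table once per pair: ("a", "y"), then ("b", "x").
result <- Map(filter_data_table, m1_m2_list$m1_list, m1_m2_list$m2_list)
length(result)
```

Each element of result is an ordinary data frame, so result[[1]] holds the rows for the first pair and result[[2]] the rows for the second.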
Populating a data frame in R in a loop
You could do it like this:
iterations = 10
variables = 2
output <- matrix(ncol=variables, nrow=iterations)
for(i in 1:iterations){
  output[i,] <- runif(2)
}
output
And then turn it into a data.frame:
output <- data.frame(output)
class(output)
What this does:
- create a matrix whose rows and columns match the expected final size
- insert 2 random numbers into one row of the matrix on each iteration
- convert the matrix into a data frame after the loop has finished.
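A matrix only holds one type, so the steps above do not carry over directly when the columns are of mixed types. One common alternative, not part of the original answer, is to pre-allocate a list, store one small data frame per iteration, and bind everything once at the end. A minimal sketch, with hypothetical column names id, label and value:

```r
iterations <- 10

# Pre-allocated list with one slot per iteration.
rows <- vector("list", iterations)
for (i in seq_len(iterations)) {
  rows[[i]] <- data.frame(id = i,
                          label = paste0("run_", i),
                          value = runif(1),
                          stringsAsFactors = FALSE)
}

# Bind all rows in a single call, after the loop has finished.
output <- do.call(rbind, rows)
str(output)
```

The design point is the same as with the matrix: the growing object is not copied on every iteration, and the conversion happens once.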