Iteratively Constructed Dataframe in R

Pre-allocate!!!

And use a matrix if the data are all the same type. It will be much faster than a data.frame.

For example:

> n <- 1000      # Number of rows
> row <- 1:20*1  # One row of 20 numeric values
>
> # Adding rows one by one (every rbind builds a new, larger data.frame)
> Data <- data.frame()
> system.time(for(i in 1:n) Data <- rbind(Data,row))
   user  system elapsed
   2.18    0.00    2.18
>
> # Pre-allocated data.frame (Data already has n rows from the loop above,
> # so the assignments below overwrite rows instead of growing the object)
> Data <- as.data.frame(Data)
> system.time(for(i in 1:n) Data[i,] <- row)
   user  system elapsed
   0.94    0.00    0.93
>
> # Pre-allocated matrix (fast!)
> Data <- as.matrix(Data)
> system.time({ for(i in 1:n) Data[i,] <- row; Data <- as.data.frame(Data) })
   user  system elapsed
      0       0       0
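
When the columns have different types, a matrix is not an option, but the same idea still applies. A minimal sketch, assuming two hypothetical columns id and label (they are not part of the original answer): pre-allocate one vector per column, fill the vectors in the loop, and build the data.frame once at the end.

```r
n <- 1000

# Pre-allocate one vector per column (column names are illustrative)
id    <- integer(n)
label <- character(n)

for (i in seq_len(n)) {
  id[i]    <- i
  label[i] <- paste0("row_", i)
}

# Assemble the data.frame once, after the loop has finished
Data <- data.frame(id = id, label = label, stringsAsFactors = FALSE)
```

Filling plain vectors avoids the per-row overhead of indexing into a data.frame, which is likely where most of the time goes in the slower variants above.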

R iteratively build a dataframe in a loop

You can use replicate to generate the vectors, pad each one to a constant length by assigning to length(x), and finally convert the result to a dataframe:

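# draw 10 vectors of random length (1 to 4); simplify = FALSE always returns a list
myvector <- replicate(10, sample(1:10, sample(1:4, 1)), simplify = FALSE)
# pad every vector to length 4; the missing positions become NA
myvector <- lapply(myvector, function(x) {
  length(x) <- 4
  x
})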
res <- t(as.data.frame(myvector, row.names = 1:4))
rownames(res) <- 1:10
res

   1  2  3  4
1  1 NA NA NA
2  8  3 NA NA
3  3 NA NA NA
4  6 NA NA NA
5  3 10  8 NA
6  7 NA NA NA
7  9  6 NA NA
8  2 NA NA NA
9  8 NA NA NA
10 3  4 NA NA
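
The padding step relies on the fact that assigning a larger value to length(x) extends the vector with NA. A minimal illustration, with made-up values:

```r
x <- c(5, 7)     # a vector of length 2
length(x) <- 4   # extend it; the two new positions are filled with NA
x
```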

Create data frame by iteratively adding rows

Using expand.grid:

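# Restaurant comes from the question; these values are inferred from the output below
Restaurant <- c("B1", "B2", "B3", "B4")
cmp.lngth <- 2
StartDate <- as.Date("2017-07-14")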

set.seed(1)
df1 <- data.frame(expand.grid(Restaurant, seq(cmp.lngth) + StartDate))
colnames(df1) <- c("Restaurant", "Date")
df1$Billings <- rnorm(nrow(df1), mean = 100, sd = 10)
df1 <- df1[ order(df1$Restaurant, df1$Date), ]

df1
#   Restaurant       Date  Billings
# 1         B1 2017-07-15  93.73546
# 5         B1 2017-07-16 103.29508
# 2         B2 2017-07-15 101.83643
# 6         B2 2017-07-16  91.79532
# 3         B3 2017-07-15  91.64371
# 7         B3 2017-07-16 104.87429
# 4         B4 2017-07-15 115.95281
# 8         B4 2017-07-16 107.38325

How to return iteratively updated data frame from function in R?

To somewhat expand on MrFlick's comments:

The issue here is that function arguments in R behave as pass-by-value: df inside df_func is a local copy of the df passed to the function (the data.frame with empty columns iter and x). Because the function uses <<-, this local copy is never modified. Instead, in each iteration of the while loop, the statement

df <<- rbind(df, new_df),

which is equivalent to

df <<- rbind(data.frame(iter=integer(), x=integer()), new_df),

modifies df in the global environment, resulting in

> df
  iter   x
1   10 100

after 10 iterations.
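
One way to avoid this is to update the local copy with a regular assignment and return it. The sketch below is a hypothetical reconstruction: the original df_func is not shown here, so the loop body (including x = i^2) is an assumption chosen only to be consistent with the output above.

```r
# Hypothetical version of df_func; the loop body is assumed, not from the question
df_func <- function(df) {
  i <- 0
  while (i < 10) {
    i <- i + 1
    new_df <- data.frame(iter = i, x = i^2)
    df <- rbind(df, new_df)   # regular assignment updates the local copy
  }
  df                          # return the accumulated data frame
}

df <- data.frame(iter = integer(), x = integer())
df <- df_func(df)             # the caller stores the returned value
```

With this version the global df is only changed by the explicit assignment in the last line, and it ends up with one row per iteration.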

R - loop iteratively through lists and create a dataframe

Using base R:

s <- aggregate(. ~ ind, stack(setNames(l3, seq_along(l3))), identity)
s
  ind                         values.1 values.2 values.3
1   1 002e2b45555652749339ab9c34359fb6        2       xx
2   2 002e2b433226527493jsab9c34353fb6        4       zz

If you just need the IDs:

s$values[,1]
[1] "002e2b45555652749339ab9c34359fb6" "002e2b433226527493jsab9c34353fb6"

How to iteratively add to a data frame in R using a for-loop?

You were very close with your base R approach. You just needed to:

  • remove the assignment of the if's result. In this base R approach, each branch just creates a column in place and does not return the dataframe, so there is nothing to assign.
  • refer to data[["x"]] (or data$x) rather than just x in the i=1 case.

Here's a complete working example:

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)

data <- data.frame(x = 10:15)

for(i in 1:3) {
  x_curnt <- str_c("x_", i)
  x_prior <- str_c("x_", i-1)

  if(i==1){
    data[[x_curnt]] <- data[["x"]] + 1
  } else {
    data[[x_curnt]] <- data[[x_prior]] + 1
  }
}

data
#>    x x_1 x_2 x_3
#> 1 10  11  12  13
#> 2 11  12  13  14
#> 3 12  13  14  15
#> 4 13  14  15  16
#> 5 14  15  16  17
#> 6 15  16  17  18

Created on 2022-09-16 by the reprex package (v2.0.1)
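
The same loop can also be written without any packages. This is a sketch of an alternative, not the code from the answer: it uses paste0 in place of str_c and tracks the name of the previous column in a helper variable (prev, an illustrative name), which removes the need for the i == 1 special case.

```r
data <- data.frame(x = 10:15)

prev <- "x"                        # name of the column to build on
for (i in 1:3) {
  curr <- paste0("x_", i)          # base R equivalent of str_c("x_", i)
  data[[curr]] <- data[[prev]] + 1
  prev <- curr                     # the next iteration builds on this column
}
```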

How to create new dataframes for each iteration in R for loop?

You can try Map -

m1_m2_list = data.frame(m1_list, m2_list)
result <- Map(filter_data_table, m1_m2_list$m1_list, m1_m2_list$m2_list)

result is a list of dataframes with the same length as nrow(m1_m2_list). Each dataframe is the output of filter_data_table for one pair of m1_list and m2_list values.
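
To make the pairing concrete, here is a small self-contained sketch. Everything in it is hypothetical: dt, the contents of m1_list and m2_list, and the body of filter_data_table are stand-ins, since the originals come from the question and are not shown here.

```r
# Hypothetical inputs; the originals come from the question
dt <- data.frame(m1 = c("a", "a", "b", "b"),
                 m2 = c("x", "y", "x", "y"),
                 value = 1:4,
                 stringsAsFactors = FALSE)
m1_list <- c("a", "b")
m2_list <- c("x", "y")

# Hypothetical stand-in: keep the rows matching one (m1, m2) pair
filter_data_table <- function(m1, m2) {
  dt[dt$m1 == m1 & dt$m2 == m2, ]
}

m1_m2_list <- data.frame(m1_list, m2_list, stringsAsFactors = FALSE)

# Map calls the function once per row: ("a", "x"), then ("b", "y")
result <- Map(filter_data_table, m1_m2_list$m1_list, m1_m2_list$m2_list)
```

Map walks the two columns in parallel, so it does not produce every combination of m1 and m2, only the pairs that sit on the same row.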

Populating a data frame in R in a loop

You could do it like this:

iterations <- 10
variables <- 2

# pre-allocate a matrix with one row per iteration
output <- matrix(ncol = variables, nrow = iterations)

for (i in seq_len(iterations)) {
  output[i, ] <- runif(variables)
}

output

And then turn it into a data.frame:

output <- data.frame(output)
class(output)

What this does:

  1. create a matrix whose rows and columns match the expected final size
  2. on each iteration, insert one row of 2 random numbers into the matrix
  3. convert the matrix into a dataframe after the loop has finished.

