Using a Loop to Create Multiple Data Frames in R

You can save your data.frames into a list by setting up the function as follows:

library(jsonlite) # assumed source of fromJSON(); the indexing below relies on its data-frame simplification

getstats <- function(games) {

  listofdfs <- list() # Create a list in which you intend to save your df's.

  for (i in seq_along(games)) { # Loop over the positions of the IDs instead of the IDs themselves

    # Use games[i] instead of i to get the ID.
    # The URL is built from pieces so that no line break ends up inside the string.
    url <- paste0("http://stats.nba.com/stats/boxscoretraditionalv2?EndPeriod=10",
                  "&EndRange=14400&GameID=", games[i],
                  "&RangeType=2&Season=2015-16&SeasonType=Regular+Season",
                  "&StartPeriod=1&StartRange=0000")
    json_data <- fromJSON(paste(readLines(url), collapse = ""))
    df <- data.frame(json_data$resultSets[1, "rowSet"])
    names(df) <- unlist(json_data$resultSets[1, "headers"])
    listofdfs[[i]] <- df # Save your data frames into the list.
  }

  return(listofdfs) # Return the list of data frames.
}

# sprintf() keeps the leading zeros that as.character(0021500580:0021500593) would drop.
gameids <- sprintf("%010d", 21500580:21500593)
getstats(games = gameids)

Please note that I could not test this because the URLs do not seem to be working properly. I get the connection error below:

Error in file(con, "r") : cannot open the connection
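
Because a single failed connection stops the whole loop, it may help to wrap the download step in tryCatch(). The following is only a minimal sketch: safe_read() and fake_fetch() are hypothetical helpers, and the fetch argument exists solely so the example can run without network access.

```r
# Hypothetical helper: returns whatever `fetch` reads from `url`, or NULL
# when the connection fails. `fetch` is injectable only so the sketch runs
# offline; by default it uses readLines().
safe_read <- function(url, fetch = function(u) readLines(u, warn = FALSE)) {
  tryCatch(
    fetch(url),
    error = function(e) {
      message("Skipping ", url, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Offline demonstration with a stand-in fetcher that fails for one URL.
fake_fetch <- function(u) {
  if (grepl("bad", u)) stop("cannot open the connection")
  paste0("payload for ", u)
}

urls <- c("http://example.com/good", "http://example.com/bad")
results <- lapply(urls, safe_read, fetch = fake_fetch)

# Drop the failed downloads before further processing.
results <- Filter(Negate(is.null), results)
length(results)
```

Inside getstats(), the same pattern would mean skipping an ID whose download returned NULL instead of letting the error propagate.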

Creating multiple dataframes in a loop in R

Since you didn't give us any numbers, it is difficult to say exactly what the for loop should look for, so you will need to adapt the condition yourself. Here is a basic example of what you could do. The important part that I think you are missing is assign(), which sends each created data frame to the global environment (or to any other environment you choose). paste0() is a handy way to give each one its own name. Note that some of the data frames may be empty; it may be worthwhile to add an if statement that skips the assignment when nrow(data3) == 0.

Data <- data.frame(matrix(sample(1:10, 80, replace = TRUE), nrow = 20, ncol = 4))
names(Data) <- c("A", "B", "C", "D")
X <- 1:10

for (i in seq_along(X)) {
  data2 <- Data
  data3 <- subset(data2, A == X[i])
  assign(paste0("SubsetData", i), data3, envir = .GlobalEnv)
}
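
The suggestion about skipping empty subsets could look roughly like this. It is a sketch under a few assumptions: the seed is arbitrary, and the results go into a separate environment (here called subsets, a name chosen for illustration) rather than .GlobalEnv so the workspace stays uncluttered.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
Data <- data.frame(matrix(sample(1:10, 80, replace = TRUE), nrow = 20, ncol = 4))
names(Data) <- c("A", "B", "C", "D")
X <- 1:10

# Assumption: a dedicated environment instead of .GlobalEnv.
subsets <- new.env()

for (i in seq_along(X)) {
  data3 <- subset(Data, A == X[i])
  if (nrow(data3) == 0) next   # skip values of X that match no rows
  assign(paste0("SubsetData", i), data3, envir = subsets)
}

# Names of the data frames that were actually created.
ls(subsets)
```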

for loop for creating multiple data frames and assigning values

The assign() function is made for this. See ?assign() for syntax.

a <- c(1,2,3,4)
b <- c("kk","km","ll","k3")
time <- c(2001,2001,2002,2003)
df <- data.frame(a,b,time)
myvalues <- c(2001,2002,2003)

for (i in 1:3) {
assign(paste0("y",i), df[df$time==myvalues[i],])
}

See here for more ways to achieve this.
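
One alternative that avoids numbered variables altogether is to keep the pieces in a named list. This is a minimal sketch using the same toy data; the list name frames is an assumption made for illustration.

```r
a <- c(1, 2, 3, 4)
b <- c("kk", "km", "ll", "k3")
time <- c(2001, 2001, 2002, 2003)
df <- data.frame(a, b, time)
myvalues <- c(2001, 2002, 2003)

# One data frame per value, collected in a list instead of separate objects.
frames <- lapply(myvalues, function(v) df[df$time == v, ])
names(frames) <- paste0("y", seq_along(myvalues))

frames$y1
```

The elements can then be reached as frames$y1 or frames[["y2"]], and the whole list can be passed to lapply() for further processing.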

How do I create multiple dataframes from a result in a for loop in R?

Don't do it in a loop! It is done completely differently. I'll show you step by step.
My first step is to prepare a function that generates data similar to yours.

library(tidyverse)

dens = function(year, n) tibble(
PLOT = paste("HI", sample(1:(n/7), n, replace = T)),
SIZE = runif(n, 0.1, 3),
DENSITY = sample(seq(50,200, by=50), n, replace = T),
SEEDYR = year-1,
SAMPYR = year,
AGE = sample(1:5, n, replace = T),
SHOOTS = runif(n, 0.1, 3)
)

Let's see how it works and generate some sample data frames

set.seed(123)
density.2007 = dens(2007, 120)
density.2008 = dens(2008, 88)
density.2009 = dens(2009, 135)
density.2010 = dens(2010, 156)

The density.2007 data frame looks like this

# A tibble: 120 x 7
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 HI 15 1.67 200 2006 2007 4 1.80
2 HI 14 0.270 150 2006 2007 2 2.44
3 HI 3 0.856 50 2006 2007 3 0.686
4 HI 10 1.25 200 2006 2007 5 1.43
5 HI 11 0.673 50 2006 2007 5 1.40
6 HI 5 2.51 150 2006 2007 3 2.23
7 HI 14 0.543 150 2006 2007 2 2.17
8 HI 5 2.43 200 2006 2007 5 2.51
9 HI 9 1.69 100 2006 2007 4 2.67
10 HI 3 2.02 50 2006 2007 2 2.86
# ... with 110 more rows

Now they need to be combined into one frame

df = density.2007 %>% 
bind_rows(density.2008) %>%
bind_rows(density.2009) %>%
bind_rows(density.2010)

output

# A tibble: 499 x 7
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 HI 15 1.67 200 2006 2007 4 1.80
2 HI 14 0.270 150 2006 2007 2 2.44
3 HI 3 0.856 50 2006 2007 3 0.686
4 HI 10 1.25 200 2006 2007 5 1.43
5 HI 11 0.673 50 2006 2007 5 1.40
6 HI 5 2.51 150 2006 2007 3 2.23
7 HI 14 0.543 150 2006 2007 2 2.17
8 HI 5 2.43 200 2006 2007 5 2.51
9 HI 9 1.69 100 2006 2007 4 2.67
10 HI 3 2.02 50 2006 2007 2 2.86
# ... with 489 more rows
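
The four separate variables and the chained bind_rows() calls can also be replaced by building a list and stacking it in one call. The sketch below deliberately swaps in base R (Map() and do.call(rbind, ...)) instead of the tidyverse, and dens_base() is a simplified, hypothetical stand-in for dens() with fewer columns.

```r
# Simplified stand-in for dens(): base R only, fewer columns (an assumption
# made to keep the sketch short).
dens_base <- function(year, n) {
  data.frame(
    PLOT   = paste("HI", sample(1:(n %/% 7), n, replace = TRUE)),
    SAMPYR = year,
    SHOOTS = runif(n, 0.1, 3)
  )
}

set.seed(123)
years <- c(2007, 2008, 2009, 2010)
sizes <- c(120, 88, 135, 156)

# Map() calls dens_base() once per (year, n) pair and returns a list.
frames <- Map(dens_base, years, sizes)

# Stack every element of the list into one data frame.
combined <- do.call(rbind, frames)
nrow(combined)
```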

In the next step, count how many times each value of the PLOT variable occurs

PLOT.count = df %>% 
group_by(PLOT) %>%
summarise(PLOT.n = n()) %>%
arrange(PLOT.n)

output

# A tibble: 22 x 2
PLOT PLOT.n
<chr> <int>
1 HI 20 3
2 HI 22 5
3 HI 21 7
4 HI 18 12
5 HI 2 19
6 HI 1 20
7 HI 15 20
8 HI 17 21
9 HI 6 22
10 HI 11 23
# ... with 12 more rows

In the penultimate step, let's append these counters to the original data frame

df = df %>% left_join(PLOT.count, by="PLOT")

output

# A tibble: 499 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 15 1.67 200 2006 2007 4 1.80 20
2 HI 14 0.270 150 2006 2007 2 2.44 32
3 HI 3 0.856 50 2006 2007 3 0.686 27
4 HI 10 1.25 200 2006 2007 5 1.43 25
5 HI 11 0.673 50 2006 2007 5 1.40 23
6 HI 5 2.51 150 2006 2007 3 2.23 38
7 HI 14 0.543 150 2006 2007 2 2.17 32
8 HI 5 2.43 200 2006 2007 5 2.51 38
9 HI 9 1.69 100 2006 2007 4 2.67 26
10 HI 3 2.02 50 2006 2007 2 2.86 27
# ... with 489 more rows
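
The count-then-join steps can also be collapsed into a single assignment. As a sketch only, here is a base R variant using ave() in place of group_by(), summarise() and left_join(); the toy data frame and its values are made up for illustration.

```r
# Toy data with a PLOT column (values are made up for illustration).
toy <- data.frame(
  PLOT = c("HI 1", "HI 2", "HI 1", "HI 3", "HI 1", "HI 2"),
  SIZE = c(1.2, 0.4, 2.1, 0.9, 1.7, 2.8)
)

# ave() applies length() within each PLOT group and returns a vector aligned
# with the original rows, so no separate join is needed.
toy$PLOT.n <- ave(seq_len(nrow(toy)), toy$PLOT, FUN = length)

toy
subset(toy, PLOT.n == max(PLOT.n))
```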

Now filter it at will

df %>% filter(PLOT.n > 30)

output

# A tibble: 139 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 14 0.270 150 2006 2007 2 2.44 32
2 HI 5 2.51 150 2006 2007 3 2.23 38
3 HI 14 0.543 150 2006 2007 2 2.17 32
4 HI 5 2.43 200 2006 2007 5 2.51 38
5 HI 8 0.598 50 2006 2007 1 1.70 34
6 HI 7 1.94 50 2006 2007 4 1.61 35
7 HI 14 2.91 50 2006 2007 4 0.215 32
8 HI 7 0.846 150 2006 2007 4 0.506 35
9 HI 7 2.38 150 2006 2007 3 1.34 35
10 HI 7 2.62 100 2006 2007 3 0.167 35
# ... with 129 more rows

Or this way

df %>% filter(PLOT.n == min(PLOT.n))
df %>% filter(PLOT.n == median(PLOT.n))
df %>% filter(PLOT.n == max(PLOT.n))

output

# A tibble: 3 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 20 0.392 200 2009 2010 1 0.512 3
2 HI 20 0.859 150 2009 2010 5 2.62 3
3 HI 20 0.882 200 2009 2010 5 1.06 3
> df %>% filter(PLOT.n == median(PLOT.n))
# A tibble: 26 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 9 1.69 100 2006 2007 4 2.67 26
2 HI 9 2.20 50 2006 2007 4 1.49 26
3 HI 9 0.587 200 2006 2007 3 1.13 26
4 HI 9 1.27 50 2006 2007 1 2.55 26
5 HI 9 1.56 150 2006 2007 3 2.01 26
6 HI 9 0.198 100 2006 2007 3 2.08 26
7 HI 9 2.72 150 2007 2008 3 0.421 26
8 HI 9 0.251 200 2007 2008 2 0.328 26
9 HI 9 1.83 50 2007 2008 1 0.192 26
10 HI 9 1.97 100 2007 2008 1 0.900 26
# ... with 16 more rows
> df %>% filter(PLOT.n == max(PLOT.n))
# A tibble: 38 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 5 2.51 150 2006 2007 3 2.23 38
2 HI 5 2.43 200 2006 2007 5 2.51 38
3 HI 5 2.06 100 2006 2007 5 1.93 38
4 HI 5 1.25 150 2006 2007 4 2.29 38
5 HI 5 2.29 200 2006 2007 1 2.97 38
6 HI 5 0.789 150 2006 2007 2 1.59 38
7 HI 5 1.11 100 2007 2008 4 2.61 38
8 HI 5 2.38 150 2007 2008 4 2.95 38
9 HI 5 2.67 200 2007 2008 3 1.77 38
10 HI 5 2.63 100 2007 2008 1 1.90 38
# ... with 28 more rows

Filter loop to create multiple data frames

df <- data.frame(topic = c("xxx", "xxx", "yyy", "yyy", "yyy", "zzz", "zzz"), 
high = c(52L, 27L, 89L, 99L, 43L, 21L, 90L),
low = c(56L, 98L, 101L, 21L, 98L, 40L, 43L),
stringsAsFactors = FALSE)

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union

for (variable in unique(df$topic)) {
assign(variable, df %>% filter (topic == variable), envir = .GlobalEnv)
}

xxx
#> topic high low
#> 1 xxx 52 56
#> 2 xxx 27 98

yyy
#> topic high low
#> 1 yyy 89 101
#> 2 yyy 99 21
#> 3 yyy 43 98

zzz
#> topic high low
#> 1 zzz 21 40
#> 2 zzz 90 43

Created on 2019-02-13 by the reprex package (v0.2.1)
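
The same result can be reached without an explicit loop by combining split() with list2env(). This is a sketch, not the approach used above; it writes into a fresh environment (topic_env, a name assumed for illustration) rather than .GlobalEnv.

```r
df <- data.frame(topic = c("xxx", "xxx", "yyy", "yyy", "yyy", "zzz", "zzz"),
                 high = c(52L, 27L, 89L, 99L, 43L, 21L, 90L),
                 low = c(56L, 98L, 101L, 21L, 98L, 40L, 43L),
                 stringsAsFactors = FALSE)

# One data frame per topic, kept together in a named list.
by_topic <- split(df, df$topic)

# Optional: copy the list elements into an environment as separate objects.
# A fresh environment is used here (an assumption) rather than .GlobalEnv.
topic_env <- list2env(by_topic, envir = new.env())

ls(topic_env)
nrow(by_topic$yyy)
```

Passing envir = .GlobalEnv instead would create xxx, yyy and zzz in the workspace, as the loop does.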

R - For loop with dplyr for subsetting data across multiple data frames

This should work if you have a list of dataframes:

library(purrr)
library(dplyr)

All_Stations_30_6_17 <- map_dfr(onomata_list, ~ filter(.x, date == "2017-06-30" & time == "17:00"))

Map functions generally iterate through list elements. Here map_dfr will map the purrr-style function (denoted with the ~) over the elements of onomata_list and then row-bind the results (hence the _dfr) into a single data frame.
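
For readers without purrr, the filter-then-row-bind idea can be sketched in base R with lapply() and do.call(rbind, ...). The contents of onomata_list below are hypothetical stand-ins; only the column names date and time come from the call above, and both are assumed to be character columns.

```r
# Hypothetical stand-in for onomata_list: two station data frames.
onomata_list <- list(
  station_a = data.frame(date = c("2017-06-30", "2017-06-30"),
                         time = c("16:00", "17:00"),
                         temp = c(29.1, 30.4),
                         stringsAsFactors = FALSE),
  station_b = data.frame(date = c("2017-06-30", "2017-07-01"),
                         time = c("17:00", "17:00"),
                         temp = c(27.8, 26.5),
                         stringsAsFactors = FALSE)
)

# Filter each element, then row-bind the pieces.
filtered <- lapply(onomata_list, function(d) {
  d[d$date == "2017-06-30" & d$time == "17:00", ]
})
All_Stations_30_6_17 <- do.call(rbind, filtered)

All_Stations_30_6_17
```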

How to create a column on multiple dataframes R

Try this:

# three toy data frames
df1 <- data.frame(x = c(1:10))
df2 <- data.frame(x = c(11:20))
df3 <- data.frame(x = c(21:30))

# make a list
dfl <- list(df1, df2, df3)

# add "castigoex" to each df
for (i in 1:length(dfl)) {
dfl[[i]]$castigoex <- runif(10, 0, 1)
}

# I advise you to keep the dataframes in the list,
# but if you want to split them again
list2env(setNames(dfl, paste0("df", seq_along(dfl))), envir = parent.frame())

If you want to avoid the for loop, you can use mapply:

new_list <- mapply(function(x) "[<-"(x, "castigoex", value = runif(10, 0, 1)),
dfl, SIMPLIFY = FALSE)

# now you have old list and new list. To split the list (with new name for df)
list2env(setNames(new_list,paste0("df_new", seq_along(new_list))),
envir = parent.frame())

If you want to add more than one column, one possible solution is to use the lapply function

new_list2 <- lapply(dfl, function(x) cbind(x, castigoex = runif(10, 0, 1), 
variable2 = runif(10, 0, 1)))

list2env(setNames(new_list2, paste0("df", seq_along(dfl))), envir = parent.frame())
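
The examples above hard-code 10 random values, which only works when every data frame has exactly 10 rows. A small sketch of a more general variant, assuming the data frames may differ in length, sizes the new column from nrow():

```r
df1 <- data.frame(x = 1:10)
df2 <- data.frame(x = 11:15)   # deliberately shorter than df1
dfl <- list(df1 = df1, df2 = df2)

# Size the new column from each data frame instead of hard-coding 10.
dfl <- lapply(dfl, function(d) {
  d$castigoex <- runif(nrow(d), 0, 1)
  d
})

sapply(dfl, nrow)
```

Naming the list elements up front also means lapply() keeps the names, so no separate setNames() call is needed before list2env().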

Loop over multiple data frames with mathematical function

Actually you could do that with by.

fun <- function(x) cbind(x, newcol1=x[, 2] + x[, 5]*cos(x[, 4]), newcol2=x[, 3] - x[, 5]*sin(x[, 4]))

by(df, df$ofs, fun)
# df$ofs: -50
# Dist X Y deg ofs Z newcol1 newcol2
# 2 20.23 482.3 3578 4.77 -50 19.731 479.421 3528.083
# ---------------------------------------------------------------------------------------------
# df$ofs: -25
# Dist X Y deg ofs Z newcol1 newcol2
# 3 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# ---------------------------------------------------------------------------------------------
# df$ofs: 0
# Dist X Y deg ofs Z newcol1 newcol2
# 1 20.21 499.3 3577 4.77 0 19.75 499.3 3577
# 4 20.23 480.3 3578 4.77 0 19.75 480.3 3578
# ---------------------------------------------------------------------------------------------
# df$ofs: 25
# Dist X Y deg ofs Z newcol1 newcol2
# 5 20.23 479.3 3578 4.77 25 19.749 480.7395 3602.959
# ---------------------------------------------------------------------------------------------
# df$ofs: 50
# Dist X Y deg ofs Z newcol1 newcol2
# 6 20.24 478.3 3578 4.77 50 19.74 481.179 3627.917

If you plan to reassemble it:

do.call(rbind, by(df, df$ofs, fun))
# Dist X Y deg ofs Z newcol1 newcol2
# -50 20.23 482.3 3578 4.77 -50 19.731 479.4210 3528.083
# -25 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# 0.1 20.21 499.3 3577 4.77 0 19.750 499.3000 3577.000
# 0.4 20.23 480.3 3578 4.77 0 19.750 480.3000 3578.000
# 25 20.23 479.3 3578 4.77 25 19.749 480.7395 3602.959
# 50 20.24 478.3 3578 4.77 50 19.740 481.1790 3627.917
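
Since cos() and sin() are vectorised, the two new columns can arguably be computed without grouping at all. The sketch below rebuilds df from the printed output (so the values are taken from the output, not from the original question) and refers to columns by name instead of position.

```r
# Data reconstructed from the printed output above.
df <- data.frame(
  Dist = c(20.21, 20.23, 20.23, 20.23, 20.23, 20.24),
  X    = c(499.3, 482.3, 481.3, 480.3, 479.3, 478.3),
  Y    = c(3577, 3578, 3578, 3578, 3578, 3578),
  deg  = 4.77,
  ofs  = c(0, -50, -25, 0, 25, 50),
  Z    = c(19.75, 19.731, 19.741, 19.75, 19.749, 19.74)
)

# The new columns are computed for all rows at once.
res <- transform(df,
                 newcol1 = X + ofs * cos(deg),
                 newcol2 = Y - ofs * sin(deg))

# Sort by ofs to mirror the grouped output.
res[order(res$ofs), ]
```

by() remains useful when the function genuinely needs to see one group at a time, for example to compute a per-group summary.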

