Using a loop to create multiple data frames in R
You can save your data.frames into a list by setting up the function as follows:
library(jsonlite) # assumed source of fromJSON(); the original does not name the package
getstats <- function(games) {
  listofdfs <- vector("list", length(games)) # list that will hold one data frame per game
  for (i in seq_along(games)) { # loop over positions, not over the IDs themselves
    # games[i] is the ID; i is only its position
    # paste0() keeps the URL free of the line breaks a multi-line string literal would embed
    url <- paste0("http://stats.nba.com/stats/boxscoretraditionalv2?EndPeriod=10&",
                  "EndRange=14400&GameID=", games[i], "&RangeType=2&Season=2015-16&SeasonType=",
                  "Regular+Season&StartPeriod=1&StartRange=0000")
    json_data <- fromJSON(paste(readLines(url), collapse = ""))
    df <- data.frame(json_data$resultSets[1, "rowSet"])
    names(df) <- unlist(json_data$resultSets[1, "headers"])
    listofdfs[[i]] <- df # save the data frame into the list
  }
  return(listofdfs) # return the list of data frames
}
# 0021500580:0021500593 is numeric, so the leading zeros are lost; sprintf() restores them
gameids <- sprintf("%010d", 21500580:21500593)
stats <- getstats(games = gameids)
Please note that I could not test this because the URLs do not seem to be working properly. I get the connection error below:
Error in file(con, "r") : cannot open the connection
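Because the endpoint could not be reached, the list-accumulating pattern itself can still be checked offline. The following is a minimal sketch in which a hypothetical helper, `make_df()`, stands in for the download step; it is not the NBA request, only the same loop shape with made-up contents.

```r
# Hypothetical stand-in for the download step: returns a small data frame per ID
make_df <- function(id) {
  data.frame(game_id = id, points = seq_len(3), stringsAsFactors = FALSE)
}

collect_dfs <- function(ids) {
  out <- vector("list", length(ids))  # preallocate one slot per ID
  for (i in seq_along(ids)) {
    out[[i]] <- make_df(ids[i])       # store each data frame by position
  }
  names(out) <- ids                   # name each element after its ID
  out
}

ids <- sprintf("%010d", 21500580:21500582)  # zero-padded character IDs
dfs <- collect_dfs(ids)
names(dfs)
```

Naming the list elements after the IDs makes it possible to pull out a single game later with `dfs[["0021500580"]]`.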
Creating multiple dataframes in a loop in R
Since you didn't give us any numbers, it is difficult to say exactly what you need the for loop to look for, so you will need to sort that out yourself. Here is a basic example of what you could do. The important part that I think you are missing is that you need to use `assign` to send the created data frames to your global environment, or wherever else you want them to go. `paste0` is a handy way to give each one its own name. Note that some of the data frames will be empty. It may be worthwhile to add an `if` statement that skips `assign`ing the data frame when `nrow(data3) == 0`.
`Data <- data.frame(matrix(sample(1:10,80,replace = T), nrow = 20, ncol = 4))`
`names(Data) <- c("A","B","C","D")`
`X = c(1:10)`
`for(i in 1:length(X)){
data2 <- Data
data3 <- subset(data2, A == X[i])
assign(paste0("SubsetData",i), data3, envir = .GlobalEnv)
}`
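The suggestion about skipping empty data frames could look roughly like this. It is a sketch on the same kind of toy data; the seed is arbitrary and only there to make the run repeatable.

```r
set.seed(1)  # arbitrary seed so the toy data is reproducible
Data <- data.frame(matrix(sample(1:10, 80, replace = TRUE), nrow = 20, ncol = 4))
names(Data) <- c("A", "B", "C", "D")
X <- 1:10

for (i in seq_along(X)) {
  data3 <- subset(Data, A == X[i])
  if (nrow(data3) == 0) next  # skip values of X that match no rows
  assign(paste0("SubsetData", i), data3, envir = .GlobalEnv)
}

# Names of the data frames that were actually created
ls(pattern = "^SubsetData", envir = .GlobalEnv)
```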
for loop for creating multiple data frames and assigning values
The `assign()` function is made for this. See `?assign` for syntax.
a <- c(1,2,3,4)
b <- c("kk","km","ll","k3")
time <- c(2001,2001,2002,2003)
df <- data.frame(a,b,time)
myvalues <- c(2001,2002,2003)
for (i in 1:3) {
assign(paste0("y",i), df[df$time==myvalues[i],])
}
See here for more ways to achieve this.
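One such alternative, sketched here as an assumption about what might suit the question, is to keep the subsets in a named list instead of creating separate variables. The names `subsets` and `y1`, `y2`, `y3` are illustrative only.

```r
a <- c(1, 2, 3, 4)
b <- c("kk", "km", "ll", "k3")
time <- c(2001, 2001, 2002, 2003)
df <- data.frame(a, b, time)
myvalues <- c(2001, 2002, 2003)

# One subset per value, kept together in a named list
subsets <- lapply(myvalues, function(v) df[df$time == v, ])
names(subsets) <- paste0("y", seq_along(myvalues))

subsets$y1  # rows where time == 2001
```

A list like this can be passed to `lapply()` later, which is harder to do with variables created by `assign()`.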
How do I create multiple dataframes from a result in a for loop in R?
Don't do it in a loop! It is done completely differently. I'll show you step by step.
My first step is to prepare a function that will generate data similar to yours.
library(tidyverse)
dens = function(year, n) tibble(
PLOT = paste("HI", sample(1:(n/7), n, replace = T)),
SIZE = runif(n, 0.1, 3),
DENSITY = sample(seq(50,200, by=50), n, replace = T),
SEEDYR = year-1,
SAMPYR = year,
AGE = sample(1:5, n, replace = T),
SHOOTS = runif(n, 0.1, 3)
)
Let's see how it works and generate some sample data frames
set.seed(123)
density.2007 = dens(2007, 120)
density.2008 = dens(2008, 88)
density.2009 = dens(2009, 135)
density.2010 = dens(2010, 156)
The `density.2007` data frame looks like this
# A tibble: 120 x 7
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 HI 15 1.67 200 2006 2007 4 1.80
2 HI 14 0.270 150 2006 2007 2 2.44
3 HI 3 0.856 50 2006 2007 3 0.686
4 HI 10 1.25 200 2006 2007 5 1.43
5 HI 11 0.673 50 2006 2007 5 1.40
6 HI 5 2.51 150 2006 2007 3 2.23
7 HI 14 0.543 150 2006 2007 2 2.17
8 HI 5 2.43 200 2006 2007 5 2.51
9 HI 9 1.69 100 2006 2007 4 2.67
10 HI 3 2.02 50 2006 2007 2 2.86
# ... with 110 more rows
Now they need to be combined into one frame
df = density.2007 %>%
bind_rows(density.2008) %>%
bind_rows(density.2009) %>%
bind_rows(density.2010)
output
# A tibble: 499 x 7
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 HI 15 1.67 200 2006 2007 4 1.80
2 HI 14 0.270 150 2006 2007 2 2.44
3 HI 3 0.856 50 2006 2007 3 0.686
4 HI 10 1.25 200 2006 2007 5 1.43
5 HI 11 0.673 50 2006 2007 5 1.40
6 HI 5 2.51 150 2006 2007 3 2.23
7 HI 14 0.543 150 2006 2007 2 2.17
8 HI 5 2.43 200 2006 2007 5 2.51
9 HI 9 1.69 100 2006 2007 4 2.67
10 HI 3 2.02 50 2006 2007 2 2.86
# ... with 489 more rows
In the next step, count how many times each value of the `PLOT` variable occurs
PLOT.count = df %>%
group_by(PLOT) %>%
summarise(PLOT.n = n()) %>%
arrange(PLOT.n)
output
# A tibble: 22 x 2
PLOT PLOT.n
<chr> <int>
1 HI 20 3
2 HI 22 5
3 HI 21 7
4 HI 18 12
5 HI 2 19
6 HI 1 20
7 HI 15 20
8 HI 17 21
9 HI 6 22
10 HI 11 23
# ... with 12 more rows
In the penultimate step, let's append these counters to the original data frame
df = df %>% left_join(PLOT.count, by="PLOT")
output
# A tibble: 499 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 15 1.67 200 2006 2007 4 1.80 20
2 HI 14 0.270 150 2006 2007 2 2.44 32
3 HI 3 0.856 50 2006 2007 3 0.686 27
4 HI 10 1.25 200 2006 2007 5 1.43 25
5 HI 11 0.673 50 2006 2007 5 1.40 23
6 HI 5 2.51 150 2006 2007 3 2.23 38
7 HI 14 0.543 150 2006 2007 2 2.17 32
8 HI 5 2.43 200 2006 2007 5 2.51 38
9 HI 9 1.69 100 2006 2007 4 2.67 26
10 HI 3 2.02 50 2006 2007 2 2.86 27
# ... with 489 more rows
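The count-then-join steps can probably be collapsed into one call with dplyr's `add_count()`. This is a minimal sketch on a small made-up tibble; `toy` and `toy_counted` are illustrative names, not the data generated above.

```r
library(dplyr)

# Small made-up data set with a repeated PLOT label
toy <- tibble(
  PLOT = c("HI 1", "HI 2", "HI 1", "HI 3", "HI 1", "HI 2"),
  SIZE = c(1.2, 0.4, 2.1, 0.9, 1.7, 2.8)
)

# add_count() appends the per-group row count without a separate join
toy_counted <- toy %>% add_count(PLOT, name = "PLOT.n")

# Rows belonging to the most frequent PLOT value
toy_counted %>% filter(PLOT.n == max(PLOT.n))
```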
Now filter it at will
df %>% filter(PLOT.n > 30)
output
# A tibble: 139 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 14 0.270 150 2006 2007 2 2.44 32
2 HI 5 2.51 150 2006 2007 3 2.23 38
3 HI 14 0.543 150 2006 2007 2 2.17 32
4 HI 5 2.43 200 2006 2007 5 2.51 38
5 HI 8 0.598 50 2006 2007 1 1.70 34
6 HI 7 1.94 50 2006 2007 4 1.61 35
7 HI 14 2.91 50 2006 2007 4 0.215 32
8 HI 7 0.846 150 2006 2007 4 0.506 35
9 HI 7 2.38 150 2006 2007 3 1.34 35
10 HI 7 2.62 100 2006 2007 3 0.167 35
# ... with 129 more rows
Or this way
df %>% filter(PLOT.n == min(PLOT.n))
df %>% filter(PLOT.n == median(PLOT.n))
df %>% filter(PLOT.n == max(PLOT.n))
output
# A tibble: 3 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 20 0.392 200 2009 2010 1 0.512 3
2 HI 20 0.859 150 2009 2010 5 2.62 3
3 HI 20 0.882 200 2009 2010 5 1.06 3
> df %>% filter(PLOT.n == median(PLOT.n))
# A tibble: 26 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 9 1.69 100 2006 2007 4 2.67 26
2 HI 9 2.20 50 2006 2007 4 1.49 26
3 HI 9 0.587 200 2006 2007 3 1.13 26
4 HI 9 1.27 50 2006 2007 1 2.55 26
5 HI 9 1.56 150 2006 2007 3 2.01 26
6 HI 9 0.198 100 2006 2007 3 2.08 26
7 HI 9 2.72 150 2007 2008 3 0.421 26
8 HI 9 0.251 200 2007 2008 2 0.328 26
9 HI 9 1.83 50 2007 2008 1 0.192 26
10 HI 9 1.97 100 2007 2008 1 0.900 26
# ... with 16 more rows
> df %>% filter(PLOT.n == max(PLOT.n))
# A tibble: 38 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 5 2.51 150 2006 2007 3 2.23 38
2 HI 5 2.43 200 2006 2007 5 2.51 38
3 HI 5 2.06 100 2006 2007 5 1.93 38
4 HI 5 1.25 150 2006 2007 4 2.29 38
5 HI 5 2.29 200 2006 2007 1 2.97 38
6 HI 5 0.789 150 2006 2007 2 1.59 38
7 HI 5 1.11 100 2007 2008 4 2.61 38
8 HI 5 2.38 150 2007 2008 4 2.95 38
9 HI 5 2.67 200 2007 2008 3 1.77 38
10 HI 5 2.63 100 2007 2008 1 1.90 38
# ... with 28 more rows
Filter loop to create multiple data frames
df <- data.frame(topic = c("xxx", "xxx", "yyy", "yyy", "yyy", "zzz", "zzz"),
high = c(52L, 27L, 89L, 99L, 43L, 21L, 90L),
low = c(56L, 98L, 101L, 21L, 98L, 40L, 43L),
stringsAsFactors = FALSE)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
for (variable in unique(df$topic)) {
assign(variable, df %>% filter (topic == variable), envir = .GlobalEnv)
}
xxx
#> topic high low
#> 1 xxx 52 56
#> 2 xxx 27 98
yyy
#> topic high low
#> 1 yyy 89 101
#> 2 yyy 99 21
#> 3 yyy 43 98
zzz
#> topic high low
#> 1 zzz 21 40
#> 2 zzz 90 43
Created on 2019-02-13 by the reprex package (v0.2.1)
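If separate global variables are not strictly required, base R's `split()` gives a similar result as a named list. A short sketch on the same data; `by_topic` is an illustrative name.

```r
df <- data.frame(topic = c("xxx", "xxx", "yyy", "yyy", "yyy", "zzz", "zzz"),
                 high = c(52L, 27L, 89L, 99L, 43L, 21L, 90L),
                 low = c(56L, 98L, 101L, 21L, 98L, 40L, 43L),
                 stringsAsFactors = FALSE)

# Named list with one data frame per distinct topic
by_topic <- split(df, df$topic)

names(by_topic)
by_topic$yyy
```

This avoids overwriting any existing object that happens to share a name with a topic value, which `assign()` would do silently.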
R - For loop with dplyr for subsetting data across multiple data frames
This should work if you have a list of dataframes:
library(purrr)
library(dplyr)
All_Stations_30_6_17 <- map_dfr(onomata_list, ~ filter(.x, date == "2017-06-30" & time == "17:00"))
Map functions generally iterate through list elements. Here `map_dfr` will map the purrr-style function (denoted with the `~`) over the list elements of `onomata_list` and then row-bind the results (hence the `_dfr`) into a single data frame.
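For readers without purrr, roughly the same filter-then-stack idea can be sketched in base R. The station data below is made up purely for illustration; only the names `onomata_list` and `All_Stations_30_6_17` come from the answer.

```r
# Toy stand-in for onomata_list; the station data here is made up
onomata_list <- list(
  station_a = data.frame(date = c("2017-06-30", "2017-06-30"),
                         time = c("16:00", "17:00"),
                         temp = c(31.2, 30.8)),
  station_b = data.frame(date = c("2017-06-29", "2017-06-30"),
                         time = c("17:00", "17:00"),
                         temp = c(28.4, 29.9))
)

# Filter each element, then stack the pieces into one data frame
filtered <- lapply(onomata_list, function(d) {
  d[d$date == "2017-06-30" & d$time == "17:00", ]
})
All_Stations_30_6_17 <- do.call(rbind, filtered)
All_Stations_30_6_17
```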
How to create a column on multiple dataframes R
Try this:
# three toy dataframe
df1 <- data.frame(x = c(1:10))
df2 <- data.frame(x = c(11:20))
df3 <- data.frame(x = c(21:30))
# make a list
dfl <- list(df1, df2, df3)
# add "castigoex" to each df
for (i in 1:length(dfl)) {
dfl[[i]]$castigoex <- runif(10, 0, 1)
}
# I advise you to keep the dataframes in the list,
# but if you want to split them again
list2env(setNames(dfl, paste0("df", seq_along(dfl))), envir = parent.frame())
If you want to avoid the `for` loop, you can use `mapply`
new_list <- mapply(function(x) "[<-"(x, "castigoex", value = runif(10, 0, 1)),
dfl, SIMPLIFY = FALSE)
# now you have old list and new list. To split the list (with new name for df)
list2env(setNames(new_list,paste0("df_new", seq_along(new_list))),
envir = parent.frame())
If you want to add more than one column, one possible solution is to use the `lapply` function
new_list2 <- lapply(dfl, function(x) cbind(x, castigoex = runif(10, 0, 1),
variable2 = runif(10, 0, 1)))
list2env(setNames(new_list2, paste0("df", seq_along(dfl))), envir = parent.frame())
Loop over multiple data frames with mathematical function
Actually you could do that with `by`.
fun <- function(x) cbind(x, newcol1=x[, 2] + x[, 5]*cos(x[, 4]), newcol2=x[, 3] - x[, 5]*sin(x[, 4]))
by(df, df$ofs, fun)
# df$ofs: -50
# Dist X Y deg ofs Z newcol1 newcol2
# 2 20.23 482.3 3578 4.77 -50 19.731 479.421 3528.083
# ---------------------------------------------------------------------------------------------
# df$ofs: -25
# Dist X Y deg ofs Z newcol1 newcol2
# 3 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# ---------------------------------------------------------------------------------------------
# df$ofs: 0
# Dist X Y deg ofs Z newcol1 newcol2
# 1 20.21 499.3 3577 4.77 0 19.75 499.3 3577
# 4 20.23 480.3 3578 4.77 0 19.75 480.3 3578
# ---------------------------------------------------------------------------------------------
# df$ofs: 25
# Dist X Y deg ofs Z newcol1 newcol2
# 5 20.23 479.3 3578 4.77 25 19.749 480.7395 3602.959
# ---------------------------------------------------------------------------------------------
# df$ofs: 50
# Dist X Y deg ofs Z newcol1 newcol2
# 6 20.24 478.3 3578 4.77 50 19.74 481.179 3627.917
If you plan to reassemble it:
do.call(rbind, by(df, df$ofs, fun))
# Dist X Y deg ofs Z newcol1 newcol2
# -50 20.23 482.3 3578 4.77 -50 19.731 479.4210 3528.083
# -25 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# 0.1 20.21 499.3 3577 4.77 0 19.750 499.3000 3577.000
# 0.4 20.23 480.3 3578 4.77 0 19.750 480.3000 3578.000
# 25 20.23 479.3 3578 4.77 25 19.749 480.7395 3602.959
# 50 20.24 478.3 3578 4.77 50 19.740 481.1790 3627.917
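Since `fun` only does row-wise arithmetic, the grouping may not be needed at all when the goal is just the two new columns. Below is a sketch using `transform()`; the input data frame is rebuilt from the printed output above, so its values should be treated as an approximation of the original data.

```r
# Input rebuilt from the printed output above; treat the values as approximate
df <- data.frame(
  Dist = c(20.21, 20.23, 20.23, 20.23, 20.23, 20.24),
  X    = c(499.3, 482.3, 481.3, 480.3, 479.3, 478.3),
  Y    = c(3577, 3578, 3578, 3578, 3578, 3578),
  deg  = 4.77,
  ofs  = c(0, -50, -25, 0, 25, 50),
  Z    = c(19.75, 19.731, 19.741, 19.75, 19.749, 19.74)
)

# Same formulas as fun(), written with column names instead of positions
res <- transform(df,
                 newcol1 = X + ofs * cos(deg),
                 newcol2 = Y - ofs * sin(deg))
res
```

Unlike the `by` version, this keeps the rows in their original order.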