Populating a data frame in R in a loop
You could do it like this:
iterations = 10
variables = 2
output <- matrix(ncol=variables, nrow=iterations)
for (i in seq_len(iterations)) {
  output[i, ] <- runif(variables)  # one random number per column
}
output
and then turn it into a data.frame
output <- data.frame(output)
class(output)
What this does:
- creates a matrix with as many rows and columns as the final result needs
- inserts 2 random numbers into one row of the matrix on each iteration
- converts the matrix into a data frame after the loop has finished.
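The same steps can be written with a fixed seed and named columns, which makes the result easier to inspect. This is only a sketch: the seed and the column names x and y are assumptions added for illustration.

```r
set.seed(42)                       # assumption: fixed seed so the run is repeatable
iterations <- 10
variables <- 2

# Preallocate the full matrix; NA marks cells that have not been filled yet
output <- matrix(NA_real_, nrow = iterations, ncol = variables,
                 dimnames = list(NULL, c("x", "y")))  # hypothetical column names

for (i in seq_len(iterations)) {
  output[i, ] <- runif(variables)  # fill one preallocated row per iteration
}

output <- as.data.frame(output)    # convert once, after the loop
str(output)
```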
Loop to dynamically fill dataframe R
Dynamically filling an object using a for loop is fine. What causes problems is dynamically building an object in a for loop (e.g. adding columns with cbind or rows with rbind).
When you build something dynamically, R has to go and request new memory for the object in each loop, because it keeps increasing in size. This causes a for loop to slow down with every iteration as the object gets bigger.
When you create the object beforehand (e.g. a data.frame with the right number of rows and columns) and fill it in by index, the for loop doesn't have this problem.
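A rough way to see the difference is to time both approaches on the same task. The sketch below is illustrative only: the function names grow and prealloc, the columns id and value, and the size n are assumptions, and the timings will vary by machine.

```r
n <- 2000  # assumption: small enough to run quickly

# Growing: rbind() builds a new, larger data frame on every iteration
grow <- function(n) {
  out <- data.frame()
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(id = i, value = i^2))
  }
  out
}

# Preallocating: create the full-size data frame once, then fill by index
prealloc <- function(n) {
  out <- data.frame(id = integer(n), value = numeric(n))
  for (i in seq_len(n)) {
    out$id[i] <- i
    out$value[i] <- i^2
  }
  out
}

t_grow <- system.time(res_grow <- grow(n))
t_pre <- system.time(res_pre <- prealloc(n))

# Elapsed seconds for each approach
print(c(grow = t_grow[["elapsed"]], prealloc = t_pre[["elapsed"]]))
```

Both functions return the same values; only the way memory is requested differs.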
One final thing to keep in mind is that a data.frame stores each column as a separate vector in memory (and a matrix is stored column by column), so it's usually more efficient to fill these in one column at a time.
With all that in mind we can revise your code as follows:
results <- data.frame(matrix(NA, nrow = 10, ncol = 10))  # preallocate 10 x 10
for (colIdx in seq_len(ncol(results))) {      # outer loop over columns
  for (rowIdx in seq_len(nrow(results))) {    # inner loop fills one column
    results[rowIdx, colIdx] <- 5 # or whatever value you want here
  }
}
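When the values for a whole column can be computed at once, the inner loop over rows can be dropped and each column assigned as a vector. A minimal sketch, assuming two hypothetical columns a and b and the same placeholder value 5:

```r
n_rows <- 5
results <- data.frame(a = numeric(n_rows), b = numeric(n_rows))  # hypothetical columns

# Assign each column in one vectorised step instead of cell by cell
for (col in names(results)) {
  results[[col]] <- rep(5, n_rows)  # or any vector of length n_rows
}
results
```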
Loop through data frame and match/populate rows with column values
library(dplyr)
df1 <- data.frame(
MON = c(1,2,3),
TUE = c(5,6,7),
WED = c(8,9,10),
THU = c(11,12,13),
FRI = c(14,15,16),
SAT = c(17,18,19),
SUN = c(20,21,22))
df2 <- data.frame(
Day = c('THU', 'FRI', 'SAT', 'SUN', 'MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN', 'MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN'),
Hours = 0
)
Example df1: (sorry, I didn't take the time to recreate your exact data, please follow along)
MON TUE WED THU FRI SAT SUN
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 5 8 11 14 17 20
2 2 6 9 12 15 18 21
3 3 7 10 13 16 19 22
Example df2:
Day Hours
<chr> <dbl>
1 THU 0
2 FRI 0
3 SAT 0
4 SUN 0
5 MON 0
6 TUE 0
7 WED 0
8 THU 0
9 FRI 0
10 SAT 0
11 SUN 0
12 MON 0
13 TUE 0
14 WED 0
15 THU 0
16 FRI 0
17 SAT 0
18 SUN 0
Step 1: This should be the algorithm you're looking for to copy the values from df1 into df2 in the way you described.
row_df2 <- 1
for (row_df1 in seq(1, nrow(df1))) {
  for (day in c('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN')) {
    if (df2[row_df2, 'Day'] == day) {
      df2[row_df2, 'Hours'] <- df1[row_df1, day]
      row_df2 <- row_df2 + 1
    }
  }
}
Step 2: Now you could sum up the values in df1, e.g. using dplyr:
df1 <- df1 %>%
mutate(
Sum = MON + TUE + WED + THU + FRI + SAT + SUN
)
df1:
# A tibble: 3 x 8
MON TUE WED THU FRI SAT SUN Sum
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 5 8 11 14 17 20 76
2 2 6 9 12 15 18 21 83
3 3 7 10 13 16 19 22 90
df2:
# A tibble: 18 x 2
Day Hours
<chr> <dbl>
1 THU 11 <- row 1: THU
2 FRI 14 <- row 1: FRI
3 SAT 17 <- ...
4 SUN 20
5 MON 2 <- row 2: MON
6 TUE 6 <- ....
7 WED 9
8 THU 12
9 FRI 15
10 SAT 18
11 SUN 21 <- row 2: SUN
12 MON 3 <- row 3: MON
13 TUE 7
14 WED 10
15 THU 13
16 FRI 16
17 SAT 19
18 SUN 22
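The loop in Step 1 can also be expressed without explicit iteration by indexing df1 with (row, column) pairs. This is a sketch rather than a drop-in replacement: week_idx and day_idx are names introduced here, and it assumes, like the loop, that a new row of df1 starts at every MON.

```r
df1 <- data.frame(MON = c(1, 2, 3), TUE = c(5, 6, 7), WED = c(8, 9, 10),
                  THU = c(11, 12, 13), FRI = c(14, 15, 16),
                  SAT = c(17, 18, 19), SUN = c(20, 21, 22))
df2 <- data.frame(
  Day = c("THU", "FRI", "SAT", "SUN",
          rep(c("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"), 2)),
  Hours = 0
)

# Row of df1 for each row of df2: advance by one at every MON.
# If df2 does not start on a MON, the leading partial week uses row 1.
week_idx <- cumsum(df2$Day == "MON") + as.integer(df2$Day[1] != "MON")

# Column of df1 for each row of df2
day_idx <- match(df2$Day, names(df1))

# A two-column index matrix picks one cell per (row, column) pair
df2$Hours <- as.matrix(df1)[cbind(week_idx, day_idx)]
df2
```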
Is there no identifier like Date in both tables? This would make it much more robust. You could then match by date without relying on the right day to start with.
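If such a Date column existed, the lookup could be done with match(). The tables hours_by_date and calendar below, and their dates, are hypothetical and only illustrate the idea:

```r
# Hypothetical tables: both carry a Date column
hours_by_date <- data.frame(
  Date = as.Date("2021-01-04") + 0:6,   # assumed dates, Monday to Sunday
  Hours = c(1, 5, 8, 11, 14, 17, 20)
)
calendar <- data.frame(
  Date = as.Date(c("2021-01-07", "2021-01-08", "2021-01-04")),
  Hours = 0
)

# match() gives, for each calendar date, its position in hours_by_date
pos <- match(calendar$Date, hours_by_date$Date)
calendar$Hours <- hours_by_date$Hours[pos]  # NA where a date has no match
calendar
```

Because rows are paired by date, the order of the rows and the weekday of the first row no longer matter.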
Edit 1: Updated after testing and removal of some errors.
Edit 2: Highlighted which value from df1 will land in df2. I just used different example data than you (I didn't want to type it all in).
Edit 3: Used data.frame instead of tibble in the example data to demonstrate that it works there as well.
Edit 4: Is this what you want?
# 'week' is the vector of day names assigned in Edit 6
row_df1 <- 1
row_df2 <- 1
for (row_df2 in seq(1, nrow(df2))) {
  for (day in week) {
    if (df2[row_df2, 'Day'] == day) {
      df2[row_df2, 'Hours'] <- df1[row_df1, day]
      row_df2 <- row_df2 + 1
    }
  }
}
df2 will then be:
Day Hours
1 THU 11 <- row 1: THU
2 FRI 14
3 SAT 17
4 SUN 20
5 MON 1
6 TUE 5
7 WED 8
8 THU 11 <- row 1: THU
9 FRI 14
10 SAT 17
11 SUN 20
12 MON 1
13 TUE 5
14 WED 8
15 THU 11 <- row 1: THU
16 FRI 14
17 SAT 17
18 SUN 20
Edit 5: It seems there is a { missing:
for (row_df2 in seq(1, nrow(Calendar$Jan))) {
  for (day in week) { # <- HERE
    if (Calendar$Jan[row_df2, 'Day'] == day) {
      Calendar$Jan[row_df2, 'Hours'] <- Calctable[row_df1, day]
      row_df2 <- row_df2 + 1
    }
  }
}
Edit 6: In Edit 5 I assigned week <- c('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN') but forgot to mention it. It should have looked like this (no special built-in variables here):
week <- c('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN')
for (row_df2 in seq(1,nrow(Calendar$Jan))) {
for (day in week) {
if (Calendar$Jan[row_df2, 'Day'] == day) {
Calendar$Jan[row_df2,'Hours'] <- Calctable[row_df1,day]
row_df2 <- row_df2 + 1
}
}
}
Keep this in mind in case you re-use week at some other point in your code. I used it for testing the loop and mixed it up in the previous version of this answer.
Using an if else loop to populate a dataframe in R
A solution using loops and a lookup list.
First store the cut breaks and labels for each code in a list.
tmp <- list(
  "21" = list(
    "brk" = c(0, 0.01, 0.0375, 0.0725, 0.1, 1),
    "lab" = 0:4
  ),
  "24" = list(
    "brk" = c(0, 0.01, 0.0375, 0.0725, 0.1, 1),
    "lab" = 4:0
  )
)
Then loop over the columns of interest and for each code apply the cut function.
for (cc in c("oP.Res", "TP.Res")) {
  Merged[paste0(cc, "_cut")] <- NA
  for (ctg in unique(Merged$MEAS_ANAL_METH_CODE)) {
    Merged[Merged$MEAS_ANAL_METH_CODE == ctg, paste0(cc, "_cut")] <-
      as.character(
        cut(
          Merged[Merged$MEAS_ANAL_METH_CODE == ctg, cc],
          tmp[[as.character(ctg)]][["brk"]],
          tmp[[as.character(ctg)]][["lab"]]
        )
      )
  }
}
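To try the lookup on something small, here is a self-contained sketch for a single column. The four-row Merged data frame is made up for illustration, and the helper names rows and spec are introduced here; only the breaks and labels come from the list above.

```r
# Hypothetical data standing in for the Merged data frame from the question
Merged <- data.frame(
  MEAS_ANAL_METH_CODE = c(21, 21, 24, 24),
  oP.Res = c(0.005, 0.05, 0.005, 0.5)
)

# Lookup list: breaks and labels per method code
tmp <- list(
  "21" = list(brk = c(0, 0.01, 0.0375, 0.0725, 0.1, 1), lab = 0:4),
  "24" = list(brk = c(0, 0.01, 0.0375, 0.0725, 0.1, 1), lab = 4:0)
)

Merged$oP.Res_cut <- NA_character_
for (ctg in unique(Merged$MEAS_ANAL_METH_CODE)) {
  rows <- Merged$MEAS_ANAL_METH_CODE == ctg   # rows that use this code
  spec <- tmp[[as.character(ctg)]]            # breaks and labels for this code
  Merged$oP.Res_cut[rows] <- as.character(
    cut(Merged$oP.Res[rows], breaks = spec$brk, labels = spec$lab)
  )
}
Merged
```

The same value 0.005 gets label "0" under code 21 and "4" under code 24, because the label order is reversed for code 24.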
for loop to populate dataframe
Since you are already using dplyr, it is easy to also use purrr to merge the data.frames for you:
library(purrr)
map_df(start.year:end.year, function(year) {
  # one-row data frame per year; map_df() row-binds them
  df %>%
    filter(Level == "Grad" & EntryYear <= year & ExitYear >= year) %>%
    distinct(RA) %>%
    summarise(year = n())
})
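For readers without purrr, the same idea (build one small data frame per year, then bind them) can be sketched in base R. The example df, its values, and the output column names year and n are assumptions made for illustration; this is a swapped-in base-R approach, not the purrr call above.

```r
# Hypothetical data with the columns used in the filter above
df <- data.frame(
  RA = c(1, 2, 2, 3),
  Level = c("Grad", "Grad", "Grad", "Undergrad"),
  EntryYear = c(2010, 2011, 2011, 2010),
  ExitYear = c(2012, 2013, 2013, 2012)
)
start.year <- 2010
end.year <- 2013

# One single-row data frame per year
per_year <- lapply(start.year:end.year, function(year) {
  active <- df$Level == "Grad" & df$EntryYear <= year & df$ExitYear >= year
  data.frame(year = year, n = length(unique(df$RA[active])))
})

# Bind all rows once, after the loop, instead of growing inside it
counts <- do.call(rbind, per_year)
counts
```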
Writing a for loop with the output as a data frame in R
As this is a learning question I will not provide the solution directly.
> values <- c(-10,0,10,100)
> for (i in seq_along(values)) {print(i)} # Checking we iterate by position
[1] 1
[1] 2
[1] 3
[1] 4
> output <- vector("double", 10)
> output # Checking the place where the output will be
[1] 0 0 0 0 0 0 0 0 0 0
> for (i in seq_along(values)) { # Testing the full code
+ output[[i]] <- rnorm(10, mean = values[[i]])
+ }
Error in output[[i]] <- rnorm(10, mean = values[[i]]) :
more elements supplied than there are to replace
As you can see, the error says there are more elements to store than there is space for: each iteration generates 10 random numbers (40 in total), but output[[i]] is a single slot of a double vector and can hold only one number. Consider using a data structure that can store several values for each iteration.
So that:
> output <- ??
> for (i in seq_along(values)) { # Testing the full code
+ output[[i]] <- rnorm(10, mean = values[[i]])
+ }
> output # Should have length 4, with each element holding the 10 values created in the loop