Loop through data frame and variable names
To further add to Beasterfield's answer, it seems like you want to do some number of complex operations on each of the data frames.
It is possible to have complex functions within an apply statement. So where you now have:
for (i in dflist) {
# Do some complex things
}
This can be translated to:
lapply(dflist, function(df) {
# Do some complex operations on each data frame, df
# More steps
# Make sure the last thing is NULL. The last statement within the function will be
# returned to lapply, which will try to combine these as a list across all data frames.
# You don't actually care about this, you just want to run the function.
NULL
})
A more concrete example using plot:
# Assuming each data frame has columns x and y holding our points,
lapply(dflist, function(df) {
  x2 <- df$x^2
  log_y <- log(df$y)
  plot(x2, log_y)  # plot the transformed values computed above
  NULL
})
You can also write complex functions which take multiple arguments:
lapply(dflist, function(df, arg1, arg2) {
# Do something on each data.frame, df
# arg1 == 1, arg2 == 2 (see next line)
}, 1, 2) # extra arguments are passed in here
Hope this helps you out!
Loop through dataframe column names - R
To answer the exact question and fix the code given, see the example below
df <- iris # data
for (i in colnames(df)){
print(class(df[[i]]))
}
# [1] "numeric"
# [1] "numeric"
# [1] "numeric"
# [1] "numeric"
# [1] "factor"
- You need to use colnames to get the column names of df.
- You access each column using df[[i]] if you want to know its class. df[i] is of class data.frame.
Make a dataframe column with names of variables in the for loop
You can leverage the locals() function, which acts like a dict holding the local variables:
import pandas as pd

abc = [1, 2, 3, 4]
def1 = [5, 6, 7, 8] # rename 'def' to 'def1' since 'def' is a Python keyword
df = pd.DataFrame()
for i in ['abc', 'def1']:
    df[i] = locals()[i]  # look up each list by its variable name
Here, locals()['abc'] resolves to the value of the variable abc, which holds [1, 2, 3, 4]. Thus, it is effectively the same as running the following code for the first iteration of the for loop:
df['abc'] = [1, 2, 3, 4]
Result:
print(df)
   abc  def1
0    1     5
1    2     6
2    3     7
3    4     8
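One thing to keep in mind is that locals() only reflects names defined in the scope where it is called. A minimal sketch of that, assuming the lists live inside a function (build_columns is a hypothetical helper, not part of the original answer), which collects them into an ordinary dict keyed by variable name:

```python
def build_columns():
    """Collect named local lists into a plain dict keyed by variable name."""
    abc = [1, 2, 3, 4]
    def1 = [5, 6, 7, 8]
    # Capture the local namespace once, before the comprehension below,
    # because a comprehension may have its own scope on older Python versions.
    scope = locals()
    wanted = ['abc', 'def1']
    return {name: scope[name] for name in wanted}


columns = build_columns()
print(columns)
```

A dict shaped like this could then be handed to pd.DataFrame in one step instead of assigning the columns one at a time.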
Use a variable in a loop to name a data frame in R
You're overwriting write_name with each evaluation of the loop, so you're probably getting the contents of only the last lake name (Tikitapu). The lazy evaluation used in for loops probably isn't helping either. One way to fix this is to convert the for loop to an lapply:
library(tidyverse)
listOfTibbles <- lapply(
1:8,
function(i) {
name <- Lake_names[i]
read_file <- gsub(" ", "", paste("Data\\", name, "_daily_levels.csv"))
write_name <- gsub(" ", "", paste(name, "_daily_levels"))
# Note the edit to the next line: you need a return value.
as_tibble(read.csv(read_file)) # read file with lake elevations
}
)
lapply returns a list: each element is the result of applying the function given as its second argument to the corresponding element of the first argument. So you should get a list of eight tibbles from this code. To combine the list of tibbles into one tibble, you can use:
oneBigTibble <- listOfTibbles %>% bind_rows()
This is untested code as I don't have access to your CSV files.
Run a loop to generate variable names in Python
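Use the built-in glob module:
import pandas as pd
from glob import glob

# Raw string, so the backslashes in the Windows path are not treated as escapes
fullpath = r'C:\Users\siddhn\Desktop\phone[1-6].csv'
dfs = [pd.read_csv(file) for file in glob(fullpath)]
print(dfs[0])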
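The [1-6] in the pattern is a glob character class that matches a single character from 1 to 6. A self-contained sketch of that matching, assuming a temporary directory with made-up file contents in place of the Desktop folder, and the standard csv module in place of pd.read_csv so that it runs without pandas:

```python
import csv
import os
import tempfile
from glob import escape, glob

with tempfile.TemporaryDirectory() as tmpdir:
    # Create phone1.csv ... phone7.csv with made-up contents.
    for n in range(1, 8):
        with open(os.path.join(tmpdir, f'phone{n}.csv'), 'w', newline='') as fh:
            writer = csv.writer(fh)
            writer.writerow(['model', 'price'])
            writer.writerow([f'phone{n}', n * 100])

    # Escape the directory part so only the file name is treated as a pattern.
    pattern = os.path.join(escape(tmpdir), 'phone[1-6].csv')
    # glob returns matches in arbitrary order, so sort for a stable result.
    matched = sorted(glob(pattern))

    # Read each matched file into a list of row dicts.
    tables = []
    for path in matched:
        with open(path, newline='') as fh:
            tables.append(list(csv.DictReader(fh)))

names = [os.path.basename(p) for p in matched]
print(names)  # phone7.csv is not matched by [1-6]
```

Sorting the matches is a design choice here: it keeps the order of the resulting list predictable, which matters if you later refer to entries by position, as dfs[0] does.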
How to iterate over columns of pandas dataframe to run regression
# Iterating over a DataFrame yields its column labels
for column in df:
    print(df[column])
How to use a for loop to extract columns from a data frame
If you must do it with a for loop, you could work off this:
new <- list() # construct as list -- data.frames are fancy lists
cols <- c(1, 5, 3) # use a vector of column indices
for (i in seq_along(cols)) {
# append the list at each column
new[[i]] <- mtcars[, cols[i], drop = FALSE]
}
new <- as.data.frame(new) # make list into data.frame
identical(new, mtcars[, cols]) # check that this produces the same thing
#> [1] TRUE
head(new)
#> mpg drat disp
#> Mazda RX4 21.0 3.90 160
#> Mazda RX4 Wag 21.0 3.90 160
#> Datsun 710 22.8 3.85 108
#> Hornet 4 Drive 21.4 3.08 258
#> Hornet Sportabout 18.7 3.15 360
#> Valiant 18.1 2.76 225
str(new)
#> 'data.frame': 32 obs. of 3 variables:
#> $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#> $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#> $ disp: num 160 160 108 258 360 ...
Created on 2022-05-20 by the reprex package (v2.0.1)
Edits
With more information, the below should work. However, the for loops don't seem necessary, and the apply family of functions seems good enough. Hopefully, if a for loop is necessary for your process, the combination of these will be enough to get you what you need.
data <- Reduce(
cbind,
lapply(
1:20,
function(i) {
out <- data.frame(
id = order(runif(5)),
event = runif(5) < .5,
other_col = runif(5)
)
colnames(out) <- paste0(colnames(out), i)
out
}
)
)
# just a quick peek
str(data[, c(1:3, 9:12, 21:24)])
#> 'data.frame': 5 obs. of 11 variables:
#> $ id1 : int 3 2 1 4 5
#> $ event1 : logi FALSE FALSE TRUE TRUE FALSE
#> $ other_col1: num 0.617 0.951 0.511 0.185 0.667
#> $ other_col3: num 0.6856 0.0524 0.5786 0.9265 0.2291
#> $ id4 : int 4 2 1 5 3
#> $ event4 : logi TRUE TRUE FALSE FALSE FALSE
#> $ other_col4: num 0.0849 0.8345 0.8465 0.1958 0.2534
#> $ other_col7: num 0.656 0.353 0.604 0.973 0.381
#> $ id8 : int 2 3 5 4 1
#> $ event8 : logi TRUE FALSE FALSE TRUE TRUE
#> $ other_col8: num 0.646 0.693 0.534 0.624 0.625
result <- lapply(1:20, function(i) {
# make pattern (must have letters before number)
pattern <- paste0("[a-z]", i, "$")
# find the column indices that match the pattern
ind <- grep(pattern, colnames(data))
# extract those columns, keeping the result a data frame
res <- data[, ind, drop = FALSE]
# optional: rename columns
colnames(res) <- sub(paste0(i, "$"), "", colnames(res))
res
})
head(result)
#> [[1]]
#> id event other_col
#> 1 3 FALSE 0.6174577
#> 2 2 FALSE 0.9509916
#> 3 1 TRUE 0.5107370
#> 4 4 TRUE 0.1851543
#> 5 5 FALSE 0.6670226
#>
#> [[2]]
#> id event other_col
#> 1 3 TRUE 0.8261719
#> 2 4 FALSE 0.4171351
#> 3 1 TRUE 0.5640345
#> 4 5 TRUE 0.6825371
#> 5 2 FALSE 0.4381013
#>
#> [[3]]
#> id event other_col
#> 1 4 FALSE 0.68559712
#> 2 3 FALSE 0.05241906
#> 3 2 FALSE 0.57857342
#> 4 1 TRUE 0.92649458
#> 5 5 TRUE 0.22908630
#>
#> [[4]]
#> id event other_col
#> 1 4 TRUE 0.08491369
#> 2 2 TRUE 0.83452439
#> 3 1 FALSE 0.84650621
#> 4 5 FALSE 0.19578470
#> 5 3 FALSE 0.25342999
#>
#> [[5]]
#> id event other_col
#> 1 4 FALSE 0.8912857
#> 2 1 FALSE 0.1261470
#> 3 3 FALSE 0.7962369
#> 4 5 TRUE 0.3911494
#> 5 2 FALSE 0.6041862
#>
#> [[6]]
#> id event other_col
#> 1 4 TRUE 0.8987728
#> 2 2 TRUE 0.2830371
#> 3 5 FALSE 0.6696249
#> 4 3 FALSE 0.6249742
#> 5 1 FALSE 0.4754757
Created on 2022-05-22 by the reprex package (v2.0.1)
Loop through variable names in R
Let's pretend that df is your data and the first 15 columns are answers. In this case you can use this:
lapply(df[,1:15], function(x) {chisq.test(x, df$Sex)})