How to change column names while in a loop in R?
This is the approach I would take. The following script not only changes the column names but also creates three data frames in the global environment, much like your original script.
for (i in 1:3) {
  noms <- c("n1", "n2", "n3") # column names, in the order the columns appear in the data frame
  df_ <- data.frame(matrix("", nrow = 3, ncol = 3)) # create the data frame
  df_nom <- paste("mydf", i, sep = "") # build the data frame's name: mydf1, mydf2, mydf3
  colnames(df_) <- noms # assign the names to the columns
  assign(df_nom, df_) # store the data frame under that name
}
Loop for renaming columns in R
#sample data
set.seed(1)
df <- data.frame(id = 1:4, replicate(5, sample(0:1, 4, replace = TRUE)))
# define the vector of prefixes ("varname")
varname <- c('OR', 'FA')
# define how many times each prefix repeats
n <- c(2, 3) # let's say that 'OR' repeats 2 times and 'FA' 3 times
# replace the column names
names(df)[2:ncol(df)] <- unlist(mapply(function(x, y) paste(x, seq(1, y), sep = "_"), varname, n))
Output is:
  id OR_1 OR_2 FA_1 FA_2 FA_3
1  1    0    0    1    1    1
2  2    0    1    0    0    1
3  3    1    1    0    1    0
4  4    1    1    0    0    1
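The same names can also be built without mapply. This is a minimal sketch, assuming the same varname and n as above; new_names is a name introduced here only for illustration. rep() repeats each prefix the required number of times and sequence() restarts the counter for each group.

```r
varname <- c("OR", "FA")
n <- c(2, 3)
# rep() repeats each prefix n times; sequence() yields 1..n for each group
new_names <- paste(rep(varname, n), sequence(n), sep = "_")
new_names
# [1] "OR_1" "OR_2" "FA_1" "FA_2" "FA_3"
```

The result can be assigned with names(df)[2:ncol(df)] <- new_names, as long as its length matches the number of columns being renamed.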
Changing Column names of multiple data frames using a for loop with data frames loaded into a List
L <- lapply(L, function(x) {
  colnames(x) <- c("NewName1", "NewName2")
  x
})
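The snippet above assumes that a list L already exists. Here is a self-contained sketch, with two made-up two-column data frames standing in for the real data:

```r
# Hypothetical input: a list of data frames that each have two columns
L <- list(
  a = data.frame(x = 1:2, y = 3:4),
  b = data.frame(p = 5:6, q = 7:8)
)
L <- lapply(L, function(x) {
  colnames(x) <- c("NewName1", "NewName2")
  x  # return the modified data frame, not the value of the assignment
})
names(L$a)
# [1] "NewName1" "NewName2"
```

The trailing x matters: without it the function would return the value of the assignment, which is the character vector of names rather than the data frame.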
Loop over a list of dataframes and change column names in R
dflist <- list(df, df2)
for (i in seq_along(dflist)) {
  if (any(names(dflist[[i]]) == remNames)) {
    colnames(dflist[[i]]) <- dflist[[i]][1, ]
    dflist[[i]] <- dflist[[i]][-1, ]
  }
}
The expression dflist[[i]][names(dflist[[i]])] == remNames checks the entire data frame, so the if condition evaluates to FALSE and nothing happens. Consider the following example with i = 2:
> i=2
> dflist[[i]][names(dflist[[i]])] == remNames
X2 X..X1
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
A better solution is to use grepl to see if the column names contain a .. or an X, so the if becomes:
if(any(grepl('\\.\\.|X',names(dflist[[i]])))){...}
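To see what that condition does, here is a small sketch on two hypothetical sets of column names. The first pair mimics the mangled names in the console output above; the second pair is invented for contrast.

```r
# Hypothetical column names: bad_names mimics the mangled names shown above
bad_names  <- c("X2", "X..X1")
good_names <- c("id", "value")
pattern <- "\\.\\.|X"  # two literal dots, or a capital X

any(grepl(pattern, bad_names))   # TRUE
any(grepl(pattern, good_names))  # FALSE
```

Note that the pattern matches any capital X, so a legitimate column name containing an X would also trigger the renaming.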
For a data.frame, how to loop over the column names first and then over the row names of each column?
We loop over the columns of the dataset with sapply, get the index of the non-NA elements, use that to subset the row.names, paste the elements together by collapsing with +, and then paste the column names of 'df' with the output of sapply:
paste(names(df), sapply(df, function(x)
paste(row.names(df)[which(!is.na(x))], collapse="+")), sep="=")
#[1] "cola=1+2+4" "colb=1+4" "colc=2+3"
Or with which/arr.ind
i1 <- which(!is.na(df), arr.ind = TRUE)
paste(names(df), tapply(row.names(df)[i1[,1]], i1[,2],
FUN = paste, collapse="+"), sep="=")
#[1] "cola=1+2+4" "colb=1+4" "colc=2+3"
Or with imap
library(purrr)
library(stringr)
unname(imap_chr(df, ~ str_c(.y, "=",
str_c(row.names(df)[!is.na(.x)], collapse='+'))))
#[1] "cola=1+2+4" "colb=1+4" "colc=2+3"
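The snippets above assume a data frame df that contains NA values, but the data itself is not shown. The following sketch uses hypothetical data, chosen so that the non-NA rows match the output quoted above, and runs the sapply version on it. The name res is introduced here only for illustration.

```r
# Hypothetical data chosen so that the non-NA rows match the output shown above
df <- data.frame(
  cola = c(1, 1, NA, 1),
  colb = c(1, NA, NA, 1),
  colc = c(NA, 1, 1, NA)
)

# For each column, collapse the row names of its non-NA entries with "+",
# then prefix the result with "<column name>="
res <- paste(names(df),
             sapply(df, function(x) paste(row.names(df)[which(!is.na(x))], collapse = "+")),
             sep = "=")
res  # the three strings cola=1+2+4, colb=1+4 and colc=2+3
```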
Change column names for multiple data frames in a loop
If you want to go this way, you probably need the function assign. As an example:
rivers <- c("df_Main", "df_Danube", "df_Isar", "df_Inn")
for (i in rivers) {
  x <- get(i)
  colnames(x) <- c("bla", "bla", "bla", "bla")
  assign(i, x)
}
If you need to do it for more than 4 data.frames, maybe you should look at one of the apply functions.
Still, if you plan to plot the data with ggplot2, it might be more useful to have everything in a single df.
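Both suggestions can be sketched in base R. This example assumes the river data frames have four columns each; make_df, river_list, new_names and all_rivers are names invented for this sketch, and the column names are placeholders because the real ones are not given.

```r
# Hypothetical stand-ins for the river data frames (four columns each)
make_df <- function() data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)
river_list <- list(Main = make_df(), Danube = make_df(),
                   Isar = make_df(), Inn = make_df())

# Placeholder column names; the real ones are not given in the text above
new_names <- c("date", "level", "flow", "temp")
river_list <- lapply(river_list, setNames, new_names)

# Stack everything into one data frame, recording the river each row came from
all_rivers <- do.call(rbind, Map(function(d, nm) cbind(river = nm, d),
                                 river_list, names(river_list)))
```

Keeping the data frames in a named list avoids get and assign altogether, and the stacked all_rivers can be passed to ggplot2 with river mapped to colour or used for facets.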
Loop through dataframe column names - R
To answer the exact question and fix the code given, see the example below:
df <- iris # data
for (i in colnames(df)) {
  print(class(df[[i]]))
}
# [1] "numeric"
# [1] "numeric"
# [1] "numeric"
# [1] "numeric"
# [1] "factor"
- You need to use colnames to get the column names of df.
- You access each column using df[[i]] if you want to know its class. df[i] is of class data.frame.
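The difference between the two kinds of indexing can be checked directly on iris. The sapply call at the end is an optional alternative to the loop, and col_classes is a name introduced only for this sketch.

```r
df <- iris
class(df[["Species"]])  # "factor": [[ ]] extracts the column itself
class(df["Species"])    # "data.frame": [ ] keeps a one-column data frame

# All column classes at once, without an explicit loop
col_classes <- sapply(df, class)
```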