Changing Column Names in a List of Data Frames in R

With lapply you can do it as follows.

Create sample data:

df1 <- data.frame(A = 1, B = 2, C = 3)
df2 <- data.frame(X = 1, Y = 2, Z = 3)
dfList <- list(df1,df2)
colnames <- c("USAF","WBAN","YR--MODAHRMN")

Then lapply over the list with setNames, supplying the vector of new column names as its second argument (nm):

lapply(dfList, setNames, colnames)
#[[1]]
#  USAF WBAN YR--MODAHRMN
#1    1    2            3
#
#[[2]]
#  USAF WBAN YR--MODAHRMN
#1    1    2            3

Edit

If you want to assign the renamed data.frames back to the global environment, name the list elements after the original objects and pass the result to list2env:

dfList <- list(df1 = df1, df2 = df2)
list2env(lapply(dfList, setNames, colnames), .GlobalEnv)
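list2env() uses the list names as object names, so an unnamed list cannot be written back. A minimal sketch of the same idea, assuming you would rather test it in a scratch environment first (the names scratch and new_names are hypothetical, not from the answer) so that nothing in the workspace is overwritten:

```r
df1 <- data.frame(A = 1, B = 2, C = 3)
df2 <- data.frame(X = 1, Y = 2, Z = 3)
new_names <- c("USAF", "WBAN", "YR--MODAHRMN")

# The list must be named: list2env() turns each element name into an object name
dfList <- list(df1 = df1, df2 = df2)
renamed <- lapply(dfList, setNames, new_names)

# Hypothetical scratch environment, used here instead of .GlobalEnv
scratch <- new.env()
list2env(renamed, envir = scratch)

ls(scratch)          # objects created in the scratch environment
names(scratch$df1)   # renamed copy
names(df1)           # the original in the workspace is untouched
```

Once the result looks right, replacing scratch with .GlobalEnv gives the behaviour described above.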

Change column names of data frames stored in a list by condition

dplyr::rename_with() applies a function to each column name. In that function we can check if the name contains “abund” or “individuals” with grepl() and then those columns get renamed. The columns that don’t contain the strings we are looking for technically also get renamed, but they receive their old name again, so nothing is changed there.

library(dplyr)
library(purrr)

map(lst, ~ rename_with(., ~ ifelse(
  grepl("abund|individuals", .), "abundance", .
)))
#> $spiders
#>   plot abundance habitat
#> 1    1         1  forest
#> 2    2         4  forest
#> 3    3         8  forest
#>
#> $bugs
#>   plot abundance
#> 1    1         1
#> 2    2         4
#> 3    3         8
#>
#> $birds
#>   plot abundance habitat
#> 1    1         1  forest
#> 2    2         4  forest
#> 3    3         8  forest
#> 4    1         1  visual
#> 5    2         4  visual
#> 6    3         8  visual

Instead of tidyverse-style anonymous functions, we can use the base R lambda shorthand \(x), available since R 4.1.0, to make the code a bit more comprehensible.

map(lst, \(df) rename_with(df, \(name) ifelse(
  grepl("abund|individuals", name), "abundance", name
)))
#> $spiders
#>   plot abundance habitat
#> 1    1         1  forest
#> 2    2         4  forest
#> 3    3         8  forest
#>
#> $bugs
#>   plot abundance
#> 1    1         1
#> 2    2         4
#> 3    3         8
#>
#> $birds
#>   plot abundance habitat
#> 1    1         1  forest
#> 2    2         4  forest
#> 3    3         8  forest
#> 4    1         1  visual
#> 5    2         4  visual
#> 6    3         8  visual
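The same conditional rename can be done without dplyr or purrr. The sketch below is a base R variant, not the answer's method; the sample list and its original column names (spider_abund, individuals) are assumptions, since the question's lst is not shown here:

```r
# Hypothetical sample data; the original `lst` is not shown in the answer
lst <- list(
  spiders = data.frame(plot = 1:3, spider_abund = c(1, 4, 8), habitat = "forest"),
  bugs    = data.frame(plot = 1:3, individuals = c(1, 4, 8))
)

renamed <- lapply(lst, function(df) {
  # TRUE for every column whose name contains "abund" or "individuals"
  hit <- grepl("abund|individuals", names(df))
  names(df)[hit] <- "abundance"
  df
})

lapply(renamed, names)
```

Only the matching names are assigned here, so the other columns are never touched. If more than one column in a data frame matched the pattern, this would produce duplicate "abundance" names, just as the ifelse() version would.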

Change column names in list of dataframes

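Loop over the names of the list so that each data frame's name is available inside the function, then set the second column name to it. Note that lapply over names(my_list) returns an unnamed list: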

# dummy list
my_list <- list(one = data.frame(a = 1:5, b = 1:5), two = data.frame(a = 1:5, b = 1:5))

my_list_updated <-
  lapply(names(my_list), function(i){
    x <- my_list[[ i ]]
    # set 2nd column to a new name
    names(x)[2] <- i
    # return
    x
  })

my_list_updated
# [[1]]
#   a one
# 1 1   1
# 2 2   2
# 3 3   3
# 4 4   4
# 5 5   5
#
# [[2]]
#   a two
# 1 1   1
# 2 2   2
# 3 3   3
# 4 4   4
# 5 5   5
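The result above has lost the list names (one, two). If they are needed afterwards, one possible variant is to iterate over the data frames and their names in parallel with Map(), which keeps the names of its first list argument. This is a sketch of an alternative, not the answer's code; my_list_named is a hypothetical name:

```r
my_list <- list(one = data.frame(a = 1:5, b = 1:5),
                two = data.frame(a = 1:5, b = 1:5))

# Map() passes each data frame together with its name and
# keeps the names of the first list on the result
my_list_named <- Map(function(x, nm) {
  names(x)[2] <- nm   # set 2nd column name to the list element's name
  x
}, my_list, names(my_list))

names(my_list_named)
names(my_list_named$two)
```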

How to change the column names in all my data frames inside a list?

An easier option is set_names (from purrr) if we are changing the names of all the columns:

foo1 <- map(foo, set_names, newNames)
foo1
#$testA
# Jan Feb Mar Apr May
#1 -0.2886904 0.7716465 0.7103408795 -0.3209754 0.1580680
#2 0.8776646 0.1441515 1.9820892400 -2.5664872 0.2014593
#3 -1.9172889 1.4930354 -0.0005122859 2.7473145 0.9806701
#4 -0.7642281 -1.7382739 2.8574676114 0.1905533 1.0760523
#5 -0.2753768 0.4712059 -0.8955168101 -0.3923635 1.1017868

#$testB
# Jan Feb Mar Apr May
#1 -1.2544946 -0.2131777 0.634624485 1.5436530 0.5811060
#2 -0.8092116 1.6085164 2.607820897 0.5454936 1.3869741
#3 -0.5460344 0.8028537 -0.007151318 -0.1711816 0.0867885
#4 -0.2104260 -1.3580934 0.835981664 1.3725253 0.0037494
#5 -0.6984177 1.2311613 -0.809374023 -0.2487121 0.8129935

#$testC
# Jan Feb Mar Apr May
#1 0.3667708 -0.01209575 -0.9314844 0.05995604 0.58699473
#2 1.4171330 0.62793554 -0.2695517 2.21667643 0.90599396
#3 1.7093434 -0.98627309 -1.7552439 -0.96652771 -0.05704485
#4 0.2860338 1.34541312 -1.9608085 -1.23959279 0.19175618
#5 -0.9364102 2.47658828 -1.4883768 0.64809561 -0.99417796

or, if we use rename_all (superseded by rename_with() since dplyr 1.0.0, but still available), make sure to use the ~. According to ?rename_all, the .funs argument is

.funs - A function fun, a purrr style lambda ~ fun(.) or a list of either form.

foo2 <- map(foo, ~ .x %>%
              rename_all(~ newNames))

identical(foo1, foo2)
#[1] TRUE

The renameColumn function has two issues: 1) the modified data frame is not returned, and 2) the argument names do not match, since the function argument (myData) differs from the name used inside the body (data). A corrected version:

renameColumn <- function(myData, new_names){
  colnames(myData) <- new_names
  myData
}

map(foo, renameColumn, new_names = newNames)
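For reference, a self-contained base R sketch of the same fix. The toy list foo, the function name rename_columns, and the length check are assumptions added for illustration; the question's real data has five columns per data frame:

```r
# Hypothetical helper: same idea as renameColumn, plus a length check
rename_columns <- function(my_data, new_names) {
  # Fail early if the number of new names does not match the columns
  stopifnot(length(new_names) == ncol(my_data))
  colnames(my_data) <- new_names
  my_data   # return the modified copy
}

# Hypothetical toy data standing in for the question's `foo`
foo <- list(testA = data.frame(a = 1:2, b = 3:4),
            testB = data.frame(x = 5:6, y = 7:8))

out <- lapply(foo, rename_columns, new_names = c("Jan", "Feb"))
lapply(out, names)
```

The check is optional, but it turns a silent mismatch into an immediate error when a data frame has an unexpected number of columns.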

Using lapply to set column names for a list of data frames?

It seems you want to update the original data frames. In that case, your list must be named, as in the code below.

List <- list(a = a, b = b, c = c, d = d)
list2env(lapply(List, setNames, nm = headers), globalenv())

Now if you print a, you will see that it has been updated.

Using lapply to change column names of a list of data frames

You can also use setNames if you want to replace all column names:

df1 <- data.frame(A = 1:10, B= 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)

listDF <- list(df1, df2)
new_col_name <- c("C", "D")

lapply(listDF, setNames, nm = new_col_name)
## [[1]]
## C D
## 1 1 11
## 2 2 12
## 3 3 13
## 4 4 14
## 5 5 15
## 6 6 16
## 7 7 17
## 8 8 18
## 9 9 19
## 10 10 20

## [[2]]
## C D
## 1 21 31
## 2 22 32
## 3 23 33
## 4 24 34
## 5 25 35
## 6 26 36
## 7 27 37
## 8 28 38
## 9 29 39
## 10 30 40

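If you need to replace only a subset of column names, you can use the solution of @Jogo. Here names(df)[-1] selects every column name except the first, and new_col_name[-ncol(df)] drops the element at position ncol(df), so with two columns the second column is renamed to "C":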

lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[-ncol(df)]
  df
})
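Positional indexing like this depends on the column order. As a hedged alternative (not part of the original answer), a column can be renamed by its current name instead, which leaves every other column alone regardless of position; the target name "D" is chosen only for illustration:

```r
df1 <- data.frame(A = 1:10, B = 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)
listDF <- list(df1, df2)

renamed <- lapply(listDF, function(df) {
  # Rename only the column currently called "B"; no effect if it is absent
  names(df)[names(df) == "B"] <- "D"
  df
})

lapply(renamed, names)
```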

A last point: in R, a:b - 1 and a:(b - 1) are different, because the : operator binds more tightly than binary minus, so a:b - 1 is evaluated as (a:b) - 1:

1:10 - 1
## [1] 0 1 2 3 4 5 6 7 8 9

1:(10 - 1)
## [1] 1 2 3 4 5 6 7 8 9
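This matters as soon as such an expression is used as a column index. A small sketch, with n standing in for ncol(df) (the variable names are illustrative only):

```r
n <- 4                 # stands in for ncol(df)

idx_a <- 2:n - 1       # parsed as (2:n) - 1, i.e. positions 1, 2, 3
idx_b <- 2:(n - 1)     # positions 2, 3

cols <- c("w", "x", "y", "z")
cols[idx_a]            # selects "w" "x" "y"
cols[idx_b]            # selects "x" "y"
```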

EDIT

If you want to change the column names of the data frames in the global environment from a list, you can use list2env, but I'm not sure it is the best way to achieve what you want. You also need to use a named list, where each name matches the name of the data frame it should replace.

listDF <- list(df1 = df1, df2 = df2)

new_col_name <- c("C", "D")

listDF <- lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[-ncol(df)]
  df
})

list2env(listDF, envir = .GlobalEnv)
str(df1)
## 'data.frame': 10 obs. of 2 variables:
## $ A: int 1 2 3 4 5 6 7 8 9 10
## $ C: int 11 12 13 14 15 16 17 18 19 20

Using lapply to change column names of list of dataframes with different column names

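You can create a data frame that maps old names (from) to new names (to), then use lapply with setNames and match to look up the replacement for each column name. Any column name missing from the lookup table becomes NA, so the table should cover every column: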

lookup_names <- data.frame(from = c("A", "B", "D", "E", "G", "H"),
                           to = c("this", "that", "he", "she", "him", "her"))

lookup_names
#  from   to
#1    A this
#2    B that
#3    D   he
#4    E  she
#5    G  him
#6    H  her

lapply(my_list, function(x)
  setNames(x, lookup_names$to[match(names(x), lookup_names$from)]))

#[[1]]
# this that
#1 1 10
#2 2 11
#3 3 12
#4 4 13
#...
#...

#[[2]]
# he she
#1 20 30
#2 21 31
#3 22 32
#4 23 33
#5 24 34
#....
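With match(), a column name that has no entry in the lookup table yields NA as its new name. If some columns should simply keep their names, one possible variant is sketched below; the input list, the id column, and the helper name rename_known are hypothetical:

```r
lookup_names <- data.frame(from = c("A", "B", "D", "E"),
                           to   = c("this", "that", "he", "she"),
                           stringsAsFactors = FALSE)  # keep strings as character in R < 4.0

# Hypothetical input; the column "id" has no entry in the lookup table
my_list <- list(data.frame(id = 1:2, A = 1:2, B = 10:11),
                data.frame(id = 1:2, D = 20:21, E = 30:31))

rename_known <- function(x) {
  idx <- match(names(x), lookup_names$from)
  # Use the new name where a match exists, otherwise keep the old one
  names(x) <- ifelse(is.na(idx), names(x), lookup_names$to[idx])
  x
}

result <- lapply(my_list, rename_known)
lapply(result, names)
```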

Change column names for the list of data frame based on the other data frame

You could try something like this for individual list elements, which is based on your attempt:

names(A[[1]]) <- B$`Modified column name`[match(names(A[[1]]), B$Colname)]

To change the names of every list element you can put the above into lapply (I call `names<-` in its functional form, which returns the renamed data frame, to avoid having to use return):

clean_B <- lapply(A, function(df){
  `names<-`(df, B$`Modified column name`[match(names(df), B$Colname)])
})

Once your data frames all have the same column names you can use do.call with rbind to combine them:

do.call(rbind, clean_B)

#### OUTPUT ####

# A tibble: 15 x 3
#       ID Reason    Name
#    <dbl> <chr>     <chr>
#  1     1 Event off A
#  2     2 Event on  B
#  3     3 lock      C
#  4     4 invalid   D
#  5     5 valid     E
#  6     6 Event off A
#  7     7 Event on  B
#  8     8 lock      C
#  9     9 invalid   D
# 10    10 valid     E
# 11    11 Event off A
# 12    12 Event on  B
# 13    13 lock      C
# 14    14 invalid   D
# 15    15 valid     E
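rbind() on data frames matches columns by name, so it can be worth confirming that the renamed data frames really agree before combining them. A minimal base R sketch with toy stand-ins for A and B; the data, the column name Modified (used in place of `Modified column name`), and the check itself are assumptions added for illustration:

```r
# Hypothetical stand-ins for the question's lookup table B and list A
B <- data.frame(Colname  = c("id", "ID_no", "why", "Reason"),
                Modified = c("ID", "ID", "Reason", "Reason"),
                stringsAsFactors = FALSE)
A <- list(data.frame(id = 1:2, why = c("lock", "valid")),
          data.frame(ID_no = 3:4, Reason = c("invalid", "lock")))

clean <- lapply(A, function(df) {
  setNames(df, B$Modified[match(names(df), B$Colname)])
})

# Check that every data frame now has the same column names
same_names <- all(vapply(clean,
                         function(df) identical(names(df), names(clean[[1]])),
                         logical(1)))
stopifnot(same_names)

combined <- do.call(rbind, clean)
combined
```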

You could also try something like this, which is more succinct, albeit also harder to understand:

library(tidyverse)

map_dfr(A,
        ~ rename(., !!! `names<-`(B[[1]], B[[2]])[match(names(.), B[[1]])])
)

