Rename Columns by Pattern in R

Rename columns by pattern in R

You can use regular expressions to change the colnames() of an object. Here I'm replacing the literal text "Log." with nothing:

colnames(object) <- sub("Log\\.", "", colnames(object))
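For a version you can run end to end, here is a minimal sketch. The data frame and its column names are made up for illustration, and the "^" anchor is an addition of mine so that only a leading "Log." is removed:

```r
# Hypothetical data frame whose column names carry a "Log." prefix
object <- data.frame(Log.height = c(1.2, 3.4),
                     Log.weight = c(5.6, 7.8),
                     id = 1:2)

# "^" anchors the match to the start of the name; "\\." is a literal period
colnames(object) <- sub("^Log\\.", "", colnames(object))

colnames(object)
```

Names that do not match the pattern (here "id") are left unchanged. sub() only replaces the first match in each name; use gsub() if every occurrence should go.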

How to rename the specific pattern in the column name?

How about this?

names(ex_before) <- sub('.*\\.(\\d+)\\.$', '\\1', names(ex_before))
ex_before

#  123 24 532 934
#1   a  d   1   9
#2   b  e   4   3
#3   c  f   2   9

This basically extracts the digits that sit between the last two periods (".") at the end of each name.
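The original column names are not shown, so here is a small sketch on a plain character vector to see what the pattern does. The names in old_names are hypothetical:

```r
# Hypothetical names; the original ones are not shown in the answer
old_names <- c("value.x.123.", "value.y.24.", "score.532.", "score.934.", "id")

# ".*" is greedy, so "(\\d+)" captures the digits between the last two periods;
# "\\1" in the replacement refers back to that captured group
new_names <- sub(".*\\.(\\d+)\\.$", "\\1", old_names)

new_names
```

A name that does not fit the pattern (such as "id") passes through untouched, because sub() returns the input as-is when nothing matches.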

How to rename column names based on pattern

You can use regex to capture the two groups of digits and add the prefix "20" to each.

names(test)[-1] <- sub('(\\d+)/(\\d+)', '20\\1-20\\2', names(test)[-1])

test
#   id 2017-2018 2018-2019 2019-2020 2020-2021
#1 500         1         6         4         3
#2 600         4         4         3         5
#3 700         5         3         4         6
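If you want to try this yourself, something like the following should work. I'm assuming the original names looked like "17/18" (that is what the pattern suggests), and I've tightened the regex with anchors and \\d{2}, which is my own variation:

```r
# Assumed input: two-digit years separated by a slash; check.names = FALSE
# keeps data.frame() from mangling names such as "17/18"
test <- data.frame(id = c(500, 600, 700),
                   `17/18` = c(1, 4, 5),
                   `18/19` = c(6, 4, 3),
                   check.names = FALSE)

# Skip the first column ("id"), capture both two-digit groups, prefix each with "20"
names(test)[-1] <- sub("^(\\d{2})/(\\d{2})$", "20\\1-20\\2", names(test)[-1])

names(test)
```

Because the names are no longer syntactic, refer to the columns with backticks or test[["2017-2018"]].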

Changing the column name based on a partial string or substring

Put the data frames in a list and use lapply/map to rename the column in every data frame. Then use list2env to copy the changed data frames from the list back to the individual objects.

library(dplyr)
library(purrr)

list_df <- lst(Apple, Mango, Banana, Potato, Tomato)

list_df <- map(list_df,
~.x %>% rename_with(~'Growth', matches('Growth Level Judgement')))

list2env(list_df, .GlobalEnv)

To run it on a single data frame you can do -

Apple %>% rename_with(~'Growth', matches('Growth Level Judgement'))

Or in base R -

names(Apple)[grep('Growth Level Judgement', names(Apple))] <- 'Growth'
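The list approach can also be done without any packages. This is only a sketch: the two data frames and the helper rename_growth() are invented for the example, and fixed = TRUE is my choice so the text is matched literally rather than as a regex:

```r
# Hypothetical data frames; the real ones are not shown in the question
Apple <- data.frame(Name = "a", `Growth Level Judgement` = 1,
                    check.names = FALSE)
Mango <- data.frame(Name = "m", `Growth Level Judgement (2021)` = 2,
                    check.names = FALSE)

# A named list, so list2env() knows which object each element belongs to
list_df <- list(Apple = Apple, Mango = Mango)

# Hypothetical helper: rename every column whose name contains the substring
rename_growth <- function(df) {
  hit <- grepl("Growth Level Judgement", names(df), fixed = TRUE)
  names(df)[hit] <- "Growth"
  df
}

list_df <- lapply(list_df, rename_growth)

# Overwrite the individual data frames with the renamed versions
list2env(list_df, envir = .GlobalEnv)

names(Apple)
names(Mango)
```

If more than one column in a data frame contains the substring, all of them end up named "Growth", so check for that before relying on the result.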

Is there a simpler version of renaming columns with alternating patterns? Or tidyverse methods?

If you have a vector x with the names and a vector r with the number of replications, then you could do:

x <- c("v", "c", "u", "p", "z")
r <- c(3L, 3L, 1L, 1L, 3L)

f <- function(n) if (n > 1L) seq_len(n) else character(n)
paste0(rep(x, r), unlist(lapply(r, f)))
## [1] "v1" "v2" "v3" "c1" "c2" "c3" "u" "p" "z1" "z2" "z3"

If you are fine with "u1" and "p1", then you can simplify a bit:

paste0(rep(x, r), unlist(lapply(r, seq_len)))
## [1] "v1" "v2" "v3" "c1" "c2" "c3" "u1" "p1" "z1" "z2" "z3"

There is also base R's make.unique. It reads more clearly, but it only numbers the duplicates (the first occurrence keeps its bare name), so it doesn't quite give you what you want:

make.unique(rep(x, r), sep = "")
## [1] "v" "v1" "v2" "c" "c1" "c2" "u" "p" "z" "z1" "z2"

Rename column names according to pattern matching R

xxxxxx30xxxx <- rep(5,30)
yyyyyyy50yyyyy <- rep(4,30)
zzzzzzz70zzzz <- rep(7,30)
df <- data.frame(zzzzzzz70zzzz,yyyyyyy50yyyyy,xxxxxx30xxxx)

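# Note: "[0-100]" is a character class that only matches a single "0" or "1";
# "[0-9]" matches any digit
grep(pattern = "[0-9]", x = colnames(df), value = TRUE)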

new_colnames <- gsub("\\D", "", colnames(df))
colnames(df) <- new_colnames

I hope I understood you correctly. The gsub command erases everything that is not a digit from the column names, so you're left with the numbers in between. If a name contains more than one group of digits, they all get pasted together.

EDIT:

This code matches a two-digit number in your string between 30 and 70, and extracts it.

xxxxxx30xxxx <- rep(5,30)
yyyyyyy50yyyyy <- rep(4,30)
zzzzzzz70zzzz <- rep(7,30)
df <- data.frame(zzzzzzz70zzzz,yyyyyyy50yyyyy,xxxxxx30xxxx)

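# Note: "[0-100]" is a character class that only matches a single "0" or "1";
# "[0-9]" matches any digit
grep(pattern = "[0-9]", x = colnames(df), value = TRUE)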

# new_colnames <- gsub("\\D", "", colnames(df))

new_colnames <- regmatches(colnames(df), regexpr("([3-6][0-9])|([7][0])",colnames(df)))

colnames(df) <- new_colnames
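One thing to watch out for: regmatches() drops the names that have no match, so the result can be shorter than colnames(df) and the assignment would fail. A possible way around that, sketched here on a plain vector with an extra made-up name "id" that has no number in it:

```r
# Hypothetical names; "id" contains no number between 30 and 70
old <- c("zzzzzzz70zzzz", "yyyyyyy50yyyyy", "xxxxxx30xxxx", "id")

# regexpr() returns -1 for elements without a match
m <- regexpr("[3-6][0-9]|70", old)
hit <- as.integer(m) != -1L

# Start from the old names and overwrite only the ones that matched
new <- old
new[hit] <- regmatches(old, m)

new
```

The same new vector can then be assigned to colnames(df), since it always has one entry per column.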

Here's some information on regular expressions and string operations:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html

https://www.regular-expressions.info/rlanguage.html


