How to Change Name of Factor Levels

How to change name of factor levels?

Renaming factor levels is straightforward, and a quick count before and after confirms that each old level maps to the intended new name:

Example data:

> a <- factor(rep(c(1,2,1),50))
> a
[1] 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2
[75] 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1
[149] 2 1
Levels: 1 2

#this will help later as a verification
#this counts the instances for 1 and 2
> table(a)
a
  1  2
100 50

As the output above shows, level 1 comes first and level 2 second. When you rename the levels (below), the order stays the same:

#the assignment function levels can be used to change the levels
#the order will remain the same i.e. 'c' for '1' and 'not-c' for '2'
levels(a) <- c('c', 'not-c')

> a
[1] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[25] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[49] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[73] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[97] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[121] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[145] c not-c c c not-c c
Levels: c not-c

And this is the verification:

> table(a)
a
    c not-c
  100    50
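If relying on position feels fragile, one option is to spell out the mapping from old to new names. The sketch below reuses the example factor; a_old is a hypothetical copy kept only for checking. It relies on the fact that levels<- for factors also accepts a named list of the form new = old:

```r
a <- factor(rep(c(1, 2, 1), 50))
a_old <- a  # hypothetical copy, kept only for verification

# named list: each name is a new level, each value is the old level it replaces
levels(a) <- list("c" = "1", "not-c" = "2")

# cross-tabulate old labels against new labels
print(table(old = a_old, new = a))
```

In the cross-tabulation every 1 should fall under c and every 2 under not-c, with zeros elsewhere.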

Renaming factor levels without referring to the factor name

You can rename the levels with the levels<- replacement function.

df <- data.frame(x = c("gravel", "sandstone", "siltstone"), stringsAsFactors = TRUE)
levels(df$x)
#[1] "gravel" "sandstone" "siltstone"

levels(df$x) <- c('G', 'S1', 'S2')
levels(df$x)
#[1] "G" "S1" "S2"

df
#   x
#1  G
#2 S1
#3 S2
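When only one level needs a new name, replacing the whole vector of levels is more than necessary. A minimal sketch, assuming the same example data frame, that picks the level by its current name rather than by its position:

```r
df <- data.frame(x = c("gravel", "sandstone", "siltstone"), stringsAsFactors = TRUE)

# select the level by its current name and overwrite only that one
levels(df$x)[levels(df$x) == "sandstone"] <- "S1"

levels(df$x)
```

The other levels keep their names and their order.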

R - How to rename factor levels using a function?

You can apply the function to the current levels and assign the result back:

x <- as.factor(head(letters))
x
# [1] a b c d e f
# Levels: a b c d e f

levels(x) <- toupper(levels(x))
x
# [1] A B C D E F
# Levels: A B C D E F
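The same pattern can be wrapped in a small helper so that any function from character vector to character vector can be used. This is only a sketch; rename_levels is a hypothetical name, not a base R function:

```r
# hypothetical helper: apply fn to the levels of f and return the renamed factor
rename_levels <- function(f, fn, ...) {
  stopifnot(is.factor(f))
  levels(f) <- fn(levels(f), ...)
  f
}

x <- as.factor(head(letters))
y <- rename_levels(x, toupper)
z <- rename_levels(x, function(l) paste0("grp_", l))

levels(y)
levels(z)
```

Note that if the function maps two different levels to the same name, levels<- merges them into a single level.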

Rename factor levels in a data.frame in R

Prefix the existing levels with paste0() and assign the result back:

levels(df$x) <- paste0("R1_",levels(df$x))
# df
# x
# 1 R1_gravel
# 2 R1_sandstone
# 3 R1_siltstone
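If several columns need the same prefix, the assignment can be applied to every factor column at once. A sketch under the assumption of a small made-up data frame (the columns y and n are hypothetical, added only for illustration):

```r
df <- data.frame(
  x = c("gravel", "sandstone", "siltstone"),
  y = c("wet", "dry", "wet"),   # hypothetical second factor column
  n = 1:3,                      # hypothetical non-factor column, left untouched
  stringsAsFactors = TRUE
)

# find the factor columns, then prefix the levels of each one
is_fct <- vapply(df, is.factor, logical(1))
df[is_fct] <- lapply(df[is_fct], function(f) {
  levels(f) <- paste0("R1_", levels(f))
  f
})

str(df)
```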

Rename all levels of factor variable, is there a tidyverse way to do it?

You can easily do that with a named vector and forcats::fct_recode():

library(tidyverse)
set.seed(42)

my_levels <- letters
sample_data <- data.frame(
  factor_data = factor(sample(my_levels, size = 500, replace = TRUE),
                       levels = my_levels),
  Any_other_data = rnorm(500)
)

my_new_levels <- rnorm(length(letters))

# create a named vector with the new levels
named_level_vector <- levels(sample_data$factor_data)
names(named_level_vector) <- my_new_levels

# use mutate and fct_recode with that vector

sample_data <- sample_data %>%
  mutate(new_factor_data = forcats::fct_recode(factor_data, !!!named_level_vector))

head(sample_data)
#> factor_data Any_other_data new_factor_data
#> 1 q 0.48236947 0.223521215874458
#> 2 e 0.99294364 -1.12828853519737
#> 3 a -1.24639550 -2.55382485095083
#> 4 y -0.03348752 1.67099730539817
#> 5 j -0.07096218 -0.318990710826149
#> 6 d -0.75892065 -1.17990419995829

Created on 2020-06-11 by the reprex package (v0.3.0)
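The key point in the answer above is the direction of the named vector: the names are the new levels and the values are the old ones. A smaller sketch with made-up labels (alpha, beta, gamma are hypothetical), plus forcats::fct_relabel() for the case where the new names can be computed by a function:

```r
library(forcats)

f <- factor(c("a", "b", "c", "a"))

# names = new levels, values = old levels
lookup <- c(alpha = "a", beta = "b", gamma = "c")
f_recoded <- fct_recode(f, !!!lookup)

# when the new names follow a rule, fct_relabel() takes a function instead
f_upper <- fct_relabel(f, toupper)

levels(f_recoded)
levels(f_upper)
```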

Errors when changing level names in a factor

The problem is that the levels aren't named, so you can't reference them that way. If you want a safe way to manipulate factor levels, look at the forcats package (part of the tidyverse). Its fct_recode() function takes new = "old" pairs and will do what you want.

library(forcats)
df$varname <- fct_recode(df$varname, Levelname = "Very long text that needs to be shortened")
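To make this concrete, here is a runnable sketch on a made-up data frame (the column contents are hypothetical). It first shows that the levels carry no names, which is why indexing them by name fails, and then applies fct_recode():

```r
library(forcats)

# hypothetical data with one overly long level
df <- data.frame(
  varname = factor(c("Very long text that needs to be shortened", "Short", "Short"))
)

# levels() returns a plain, unnamed character vector
is.null(names(levels(df$varname)))

# new name on the left, existing level on the right
df$varname <- fct_recode(df$varname, Levelname = "Very long text that needs to be shortened")
levels(df$varname)
```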

Rename levels of a factor in tbl_regression

You can use modify_table_body() to change the labels of the levels:

library(gtsummary)
library(dplyr)  # for mutate() and %>%

glm(response ~ trt + factor(death), data = trial) %>%
  tbl_regression(
    label = list(
      trt ~ "Drug B vs A",
      `factor(death)` ~ "Death"
    )
  ) %>%
  modify_table_body(
    ~ .x %>%
      mutate(label = ifelse(label == "0", "Alive",
                            ifelse(label == "1", "Dead", label)))
  )

If you want to be more cautious about which labels you change, you can add another condition to the ifelse() statement so that only rows belonging to factor(death) are touched:

glm(response ~ trt + factor(death), data = trial) %>%
  tbl_regression(
    label = list(
      trt ~ "Drug B vs A",
      `factor(death)` ~ "Death"
    )
  ) %>%
  modify_table_body(
    ~ .x %>%
      mutate(label = ifelse(label == "0" & variable == "factor(death)", "Alive",
                            ifelse(label == "1" & variable == "factor(death)", "Dead", label)))
  )
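A different route, not the one used in the answer above, is to give the factor readable labels before fitting, so that the model itself carries them. The sketch below uses base R only and a small made-up data frame d (hypothetical values), since the point is just the factor(..., levels =, labels =) call:

```r
# hypothetical toy data: binary response and a 0/1 death indicator
d <- data.frame(
  response = c(0, 1, 0, 1, 1, 0, 1, 0),
  death    = c(0, 0, 1, 1, 0, 1, 0, 1)
)

# attach readable labels once, up front
d$death <- factor(d$death, levels = c(0, 1), labels = c("Alive", "Dead"))

fit <- glm(response ~ death, data = d, family = binomial)

# the coefficient name now uses the label instead of "1"
names(coef(fit))
```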

How to change the level names of a set of factor?

Since the object is a data.table, we can use data.table methods and update the column by reference:

library(data.table)
df[, F1 := factor((F1 %/% 2)+1)]
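The original data is not shown here, so the following is only an illustration of what the expression does, assuming F1 is an integer column with the made-up values 0 to 3. Integer division by 2 pairs up neighbouring values, and adding 1 makes the resulting levels start at 1:

```r
library(data.table)

# hypothetical input: F1 holds the integers 0, 1, 2, 3
df <- data.table(F1 = c(0L, 1L, 2L, 3L))

# 0 and 1 collapse to level "1"; 2 and 3 collapse to level "2"
df[, F1 := factor((F1 %/% 2) + 1)]

levels(df$F1)
```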

Rename levels in multiple specific factors in a dataframe

Just specify the column numbers you want to apply the revalue function to:

cols_to_update <- c(1:2,4:5)
DF[, cols_to_update] <- lapply(DF[,cols_to_update], function(x) plyr::revalue(x, c("No"="X")))
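If you would rather not depend on plyr, the same update can be sketched in base R. The data frame below is made up for illustration (column names a to e are hypothetical); the level is selected by name, so a column without a "No" level is simply left unchanged:

```r
# hypothetical data: four factor columns and one integer column
DF <- data.frame(
  a = factor(c("Yes", "No")),
  b = factor(c("No", "No")),
  c = 1:2,
  d = factor(c("Yes", "Yes")),
  e = factor(c("No", "Yes"))
)

cols_to_update <- c(1:2, 4:5)

# rename the level "No" to "X" in the chosen columns only
DF[cols_to_update] <- lapply(DF[cols_to_update], function(x) {
  levels(x)[levels(x) == "No"] <- "X"
  x
})

str(DF)
```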

