R: how to change one of the levels to NA
Set the level to NA:
x <- factor(c("a", "b", "c", "NotPerformed"))
x
## [1] a b c NotPerformed
## Levels: a b c NotPerformed
levels(x)[levels(x)=='NotPerformed'] <- NA
x
## [1] a b c <NA>
## Levels: a b c
Note that the factor level is removed.
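If the level should stay defined while the affected values become missing, one option is to assign NA to the elements rather than to the level. A minimal base R sketch (the names y and z are chosen here only for illustration):

```r
x <- factor(c("a", "b", "c", "NotPerformed"))

# Assign NA to the matching elements; the level itself is kept
y <- x
y[y == "NotPerformed"] <- NA
levels(y)   # still lists "NotPerformed"

# droplevels() then removes the now-unused level
z <- droplevels(y)
levels(z)
```

Which variant fits depends on whether later code (for example table()) should still report the emptied level.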
fct_recode: replace a level with NA
You need to use backticks to recode a level to NA:
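library(magrittr) # provides %>%
fct <- factor(c("a", "b")) # sample input, inferred from the output below
x1 <- fct %>% forcats::fct_recode(`NA` = "a")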
x1
#[1] NA b
#Levels: NA b
However, note that although this "looks" like NA, it is not a real NA. It is the string "NA".
is.na(x1)
#[1] FALSE FALSE
x1 == 'NA'
#[1] TRUE FALSE
To make it a real NA, recode the level to NULL.
x2 <- fct %>% forcats::fct_recode(NULL = "a")
x2
#[1] <NA> b
#Levels: b
is.na(x2)
#[1] TRUE FALSE
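The same distinction can be reproduced without forcats. The following is a sketch in base R, assuming the two-element input used above; the names s and m are illustrative only:

```r
f <- factor(c("a", "b"))

# Renaming the level to the string "NA" keeps it as an ordinary level
s <- f
levels(s)[levels(s) == "a"] <- "NA"
is.na(s)      # no element is missing
nlevels(s)    # still two levels

# Setting the level to NA drops it and makes the values missing
m <- f
levels(m)[levels(m) == "a"] <- NA
is.na(m)
nlevels(m)
```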
Replace NA in a series of variables by a factor level
If these are already existing factors, you can use forcats::fct_explicit_na() (deprecated since forcats 1.0.0 in favor of fct_na_value_to_level()):
library(dplyr)
library(forcats)
# Make sample data vars factors
dat <- dat %>%
  mutate(across(starts_with("s_"), as.factor))
# Add 'No' as factor level
dat %>%
  mutate(across(starts_with("s_"), ~ fct_explicit_na(.x, "No")))
# A tibble: 10 x 6
#       id     x s_0   s_1   s_2   s_3
#    <dbl> <dbl> <fct> <fct> <fct> <fct>
#  1     1     5 75    A     4     110
#  2     2     9 36    No    No    921
#  3     3    11 13    B     7     769
#  4     4    11 34    C     2     912
#  5     5    11 No    C     No    835
#  6     6    13 39    No    4     No
#  7     7    14 45    B     4     577
#  8     8    19 42    D     6     No
#  9     9    20 4     No    7     577
# 10    10    13 28    No    3     573
Replace unwanted values of factor level with NA
Try this:
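# stringsAsFactors=TRUE is needed in R >= 4.0 for col to be created as a factor
df <- data.frame(a=11:18, col=c("C", "", "A", NA, "A", "", "C", NA), stringsAsFactors=TRUE)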
levels(df$col) # "" "A" "C"
sum(is.na(df$col)) # 2
df$col <- factor(df$col, levels=LETTERS[1:3])
levels(df$col) # "A" "B" "C"
sum(is.na(df$col)) # 4
Since the new levels do not include blank (""), all blanks will become NA.
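When the set of valid levels is not known in advance, another option is to name only the unwanted values through the exclude argument of factor(). A minimal sketch, assuming the column starts out as a character vector:

```r
col <- c("C", "", "A", NA, "A", "", "C", NA)

# exclude lists values that must not become levels; NA is kept in the
# list so that it is not turned into a level either
f <- factor(col, exclude = c("", NA))
levels(f)
sum(is.na(f))
```

Unlike fixing levels = LETTERS[1:3], this keeps only the levels that actually occur in the data.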
How to use NA as a factor level and rename it in R?
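We can try:
# Sample input, inferred from the output below
data <- list(a = factor(c(1, 1, 2, 2, 3, NA, NA)),
             b = factor(c("a", "b", "b")),
             c = factor(c(3, 4, NA, 3)))
lapply(data, function(x) {
  if (anyNA(x)) {
    levels(x) <- c(levels(x), "Missing")  # register the new level first
    x[is.na(x)] <- "Missing"              # then assign it to the NA entries
    x
  } else x
})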
#$a
#[1] 1 1 2 2 3 Missing Missing
#Levels: 1 2 3 Missing
#$b
#[1] a b b
#Levels: a b
#$c
#[1] 3 4 Missing 3
#Levels: 3 4 Missing
Convert NA into a factor level
You can use addNA().
x <- c(1, 1, 2, 2, 3, NA)
addNA(x)
# [1] 1 1 2 2 3 <NA>
# Levels: 1 2 3 <NA>
This is basically a convenience function for factoring with exclude = NULL. From help(factor):
"addNA modifies a factor by turning NA into an extra level (so that NA values are counted in tables, for instance)."
Another reason this is nice: if you already have a factor f, you can use addNA() to quickly add NA as a factor level without changing f. As mentioned in the documentation, this is handy for tables. It also reads nicely.
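To illustrate the point about tables, here is a small sketch in base R that compares the counts with and without the extra level, using a factor built from the same values as above:

```r
f <- factor(c(1, 1, 2, 2, 3, NA))

table(f)          # NA values are not counted
table(addNA(f))   # NA gets its own cell

levels(f)         # f itself is unchanged
```

For a one-off count, table(f, useNA = "ifany") is an alternative that leaves the factor untouched as well.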
Replace NA in a factor column
1) addNA If fac is a factor, addNA(fac) is the same factor but with NA added as a level. See ?addNA
To force the NA level to be 88:
facna <- addNA(fac)
levels(facna) <- c(levels(fac), 88)
giving:
> facna
[1] 1 2 3 3 4 88 2 4 88 3
Levels: 1 2 3 4 88
1a) This can be written in a single line as follows:
`levels<-`(addNA(fac), c(levels(fac), 88))
2) factor It can also be done in one line using the various arguments of factor like this:
factor(fac, levels = levels(addNA(fac)), labels = c(levels(fac), 88), exclude = NULL)
2a) or equivalently:
factor(fac, levels = c(levels(fac), NA), labels = c(levels(fac), 88), exclude = NULL)
3) ifelse Another approach is:
factor(ifelse(is.na(fac), 88, paste(fac)), levels = c(levels(fac), 88))
4) forcats The forcats package has a function for this:
library(forcats)
fct_explicit_na(fac, "88")
## [1] 1 2 3 3 4 88 2 4 88 3
## Levels: 1 2 3 4 88
Note: We used the following for input fac:
fac <- structure(c(1L, 2L, 3L, 3L, 4L, NA, 2L, 4L, NA, 3L), .Label = c("1",
"2", "3", "4"), class = "factor")
Update: Have improved (1) and added (1a). Later added (4).
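As a quick sanity check, the base R variants (1a), (2a) and (3) can be compared on the same input. This is only a sketch: the helper same() is hypothetical and compares values and levels, and fac is rebuilt here with factor() instead of structure():

```r
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))

r1 <- `levels<-`(addNA(fac), c(levels(fac), 88))
r2 <- factor(fac, levels = c(levels(fac), NA),
             labels = c(levels(fac), 88), exclude = NULL)
r3 <- factor(ifelse(is.na(fac), 88, paste(fac)),
             levels = c(levels(fac), 88))

# Compare values and levels of each result against the first
same <- function(a, b) {
  identical(as.character(a), as.character(b)) &&
    identical(levels(a), levels(b))
}
same(r1, r2)
same(r1, r3)
```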