How to Replace Empty String with NA in R Dataframe

How to replace empty string with NA in R dataframe?

I'm not sure why df[df==""]<-NA would not have worked for OP. Let's take a sample data.frame and investigate options.

Option#1: Base-R

df[df==""]<-NA

df
#    One  Two Three Four
# 1    A    A  <NA>  AAA
# 2 <NA>    B    BA <NA>
# 3    C <NA>    CC  CCC

Option#2: dplyr::mutate_all and na_if, or mutate_if if the data frame has columns of more than one type

library(dplyr)

mutate_all(df, list(~na_if(.,"")))

OR

# If the data frame also has non-character columns, convert only the character ones
df %>% mutate_if(is.character, list(~na_if(.,"")))

#    One  Two Three Four
# 1    A    A  <NA>  AAA
# 2 <NA>    B    BA <NA>
# 3    C <NA>    CC  CCC
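In dplyr 1.0.0 and later, mutate_all and mutate_if are superseded by across(). The following is a minimal sketch of the same replacement in that style, assuming a recent dplyr; the small data frame and its numeric column n are made up for illustration.

```r
library(dplyr)

# Hypothetical data: two character columns and one numeric column
df <- data.frame(One = c("A", "", "C"),
                 Two = c("A", "B", ""),
                 n   = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# Apply na_if() only to character columns; the numeric column is left alone
out <- df %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))

out
```

Restricting the selection with where(is.character) plays the same role as mutate_if(is.character, ...) above.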

Toy Data:

df <- data.frame(One=c("A","","C"), 
Two=c("A","B",""),
Three=c("","BA","CC"),
Four=c("AAA","","CCC"),
stringsAsFactors = FALSE)

df
#   One Two Three Four
# 1   A   A        AAA
# 2       B    BA     
# 3   C        CC  CCC

How to replace empty strings in a dataframe with NA (missing value) not NA string

A bare NA is of type logical. According to ?NA: "NA is a logical constant of length 1 which contains a missing value indicator."

The class can be checked

class(NA)
#[1] "logical"
class(NA_character_)
#[1] "character"

and both are identified by standard functions such as is.na

is.na(NA)
#[1] TRUE
is.na(NA_character_)
#[1] TRUE

if_else is type sensitive, so instead of specifying NA, which is logical, the missing value can be specified as NA_real_, NA_integer_, or NA_character_, depending on the type of the 'boat' column. Assuming that 'boat' is of class character, we need NA_character_

titanic %>% 
mutate(boat = if_else(boat=="", NA_character_ ,boat))
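To try this without the original data, here is a small self-contained sketch. The titanic data frame below is a made-up stand-in with only a few rows, not the real dataset. It also shows na_if, which gives the same result for this case.

```r
library(dplyr)

# Made-up stand-in for the 'titanic' data; 'boat' is a character column
titanic <- data.frame(name = c("a", "b", "c"),
                      boat = c("2", "", "11"),
                      stringsAsFactors = FALSE)

# Typed missing value keeps both branches of if_else() as character
res1 <- titanic %>%
  mutate(boat = if_else(boat == "", NA_character_, boat))

# na_if() expresses the same replacement more directly
res2 <- titanic %>%
  mutate(boat = na_if(boat, ""))

class(res1$boat)
identical(res1, res2)
```

In both versions the column stays character and the result is a real missing value, not the string "NA".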

Change the Blank Cells to NA

I'm assuming you are talking about row 5 column "sex." It could be the case that in the data2.csv file, the cell contains a space and hence is not considered empty by R.

Also, I noticed that in row 5 columns "axles" and "door", the original values read from data2.csv are string "NA". You probably want to treat those as na.strings as well. To do this,

dat2 <- read.csv("data2.csv", header=TRUE, na.strings=c("","NA"))

EDIT:

I downloaded your data2.csv. Yes, there is a space in row 5 column "sex". So you want

na.strings=c(""," ","NA")
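A quick way to check the effect of na.strings without the file is to read from an inline string. This is only a sketch: the csv_text below is invented to mimic the situation described (a single space in "sex", literal NA in "axles" and "door"); it is not the content of data2.csv.

```r
# Invented CSV content: row 2 has a single space in 'sex' and NA elsewhere
csv_text <- "axles,sex,door\n2,M,4\nNA, ,NA\n3,F,2\n"

# Treat empty fields, single spaces, and the string "NA" as missing
dat2 <- read.csv(text = csv_text, header = TRUE,
                 na.strings = c("", " ", "NA"),
                 stringsAsFactors = FALSE)

is.na(dat2)
```

If the blank cells may contain a varying number of spaces, listing each variant in na.strings gets tedious; trimming whitespace after reading is an alternative.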

Replace all string instances of NULL with actual NULL or NA in a data frame

Just do this:

exampledf[exampledf=="NULL"] <- NA

or with dplyr

exampledf <- exampledf %>% replace(exampledf == "NULL", NA)
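As a small self-contained check of the base R version, the sketch below uses an invented exampledf. Note that an ordinary (atomic) data frame column cannot hold NULL as an element, so NA is the practical replacement.

```r
# Invented example data with the literal string "NULL" in one cell
exampledf <- data.frame(id  = 1:3,
                        val = c("a", "NULL", "c"),
                        stringsAsFactors = FALSE)

# Cells equal to the string "NULL" become real missing values
exampledf[exampledf == "NULL"] <- NA

is.na(exampledf$val)
```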

Replace missing values (NA) with blank (empty string)

Another alternative:

df <- sapply(df, as.character) # since your values are `factor`
df[is.na(df)] <- 0

If you want blanks instead of zeroes

> df <- sapply(df, as.character)
> df[is.na(df)] <- " "
> df
     class    Year1 Year2 Year3 Year4 Year5
[1,] "classA" "A"   "A"   "A"   "A"   "A"  
[2,] " "      " "   " "   " "   " "   " "  
[3,] "classB" "B"   "B"   "B"   "B"   "B"  

sapply returns a character matrix here. If you want a data.frame, just use as.data.frame

> as.data.frame(df)
   class Year1 Year2 Year3 Year4 Year5
1 classA     A     A     A     A     A
2                                     
3 classB     B     B     B     B     B
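If the goal is a true empty string rather than a space, and a data frame throughout, one possible variation is to convert the columns in place with lapply instead of sapply. This is a sketch on invented data with factor columns, not the original poster's data.

```r
# Invented data: factor columns with a fully missing row
df <- data.frame(class = factor(c("classA", NA, "classB")),
                 Year1 = factor(c("A", NA, "B")))

# df[] <- keeps the data.frame structure while converting each column
df[] <- lapply(df, as.character)

# Replace missing values with an empty string
df[is.na(df)] <- ""

str(df)
```

Assigning into df[] avoids the detour through a matrix, so no as.data.frame call is needed afterwards.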

How to replace blank strings with NA?

To show that the code works:

data <- data.frame( col1= c("", letters[1:4]), col2=c(letters[1:4], ""))
is.na(data) <- data==''
data
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d <NA>

If the data contains whitespace-only strings ' ' as well as empty strings '', this alone won't catch them:

data <- data.frame( col1= c("", letters[1:4]), col2=c(letters[1:4], " "))
data1 <- data
is.na(data) <- data==''
data
#  col1 col2
#1 <NA>    a
#2    a    b
#3    b    c
#4    c    d
#5    d     

In such cases, you could use str_trim

library(stringr)
data1[] <- lapply(data1, str_trim)
is.na(data1) <- data1==''
data1
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d <NA>
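The same idea can be sketched in base R with trimws, which avoids the stringr dependency. The data below repeats the example above, with stringsAsFactors = FALSE set explicitly so the columns are character on any R version.

```r
# Same toy data as above: one empty string and one whitespace-only string
data1 <- data.frame(col1 = c("", letters[1:4]),
                    col2 = c(letters[1:4], " "),
                    stringsAsFactors = FALSE)

# Strip leading and trailing whitespace from every column
data1[] <- lapply(data1, trimws)

# Now both kinds of blank are empty strings and can be set to NA
data1[data1 == ""] <- NA

data1
```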

Replace blank with NA in R

What type of variable are we talking about? Numeric? Character?
A better formulated question makes it easier to give a better answer.

This could help:

DT[DT == ""] <- NA

Do not try so hard. R should be fun!

R mutate & replace string with empty pattern or empty string

Try replacing the blank values with NA :

library(dplyr)
library(stringr)

temp %>%
mutate(to_remove = na_if(to_remove, ''),
removed = str_remove(entry,to_remove))

