Change the Blank Cells to NA
I'm assuming you are talking about row 5 column "sex." It could be the case that in the data2.csv file, the cell contains a space and hence is not considered empty by R.
Also, I noticed that in row 5 columns "axles" and "door", the original values read from data2.csv are string "NA". You probably want to treat those as na.strings as well. To do this,
dat2 <- read.csv("data2.csv", header=TRUE, na.strings=c("","NA"))
EDIT:
I downloaded your data2.csv. Yes, there is a space in row 5 column "sex". So you want
na.strings=c(""," ","NA")
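As a quick check of the na.strings approach, here is a minimal sketch that reads a small inline CSV instead of data2.csv. The inline data and the id column are made up for illustration (only the column names sex, axles, and door come from the question), and the text argument of read.csv stands in for the file.

```r
# Hypothetical stand-in for data2.csv: row 2 has a single space in "sex",
# row 3 has an empty "sex", and row 2 has the string NA in "axles" and "door".
csv_text <- "id,sex,axles,door
1,M,2,4
2, ,NA,NA
3,,3,2
4,F,2,4"

dat2 <- read.csv(text = csv_text, header = TRUE,
                 na.strings = c("", " ", "NA"),
                 stringsAsFactors = FALSE)

# Check which entries of "sex" were read as missing
is.na(dat2$sex)

# Inspect the column types after import
str(dat2)
```

Dropping " " from na.strings in this sketch is an easy way to see the original problem: the single-space cell then comes through as a string rather than NA.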
Function to change blanks to NA
You can directly index fields that match a logical criterion. So you can just write:
df[is_empty(df)] = NA
Where is_empty is your comparison, e.g. df == "":
df[df == ""] = NA
But note that is.null(df) won't work, and would be weird anyway [1]. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.
[1] You'll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE because the NULL values are wrapped inside the list.
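One way to handle columns separately by type is sketched below. The data frame and its column names (name, score) are invented for illustration; the idea is simply to select the character columns and touch only those.

```r
# Toy data (invented for illustration): one character and one numeric column
df <- data.frame(name  = c("a", "", "c"),
                 score = c(1, NA, 3),
                 stringsAsFactors = FALSE)

# Logical vector marking which columns are character
is_chr <- vapply(df, is.character, logical(1))

# Replace "" with NA in those columns only; which() ignores existing NAs
df[is_chr] <- lapply(df[is_chr], function(x) {
  x[which(x == "")] <- NA_character_
  x
})

df
```

The numeric column is never compared against "", so it keeps its type and its existing NA.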
How to replace empty string with NA in R dataframe?
I'm not sure why df[df==""]<-NA would not have worked for OP. Let's take a sample data.frame and investigate options.
Option#1: Base-R
df[df==""]<-NA
df
# One Two Three Four
# 1 A A <NA> AAA
# 2 <NA> B BA <NA>
# 3 C <NA> CC CCC
Option#2: dplyr::mutate_all and na_if, or mutate_if if the data frame has multiple types of columns
library(dplyr)
mutate_all(df, list(~na_if(.,"")))
OR
# If the data frame also has columns of types other than character, then
df %>% mutate_if(is.character, list(~na_if(.,"")))
# One Two Three Four
# 1 A A <NA> AAA
# 2 <NA> B BA <NA>
# 3 C <NA> CC CCC
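In dplyr 1.0.0 and later, mutate_all and mutate_if are superseded by across(). A minimal sketch of the same replacement in that style follows, assuming dplyr >= 1.0.0 is installed. The data frame mirrors part of the toy data in this answer, plus a hypothetical numeric column n added to show that non-character columns are skipped.

```r
library(dplyr)

df <- data.frame(One = c("A", "", "C"),
                 Two = c("A", "B", ""),
                 n   = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# Apply na_if() to character columns only; other columns are left alone
df2 <- df %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))

df2
```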
Toy Data:
df <- data.frame(One=c("A","","C"),
Two=c("A","B",""),
Three=c("","BA","CC"),
Four=c("AAA","","CCC"),
stringsAsFactors = FALSE)
df
#   One Two Three Four
# 1   A   A        AAA
# 2       B    BA
# 3   C        CC  CCC
Replace blank with NA in R
What type of variable are we talking about? Numeric? Character?
A better formulated question makes it easier to give a better answer.
This could help:
DT[DT == ""] <- NA
Do not try so hard. R should be fun!
How to replace blank strings with NA?
To show that the code works:
data <- data.frame( col1= c("", letters[1:4]), col2=c(letters[1:4], ""))
is.na(data) <- data==''
data
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d <NA>
Suppose you have '' along with spaces ' '; then this won't work:
data <- data.frame( col1= c("", letters[1:4]), col2=c(letters[1:4], " "))
data1 <- data
is.na(data) <- data==''
data
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d
In such cases, you could use str_trim
library(stringr)
data1[] <- lapply(data1, str_trim)
is.na(data1) <- data1==''
data1
# col1 col2
#1 <NA> a
#2 a b
#3 b c
#4 c d
#5 d <NA>
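If you would rather not load stringr, base R's trimws() (available since R 3.2.0) does the same trimming. A minimal sketch on the same toy data, with stringsAsFactors = FALSE added so the columns are plain character vectors on any R version:

```r
data1 <- data.frame(col1 = c("", letters[1:4]),
                    col2 = c(letters[1:4], " "),
                    stringsAsFactors = FALSE)

# Strip leading/trailing whitespace from every column
data1[] <- lapply(data1, trimws)

# Mark the now-empty strings as missing
is.na(data1) <- data1 == ""

data1
```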
How to replace NA's with blank value?
If your column is of type double (numbers), you can't replace NAs (which are R's internal representation of missing values) with a character string. And "" IS a character string, even though you may think of it as empty.
So you need to choose: convert your whole column to type character, or leave the missing values as NA.
EDIT:
- If you really want to convert your numeric column to character, you can just use as.character(MYCOLUMN).
- But I think what you really want is to tell your exporting function how to treat NAs, which is easy, e.g. write.csv(df, na = ""). Also check the help page with ?write.csv.
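To illustrate the export route, here is a minimal sketch that writes a toy data frame (invented for this example) to a temporary file and reads the raw lines back. The column stays numeric in R; only the file shows blanks.

```r
# Toy data with a missing numeric value
df <- data.frame(id = 1:3, value = c(1.5, NA, 3))

# Write to a temporary file, rendering NA as an empty field
out <- tempfile(fileext = ".csv")
write.csv(df, out, na = "", row.names = FALSE)

# Inspect the raw text that was written
lines <- readLines(out)
lines
```

The row for id 2 ends right after the comma, so a spreadsheet will show an empty cell there.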
Fast way to replace all blanks with NA in R data.table
Here's probably the generic data.table way of doing this. I'm also going to use your regex, which handles several types of blanks (I haven't seen other answers doing this). You probably shouldn't run this over all your columns, but rather only over the factor or character ones, because other classes won't accept blank values.
For factors
indx <- which(sapply(data, is.factor))
for (j in indx) set(data, i = grep("^$|^ $", data[[j]]), j = j, value = NA_integer_)
For characters
indx2 <- which(sapply(data, is.character))
for (j in indx2) set(data, i = grep("^$|^ $", data[[j]]), j = j, value = NA_character_)
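For reference, a self-contained sketch of the character-column loop on toy data. The table dt and its columns are invented for illustration, and the example assumes the data.table package is installed. The regex is the one used above, which matches an empty string or a single space.

```r
library(data.table)

# Toy table: "name" and "city" contain empty strings and single spaces
dt <- data.table(id   = 1:4,
                 name = c("x", "", " ", "y"),
                 city = c(" ", "Oslo", "", "Rome"))

# Positions of the character columns
chr_cols <- which(sapply(dt, is.character))

# Overwrite blank cells by reference, one column at a time
for (j in chr_cols) {
  blank_rows <- grep("^$|^ $", dt[[j]])
  set(dt, i = blank_rows, j = j, value = NA_character_)
}

dt
```

Because set() modifies dt in place, there is no copy of the table per column, which is where the speed comes from on large data.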
How can I replace empty cells with NA in R?
The result you get from readHTMLTable is a list of two tables, so you need to work on each list element, which can be done using lapply:
table <- lapply(table, function(x){
x[x == ""] <- NA
return(x)
})
table$team_stats
Player PF Yds Ply Y/P TO FL 1stD Cmp Att Yds TD Int NY/A 1stD Att Yds TD Y/A 1stD Pen Yds 1stPy
1 Team Stats 442 6268 1021 6.1 25 14 350 339 483 4302 35 11 8.1 209 493 1966 14 4.0 124 109 922 17
2 Opp. Stats 253 4618 979 4.7 37 16 283 316 564 3235 15 21 5.3 178 372 1383 9 3.7 76 75 581 29
3 Lg Rank Offense 1 1 <NA> <NA> 2 10 1 <NA> 20 2 1 1 1 <NA> 13 10 12 13 <NA> <NA> <NA> <NA>
4 Lg Rank Defense 3 4 <NA> <NA> 11 9 9 <NA> 25 11 3 9 5 <NA> 1 3 3 8 <NA> <NA> <NA> <NA>
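The same pattern can be tried without scraping anything. The sketch below uses a hypothetical list of two small data frames (the names tables, team_stats, and other, and all values, are made up) in place of the readHTMLTable result.

```r
# Hypothetical stand-in for the list returned by readHTMLTable
tables <- list(
  team_stats = data.frame(Player = c("Team Stats", "Lg Rank Offense"),
                          Ply    = c("1021", ""),
                          stringsAsFactors = FALSE),
  other      = data.frame(a = c("", "x"),
                          stringsAsFactors = FALSE)
)

# Replace empty strings with NA in every data frame of the list
tables <- lapply(tables, function(x) {
  x[x == ""] <- NA
  x
})

tables$team_stats
```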
How to replace empty strings in a dataframe with NA (missing value) not NA string
A bare NA is logical. According to ?NA, "NA is a logical constant of length 1 which contains a missing value indicator."
The class can be checked:
class(NA)
#[1] "logical"
class(NA_character_)
#[1] "character"
and both of them are identified by standard functions such as is.na
is.na(NA)
#[1] TRUE
is.na(NA_character_)
#[1] TRUE
if_else is type sensitive, so instead of specifying a bare NA, which is logical, the missing value can be specified as NA_real_, NA_integer_, or NA_character_, depending on the type of the 'boat' column. Assuming that 'boat' is of class character, we need NA_character_:
titanic %>%
mutate(boat = if_else(boat=="", NA_character_ ,boat))
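A runnable version of the same call, as a sketch: the three-row titanic data frame below is a made-up stand-in for the real data, and dplyr is assumed to be installed.

```r
library(dplyr)

# Toy stand-in for the titanic data; "boat" is a character column
titanic <- data.frame(name = c("A", "B", "C"),
                      boat = c("2", "", "11"),
                      stringsAsFactors = FALSE)

# Typed missing value keeps both branches of if_else() as character
titanic2 <- titanic %>%
  mutate(boat = if_else(boat == "", NA_character_, boat))

titanic2$boat
class(titanic2$boat)
```

The result is a real missing value (is.na() returns TRUE for it), not the two-letter string "NA".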
Replace strings containing only blanks with NA
One dplyr possibility could be:
df %>%
mutate_all(~ ifelse(nchar(trimws(.)) == 0, NA_character_, .))
Q1 Q2
1 Test test Sample sample
2 Test <NA>
3 <NA> Sample
4 <NA> Sample
Or the same with base R:
df[] <- lapply(df, function(x) ifelse(nchar(trimws(x)) == 0, NA_character_, x))
Or:
df %>%
mutate_all(~ na_if(trimws(.), ""))