Replace missing values (NA) with blank (empty string)
Another alternative:
df <- sapply(df, as.character) # since your values are `factor`
df[is.na(df)] <- 0
If you want blanks instead of zeroes
> df <- sapply(df, as.character)
> df[is.na(df)] <- " "
> df
class Year1 Year2 Year3 Year4 Year5
[1,] "classA" "A" "A" "A" "A" "A"
[2,] " " " " " " " " " " " "
[3,] "classB" "B" "B" "B" "B" "B"
If you want a data.frame, then just use as.data.frame
> as.data.frame(df)
class Year1 Year2 Year3 Year4 Year5
1 classA A A A A A
2
3 classB B B B B B
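Note that sapply returns a character matrix, which is why the first output above has [1,] style row labels. Here is a minimal sketch of a variant that stays a data.frame throughout and uses a true empty string. The toy data is made up for illustration and only mimics the shape of the example above.

```r
# Toy data (hypothetical): factor columns with one fully missing row
df <- data.frame(class = factor(c("classA", NA, "classB")),
                 Year1 = factor(c("A", NA, "B")))

# df[] <- keeps the data.frame shape; lapply converts column by column
df[] <- lapply(df, as.character)

# Replace every NA with an empty string
df[is.na(df)] <- ""

str(df)
```

Because the columns are now character, this is suitable for display or export, not for further numeric work.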
How to replace empty strings in a dataframe with NA (missing value) not NA string
By specifying just NA, according to ?NA: "NA is a logical constant of length 1 which contains a missing value."
The class can be checked:
class(NA)
#[1] "logical"
class(NA_character_)
#[1] "character"
and both of them are identified by standard functions such as is.na:
is.na(NA)
#[1] TRUE
is.na(NA_character_)
#[1] TRUE
if_else is type sensitive, so instead of specifying NA, which is logical, the replacement can be specified as NA_real_, NA_integer_ or NA_character_, depending on the type of the 'boat' column. Assuming that 'boat' is of class character, we need NA_character_:
library(dplyr)
titanic %>%
  mutate(boat = if_else(boat == "", NA_character_, boat))
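For comparison, here is a base R sketch of the same replacement. The titanic data frame below is a made-up stand-in with a character boat column, not the original data set.

```r
# Hypothetical stand-in for the 'titanic' data; 'boat' is character
titanic <- data.frame(name = c("a", "b", "c"),
                      boat = c("2", "", "11"),
                      stringsAsFactors = FALSE)

# Logical indexing: only the empty strings are overwritten.
# Using NA_character_ makes the intended column type explicit.
titanic$boat[titanic$boat == ""] <- NA_character_

class(titanic$boat)
is.na(titanic$boat)
```

The column stays character, and is.na now flags the formerly empty entry.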
How to replace NA's with blank value?
If your column is of type double (numbers), you can't replace NAs (R's internal representation of missing values) with a character string. And "" IS a character string, even if you think of it as empty.
So you need to choose: convert your whole column to type character, or leave the missing values as NA.
EDIT:
- If you really want to convert your numeric column to character, you can just use as.character(MYCOLUMN).
- But I think what you really want is to tell your exporting function how to treat NAs, which is easy, e.g. write.csv(df, na = ""). Also check the help page with ?write.csv.
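A small sketch of both routes, using a made-up data frame. It illustrates that assigning "" into a numeric vector silently coerces it to character, whereas na = "" leaves the data alone and only changes how NA is written to the file.

```r
# Hypothetical data: a numeric column with one missing value
df <- data.frame(id = 1:3, score = c(1.5, NA, 3))

# Route 1: assigning "" coerces the whole vector to character
as_text <- df$score
as_text[is.na(as_text)] <- ""
class(as_text)

# Route 2: keep the column numeric and blank out NA only on export
out_file <- tempfile(fileext = ".csv")
write.csv(df, out_file, na = "", row.names = FALSE)
readLines(out_file)
class(df$score)
```

The temporary file is only there to keep the example self-contained; in practice you would pass your own file path.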
How to replace empty string with NA in R dataframe?
I'm not sure why df[df==""]<-NA would not have worked for OP. Let's take a sample data.frame and investigate options.
Option#1: Base-R
df[df==""]<-NA
df
# One Two Three Four
# 1 A A <NA> AAA
# 2 <NA> B BA <NA>
# 3 C <NA> CC CCC
Option#2: dplyr::mutate_all and na_if, or mutate_if if the data frame has multiple types of columns
library(dplyr)
mutate_all(df, list(~na_if(.,"")))
OR
# If the data frame also has columns of types other than character, then
df %>% mutate_if(is.character, list(~na_if(.,"")))
# One Two Three Four
# 1 A A <NA> AAA
# 2 <NA> B BA <NA>
# 3 C <NA> CC CCC
Toy Data:
df <- data.frame(One=c("A","","C"),
Two=c("A","B",""),
Three=c("","BA","CC"),
Four=c("AAA","","CCC"),
stringsAsFactors = FALSE)
df
# One Two Three Four
# 1 A A AAA
# 2 B BA
# 3 C CC CCC
Pandas Replace NaN with blank/empty string
import numpy as np
df1 = df.replace(np.nan, '', regex=True)
This might help. It will replace all NaNs with an empty string.
Replacing blank values (white space) with NaN in pandas
I think df.replace() does the job, since pandas 0.13:
import numpy as np
import pandas as pd

df = pd.DataFrame([
[-0.532681, 'foo', 0],
[1.490752, 'bar', 1],
[-1.387326, 'foo', 2],
[0.814772, 'baz', ' '],
[-0.222552, ' ', 4],
[-1.176781, 'qux', ' '],
], columns='A B C'.split(), index=pd.date_range('2000-01-01','2000-01-06'))
# replace field that's entirely space (or empty) with NaN
print(df.replace(r'^\s*$', np.nan, regex=True))
Produces:
A B C
2000-01-01 -0.532681 foo 0
2000-01-02 1.490752 bar 1
2000-01-03 -1.387326 foo 2
2000-01-04 0.814772 baz NaN
2000-01-05 -0.222552 NaN 4
2000-01-06 -1.176781 qux NaN
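As Temak pointed out, use df.replace(r'^\s+$', np.nan, regex=True) in case your valid data contains white spaces. Here \s+ requires at least one whitespace character, so empty strings are left untouched, and the ^ and $ anchors restrict the match to fields that consist entirely of whitespace.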
Replace NA with blank but keep class as numeric in R
First: vectors in R can't contain mixed classes. If you want numbers to be numeric, then missing values have to be NA. If you want missing values to be empty strings, then other values have to be characters.
However, it appears that you want to process data and output it for use with WOMBAT. In this case, the output is a plain ASCII text file. All that's required is that the text format be correct for WOMBAT - the class of the columns in R is not relevant if you are no longer in R.
So you need to read the WOMBAT manual regarding input format, then use write.table to create the file. Look at write.table for the options. In particular, you will probably need quote = FALSE and row.names = FALSE.
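As a rough sketch of that export step: the column layout below is invented, and the actual format has to come from the WOMBAT manual. The point is only that na, quote, row.names and col.names control what the file looks like, while the column stays numeric inside R.

```r
# Hypothetical data; the real layout must follow the WOMBAT manual
dat <- data.frame(animal = 1:3, weight = c(12.5, NA, 14))

out_file <- tempfile(fileext = ".txt")
write.table(dat, out_file,
            na = "",            # write missing values as empty fields
            quote = FALSE,      # no quotes around strings
            row.names = FALSE,  # drop R's row names
            col.names = FALSE)  # assumption: no header line wanted

readLines(out_file)
class(dat$weight)
```

With the default space separator an empty field is hard to tell apart from a missing column, so a different na value or sep may be needed depending on what WOMBAT expects.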
Change the Blank Cells to NA
I'm assuming you are talking about row 5 column "sex." It could be the case that in the data2.csv file, the cell contains a space and hence is not considered empty by R.
Also, I noticed that in row 5 columns "axles" and "door", the original values read from data2.csv are string "NA". You probably want to treat those as na.strings as well. To do this,
dat2 <- read.csv("data2.csv", header=T, na.strings=c("","NA"))
EDIT:
I downloaded your data2.csv. Yes, there is a space in row 5 column "sex". So you want
na.strings=c(""," ","NA")
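To see the effect without the original file, here is a small sketch that feeds an inline CSV to read.csv through its text argument. The contents are invented to mimic the situation described: one cell holding a single space and one holding the string NA.

```r
# Invented CSV: row 2 has a single space in 'sex' and "NA" in 'axles'
csv_text <- "sex,axles\nmale,2\n ,NA\nfemale,4\n"

# Without " " in na.strings the space survives as a value
dat_a <- read.csv(text = csv_text, na.strings = c("", "NA"))

# With " " included it is read as missing
dat_b <- read.csv(text = csv_text, na.strings = c("", " ", "NA"))

is.na(dat_a$sex)
is.na(dat_b$sex)
```

Only the second call marks the space-only cell as NA, which matches the fix suggested in the EDIT.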
Replace NA with empty string in a list
You can do this with lapply:
# Set up a sample list of matrices
dat = list(matrix(c(NA, "a", "b", NA), nrow=2),
matrix(c(rep("r", 8), NA), nrow=3))
dat
# [[1]]
# [,1] [,2]
# [1,] NA "b"
# [2,] "a" NA
#
# [[2]]
# [,1] [,2] [,3]
# [1,] "r" "r" "r"
# [2,] "r" "r" "r"
# [3,] "r" "r" NA
# Do conversion
dat <- lapply(dat, function(x) { x[is.na(x)] <- "" ; x })
dat
# [[1]]
# [,1] [,2]
# [1,] "" "b"
# [2,] "a" ""
#
# [[2]]
# [,1] [,2] [,3]
# [1,] "r" "r" "r"
# [2,] "r" "r" "r"
# [3,] "r" "r" ""
Function to change blanks to NA
You can directly index fields that match a logical criterion. So you can just write:
df[is_empty(df)] = NA
Where is_empty is your comparison, e.g. df == "":
df[df == ""] = NA
But note that is.null(df) won’t work, and would be weird anyway [1]. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.
[1] You’ll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE because the NULL values are wrapped inside the list.
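One way to handle the column types separately is to restrict the replacement to character columns, so numeric and other columns are never compared against "". A minimal sketch with a made-up data frame:

```r
# Hypothetical mixed-type data
df <- data.frame(name = c("a", "", "c"),
                 n    = c(1, NA, 3),
                 stringsAsFactors = FALSE)

# Pick out the character columns only
is_chr <- vapply(df, is.character, logical(1))

# Replace empty strings with NA in those columns, leaving the rest alone
df[is_chr] <- lapply(df[is_chr], function(col) {
  col[!is.na(col) & col == ""] <- NA_character_
  col
})

df
```

The same pattern can be repeated with a different test (for example is.factor) and a different rule per type.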