Replace All 0 Values to NA

Replace all 0 values to NA

Replacing all zeroes with NA:

df[df == 0] <- NA



Explanation

1. NULL is not what you want to replace zeroes with. As ?'NULL' says,

NULL represents the null object in R

which is unique and, I guess, can be seen as the most uninformative and empty object.[1] It is then not so surprising that the NULL simply vanishes here:

data.frame(x = c(1, NULL, 2))
#   x
# 1 1
# 2 2

That is, R does not reserve any space for this null object.[2] Meanwhile, looking at ?'NA' we see that

NA is a logical constant of length 1 which contains a missing value
indicator. NA can be coerced to any other vector type except raw.

Importantly, NA is of length 1 so that R reserves some space for it. E.g.,

data.frame(x = c(1, NA, 2))
#    x
# 1  1
# 2 NA
# 3  2

Also, the data frame structure requires all the columns to have the same number of elements so that there can be no "holes" (i.e., NULL values).
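
To make the length argument concrete, here is a minimal sketch in base R. The names with_null and with_na are just illustrative, not from the question:

```r
# NULL has length 0, so c() silently drops it.
# NA has length 1, so it keeps its slot in the vector.
length(NULL)                 # 0
length(NA)                   # 1

with_null <- c(1, NULL, 2)   # hypothetical example vectors
with_na   <- c(1, NA, 2)

length(with_null)            # 2
length(with_na)              # 3
```

This is why a column built from c(1, NULL, 2) ends up with two rows while c(1, NA, 2) gives three.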

Now you could replace zeroes by NULL in a data frame in the sense of completely removing all the rows containing at least one zero. When using, e.g., var, cov, or cor, that is actually equivalent to first replacing zeroes with NA and setting the value of use as "complete.obs". Typically, however, this is unsatisfactory as it leads to extra information loss.
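
A small sketch of that equivalence, assuming a made-up data frame in which zeros stand for missing measurements (the data and the names rows_kept, df_na, cor_a and cor_b are hypothetical):

```r
# Hypothetical data: rows 2, 3 and 4 each contain one zero.
df <- data.frame(x = c(1, 0, 3, 4, 5, 6, 7),
                 y = c(2, 4, 0, 8, 11, 9, 15),
                 z = c(5, 3, 2, 0, 1, 4, 2))

# Option A: drop every row that contains at least one zero.
rows_kept <- df[rowSums(df == 0) == 0, ]

# Option B: mark zeros as NA and let cor() skip the incomplete rows.
df_na <- df
df_na[df_na == 0] <- NA

cor_a <- cor(rows_kept)
cor_b <- cor(df_na, use = "complete.obs")

same <- isTRUE(all.equal(cor_a, cor_b))
same
```

Both routes use only the four complete rows, so three of the seven observations are discarded for every pair of columns.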

2. Instead of running some sort of loop, the solution relies on vectorization: df == 0 returns (try it) a logical matrix of the same size as df, with the entries TRUE and FALSE. We are also allowed to pass this matrix to the subsetting operator [...] (see ?'['). Lastly, while the result of df[df == 0] is perfectly intuitive, it may seem strange that df[df == 0] <- NA gives the desired effect. Assignment into a subset does not work this way for every kind of object, but data frames have their own replacement method that accepts a logical matrix as the index; see ?'[<-.data.frame'.
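
If you want to see the pieces separately, something like the following should do (toy data, and mask is just a name I picked):

```r
df <- data.frame(a = c(0, 1, 2), b = c(3, 0, 0))   # hypothetical data

mask <- df == 0      # logical matrix with the same dimensions as df
is.matrix(mask)      # TRUE
dim(mask)            # 3 2

zeros <- df[mask]    # extraction: the matching values as a plain vector
zeros

df[mask] <- NA       # replacement: handled by `[<-.data.frame`
df
```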


[1] The empty set in set theory feels somehow related.

[2] Another similarity with set theory: the empty set is a subset of every set, but we do not reserve any space for it.

How to replace 0 or missing value with NA in R

You could just use replace without any additional function / package:

data <- replace(data, data == 0, NA)

This assumes that data is your data frame.

To replace zeroes in a single column only, use the column name instead, e.g. if your data frame is df and the column is named data:

df$data <- replace(df$data, df$data == 0, NA)

Set 0 to NA in R

Is this what you need?

df <- data.frame(A=c(0, 3, "bla"), B=c("A", 0, "X"), C=c("x","B", 4)) #some fake data
df[df == 0] <- NA

How do I replace NA values with zeros in an R dataframe?

See my comment on @gsk3's answer. A simple example:

> m <- matrix(sample(c(NA, 1:10), 100, replace = TRUE), 10)
> d <- as.data.frame(m)
> d
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   4  3 NA  3  7  6  6 10  6   5
2   9  8  9  5 10 NA  2  1  7   2
3   1  1  6  3  6 NA  1  4  1   6
4  NA  4 NA  7 10  2 NA  4  1   8
5   1  2  4 NA  2  6  2  6  7   4
6  NA  3 NA NA 10  2  1 10  8   4
7   4  4  9 10  9  8  9  4 10  NA
8   5  8  3  2  1  4  5  9  4   7
9   3  9 10  1  9  9 10  5  3   3
10  4  2  2  5 NA  9  7  2  5   5

> d[is.na(d)] <- 0

> d
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   4  3  0  3  7  6  6 10  6   5
2   9  8  9  5 10  0  2  1  7   2
3   1  1  6  3  6  0  1  4  1   6
4   0  4  0  7 10  2  0  4  1   8
5   1  2  4  0  2  6  2  6  7   4
6   0  3  0  0 10  2  1 10  8   4
7   4  4  9 10  9  8  9  4 10   0
8   5  8  3  2  1  4  5  9  4   7
9   3  9 10  1  9  9 10  5  3   3
10  4  2  2  5  0  9  7  2  5   5

There's no need to apply apply. =)

EDIT

You should also take a look at the norm package. It has a lot of nice features for missing data analysis. =)

R - replacing the zero values in *just one row* of a matrix with NA

A[1, A[1,] == 0] <- NA
A
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,]   NA   NA    1    1   NA   NA   NA    1   NA
# [2,]    0    0    0    0    1    0    0    0    0
# [3,]    1    0    0    0    0    0    0    0    0
# [4,]    1    0    0    0    0    0    0    0    1
# [5,]    0    1    0    0    0    1    0    0    0
# [6,]    0    0    0    0    1    0    0    0    0
# [7,]    0    0    0    0    0    0    0    0    0
# [8,]    1    0    0    0    0    0    0    0    0
# [9,]    0    0    0    1    0    0    0    0    0

Python Pandas replace multiple columns zero to Nan

I think you need replace with a dict:

import numpy as np

cols = ["Weight","Height","BootSize","SuitSize","Type"]
# map both the string '0' and the number 0 to NaN in the selected columns
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})

Replace all zero columns with NA

With tidyverse, we can use if/else

library(tidyverse)
df %>%
  group_by(ID) %>%
  mutate_all(list(~ if(all(.==0)) NA_integer_ else .))
#  ID       A1    B1
#  <fct> <dbl> <dbl>
#1 A        NA     1
#2 A        NA     0
#3 A        NA     1
#4 A        NA     0
#5 B         1    NA
#6 B         0    NA
#7 B         1    NA

Or without any if/else

df %>%
  group_by(ID) %>%
  mutate_all(~ NA^all(!.) * .)
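
The NA^all(!.) * . expression may look cryptic, so here is a rough base R sketch of what it does on a single column. The vectors x_zero and x_mixed are made up for illustration:

```r
# In R, NA^0 is 1 while NA^1 stays NA.
# all(!x) is TRUE only when every value of x is zero.
NA^0                                  # 1
NA^1                                  # NA

x_zero  <- c(0, 0, 0)                 # hypothetical all-zero column
x_mixed <- c(1, 0, 1)                 # hypothetical mixed column

res_zero  <- NA^all(!x_zero)  * x_zero    # NA^TRUE  = NA, so everything becomes NA
res_mixed <- NA^all(!x_mixed) * x_mixed   # NA^FALSE = 1,  so the values are kept

res_zero
res_mixed
```

So the multiplier is NA for an all-zero column and 1 otherwise, which avoids the explicit if/else.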

or using data.table

library(data.table)
setDT(df)[, lapply(.SD, function(x) replace(x, all(x == 0), NA)), ID]

Or using base R

by(df[-1], df$ID, FUN = function(x)  x * (NA^ !colSums(!!x))[col(x)])

Replacing zeroes with NA for values preceding non-zero

There are three issues. First, writing:

df <- cbind(stock1,stock2,stock3,stock4)

doesn't create a data frame. It creates a matrix. This is an issue when you try to use lapply, which will operate over the columns of a data frame but over the elements of a matrix. Instead, you should write:

df <- data.frame(stock1,stock2,stock3,stock4)
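
You can check the difference yourself with a quick sketch like this one (shortened vectors, and the names m, n_matrix_calls and n_df_calls are mine):

```r
stock1 <- c(0.01, -0.02, 0.01)
stock2 <- c(0, 0, 0.02)

m  <- cbind(stock1, stock2)         # numeric matrix
df <- data.frame(stock1, stock2)    # data frame

is.matrix(m)                        # TRUE
is.data.frame(df)                   # TRUE

# lapply visits each of the 6 matrix elements, but only the 2 data frame columns.
n_matrix_calls <- length(lapply(m, identity))
n_df_calls     <- length(lapply(df, identity))
n_matrix_calls                      # 6
n_df_calls                          # 2
```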

Second, the function you're using in lapply needs to return the modified vector. Otherwise, the return value will be something unexpected (in this case, the assignment will return a single NA, and the lapply will return a data frame of one row of NAs instead of the data frame you want).
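
To illustrate the second point, compare two hypothetical helper functions, bad and good, which differ only in their last expression:

```r
# bad ends with the assignment, so its value is the assigned NA.
bad <- function(x) {
  x[1:2] <- NA
}

# good returns the modified vector explicitly.
good <- function(x) {
  x[1:2] <- NA
  x
}

v <- c(0, 0, 0.02, 0.04)    # hypothetical input
res_bad  <- bad(v)
res_good <- good(v)

print(res_bad)              # NA
print(res_good)             # NA NA 0.02 0.04
```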

Third, you need to take care with 1:n when n can be zero (i.e., when the first stock quote is non-zero) because 1:0 gives the sequence c(1,0) instead of an empty sequence. (This is arguably one of R's stupidest features.)
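
As a side note, base R's seq_len() is another way around this, since seq_len(0) really is empty. This is an alternative I'm adding here, not what the code below uses:

```r
1:0                             # 1 0: counts down instead of being empty
seq_len(0)                      # integer(0)

x <- c(0.01, -0.02, 0.01)       # first value is already non-zero
n <- min(which(x != 0)) - 1     # 0 leading zeros here

x[seq_len(n)] <- NA             # empty index, so nothing is replaced
x
```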

Therefore, the following will give you what you want:

stock1 <- c(0.01, -0.02, 0.01, 0.05, 0.04, -0.02)
stock2 <- c(0, 0, 0.02, 0.04, -0.03, 0.02)
stock3 <- c(0, 0, 0.02, 0, -0.01, 0.03)
stock4 <- c(0, -0.02, 0.01, 0, 0, -0.02)
df <- data.frame(stock1,stock2,stock3,stock4)

as.data.frame(lapply(df, function(x) {
  n <- min(which(x != 0)) - 1
  if (n > 0)
    x[1:n] <- NA
  x
}))

The output is as expected:

  stock1 stock2 stock3 stock4
1   0.01     NA     NA     NA
2  -0.02     NA     NA  -0.02
3   0.01   0.02   0.02   0.01
4   0.05   0.04   0.00   0.00
5   0.04  -0.03  -0.01   0.00
6  -0.02   0.02   0.03  -0.02

Update: As @Daniel_Fischer notes, there's a clever trick to avoid the 1:0 problem. You can instead write:

as.data.frame(lapply(df, function(x) {
  n <- min(which(x != 0)) - 1
  x[0:n] <- NA # use 0:n instead of 1:n
  x
}))

This takes advantage of the fact that R ignores zeros in this type of indexing operation, so:

x[0:0] <- NA # same as x[0] <- NA and does nothing
x[0:1] <- NA # same as x[1] <- NA
x[0:2] <- NA # same as x[1:2] <- NA, etc.

