Remove Na/Nan/Inf in a Matrix

How to remove rows with inf from a dataframe in R

To remove the rows with +/-Inf I'd suggest the following:

df <- df[!is.infinite(rowSums(df)),]

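or, almost equivalently,

df <- df[is.finite(rowSums(df)),]

The second option (the one with is.finite() and without the negation) also removes rows containing NA or NaN values, in case this has not already been done. It also catches rows that contain both Inf and -Inf, whose sum is NaN rather than infinite. Both versions assume that every column of df is numeric, since rowSums() fails otherwise.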
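For readers working in Python, the difference between the two tests can be sketched with the standard library alone. This is only an illustration under assumptions: the rows below are hypothetical, and a list of lists stands in for the dataframe.

```python
import math

# Hypothetical rows; each inner list is one row of numeric values.
rows = [
    [1.0, 2.0, 3.0],             # all finite
    [1.0, math.inf, 3.0],        # one +Inf: the row sum is inf
    [math.inf, -math.inf, 3.0],  # +Inf and -Inf: the row sum is nan
    [1.0, math.nan, 3.0],        # NaN: the row sum is nan
]

# Analogue of !is.infinite(rowSums(df)): only drops rows whose sum is +/-inf.
not_infinite = [r for r in rows if not math.isinf(sum(r))]

# Analogue of is.finite(rowSums(df)): drops every row whose sum is inf or nan.
finite = [r for r in rows if math.isfinite(sum(r))]

print(len(not_infinite), len(finite))
```

The "not infinite" filter lets the two rows with a NaN sum through, while the "finite" filter keeps only the first row.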

Python pandas: how to remove nan and -inf values

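Use pd.DataFrame.isin to mark the cells that hold NaN, inf or -inf, then pd.DataFrame.any with axis=1 to flag the rows that contain at least one such cell. Finally, use the negated boolean Series to slice the dataframe. (Recent pandas versions require axis to be passed as a keyword, so any(1) no longer works.)

df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)]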

             time    X    Y  X_t0     X_tp0   X_t1     X_tp1   X_t2     X_tp2
4        0.037389    3   10     3  0.333333    2.0  0.500000    1.0  1.000000
5        0.037393    4   10     4  0.250000    3.0  0.333333    2.0  0.500000
1030308  9.962213  256  268   256  0.000000  256.0  0.003906  255.0  0.003922
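A minimal, self-contained sketch of the same idea on a small frame. The column names and values here are made up for illustration and are not the asker's data. The np.isfinite variant is an alternative that assumes every column is numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not the frame from the question.
df = pd.DataFrame({
    "X": [1.0, np.nan, 3.0, 4.0],
    "Y": [10.0, 20.0, -np.inf, 40.0],
})

# Mark cells equal to NaN, +inf or -inf, then flag rows with any such cell.
bad_rows = df.isin([np.nan, np.inf, -np.inf]).any(axis=1)
clean = df[~bad_rows]

# Alternative for all-numeric frames: keep rows where every value is finite.
clean_alt = df[np.isfinite(df).all(axis=1)]

print(clean)
```

Both selections keep the rows with index 0 and 3.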

Delete the rows of matrix with Inf

A couple of suggestions regarding the problem.

1) It is better not to name a matrix object matrix, since that masks the function of the same name.

2) NaN and NA have a special meaning and are not character strings. Once they are quoted as "NaN", the built-in functions is.nan/is.na can no longer be applied to them, so we have to resort to ==/!=.

3) It is not clear why the individual lists are cbind-ed into a matrix.

Based on the input data, we can loop through the columns of 'matrix' with apply, then loop through each of the list elements with sapply and flag every element that is not finite or is the string "NaN". rowSums counts the flagged elements per row, and negating with ! converts a count of 0 to TRUE (every element in the row is finite) and any other count to FALSE. Use this logical index to subset the rows.

matrix[!rowSums(apply(matrix, 2, FUN = function(x)
    sapply(x, function(y) !(is.finite(y) & y != "NaN")))), ]
# v1 v2 v3
#[1,] 1 1 1
#[2,] 3 3 3
#[3,] 4 4 4
#[4,] 6 6 6

How to remove nan and inf values from a numpy matrix?

You can replace NaN and infinite values with 0 using the following mask:

output[~np.isfinite(output)] = 0

>>> output
array([[1.        , 0.5       , 1.        , 1.        , 0.        , 1.        ],
       [1.        , 1.        , 0.5       , 1.        , 0.46524064, 1.        ],
       [1.        , 1.        , 1.        , 0.        , 1.        , 1.        ]])
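The mask above replaces values in place rather than removing anything. If rows should actually be dropped, one possible sketch is shown next. The array is hypothetical, and dropping whole rows is an assumption about what "remove" means here.

```python
import numpy as np

# Hypothetical matrix containing NaN and inf.
output = np.array([
    [1.0, 0.5, np.nan],
    [1.0, np.inf, 0.5],
    [1.0, 1.0, 1.0],
])

# Replacement: work on a copy and overwrite non-finite entries with 0.
replaced = output.copy()
replaced[~np.isfinite(replaced)] = 0

# Removal: keep only the rows in which every entry is finite.
rows_kept = output[np.isfinite(output).all(axis=1)]

print(replaced)
print(rows_kept)
```

Replacement keeps the shape of the matrix, whereas removal shrinks it to the rows that were clean to begin with.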

How can I remove rows with inf from my dataframe in R?

To remove rows with Inf values you can use:

ICS_data[rowSums(sapply(ICS_data[-ncol(ICS_data)], is.infinite)) == 0, ]

Or using dplyr:

library(dplyr)
ICS_data %>% filter_at(-ncol(.), all_vars(is.finite(.)))

We can break the code into smaller steps to understand how it works.

Consider this data.

data <- data.frame(a = 1:4, b = 2:5, c = letters[1:4], stringsAsFactors = TRUE)
data$b[2] <- Inf
data
#  a   b c
#1 1   2 a
#2 2 Inf b
#3 3   4 c
#4 4   5 d

First we drop the last column from data. It is a factor, and we do not want to include it when looking for infinite values, so only the numeric columns remain.

data[-ncol(data)]

#  a   b
#1 1   2
#2 2 Inf
#3 3   4
#4 4   5

Next, using sapply, we find out which values in each column are infinite with is.infinite. This returns a matrix of TRUE/FALSE values.

sapply(data[-ncol(data)], is.infinite)

#         a     b
#[1,] FALSE FALSE
#[2,] FALSE  TRUE
#[3,] FALSE FALSE
#[4,] FALSE FALSE

We can sum these logical values using rowSums. Here TRUE is considered as 1 and FALSE as 0.

rowSums(sapply(data[-ncol(data)], is.infinite))
#[1] 0 1 0 0

This tells us that the second row has 1 infinite value and needs to be dropped. So we select the rows that have 0 infinite values.

data[rowSums(sapply(data[-ncol(data)], is.infinite)) == 0, ]

#  a b c
#1 1 2 a
#3 3 4 c
#4 4 5 d
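The same steps can be sketched in pandas for comparison. This is an analogue rather than a translation of the answer: the frame mirrors the example data, and selecting numeric columns with select_dtypes is an assumption standing in for "drop the last column".

```python
import numpy as np
import pandas as pd

# Same shape as the example data: two numeric columns and one label column.
data = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2.0, np.inf, 4.0, 5.0],
    "c": list("abcd"),
})

# Step 1: keep only the numeric columns.
numeric = data.select_dtypes(include="number")

# Step 2: flag infinite cells and count them per row (True counts as 1).
inf_count = np.isinf(numeric).sum(axis=1)

# Step 3: keep the rows that contain no infinite value.
result = data[inf_count == 0]

print(inf_count.tolist())
print(result)
```

As in the R walk-through, the per-row counts are 0, 1, 0, 0 and the second row is dropped.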

How to remove NaN and Inf values from data.table where all columns are character types in R

One way would be to find the indices of the rows containing "NaN" or "Inf":

unique(which(data == "NaN" | data == "Inf", arr.ind=T)[,1])
[1]  1  2  7  8  9 10 11

And then drop these rows by negating the index (data.table treats a leading ! in i as "all rows except these"):

data[!unique(which(data == "NaN" | data == "Inf", arr.ind=T)[,1])]
         date open high  low close volume
1: 2021-11-26 0.43 0.43 0.43  0.43      2
2: 2021-11-24 0.17 0.17 0.17  0.17     10
3: 2021-11-26 0.19 0.19 0.19  0.19     75
4: 2021-11-24 0.15 0.15 0.15  0.15      1
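For comparison, here is a standard-library Python sketch of filtering rows whose values are stored as text. It takes a different route from the answer above: instead of comparing against the strings "NaN" and "Inf", it parses each value and tests it for finiteness. The rows, the column subset and the helper is_finite_text are all hypothetical.

```python
import math

# Hypothetical rows with every value stored as a string.
rows = [
    {"date": "2021-11-24", "open": "NaN", "close": "NaN"},
    {"date": "2021-11-26", "open": "0.43", "close": "0.43"},
    {"date": "2021-11-24", "open": "Inf", "close": "0.17"},
    {"date": "2021-11-26", "open": "0.19", "close": "0.19"},
]
value_columns = ["open", "close"]


def is_finite_text(text):
    """Return True if the string parses to a finite float."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        # Unparseable text is treated as not finite.
        return False


# Keep a row only if every value column holds a finite number.
kept = [r for r in rows if all(is_finite_text(r[c]) for c in value_columns)]

print([r["open"] for r in kept])
```

Parsing also catches spellings such as "-inf" or "nan" that an exact string comparison would miss.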

Some benchmarks

Unit: milliseconds
     expr        min         lq       mean     median         uq       max neval cld
       me   4.513141   5.545293   7.068744   6.707279   8.356170  31.30188   100 a
 langtang   3.535727   3.646819   8.718629   6.318445   6.983275  59.76049   100 a
    akrun  51.169168 195.102026 208.889413 204.564707 216.545022 274.02575   100   c
     paul  11.235627 145.195062 146.721146 146.670909 148.432261 200.56718   100  b
  Macosso 370.269687 448.143027 468.074160 457.499264 497.636319 553.70491   100    d

Data and benchmark code:

data = structure(list(date = c("2021-11-24", "2021-11-24", "2021-11-26", 
"2021-11-24", "2021-11-26", "2021-11-24", "2021-11-24", "2021-11-26",
"2021-11-26", "2021-11-26", "2021-11-26"), open = c("NaN", "NaN",
"0.43", "0.17", "0.19", "0.15", "NaN", "NaN", "NaN", "NaN", "NaN"
), high = c("NaN", "NaN", "0.43", "0.17", "0.19", "0.15", "NaN",
"NaN", "NaN", "NaN", "NaN"), low = c("NaN", "NaN", "0.43", "0.17",
"0.19", "0.15", "NaN", "NaN", "NaN", "NaN", "NaN"), close = c("NaN",
"NaN", "0.43", "0.17", "0.19", "0.15", "NaN", "NaN", "NaN", "NaN",
"NaN"), volume = c(0L, 0L, 2L, 10L, 75L, 1L, 0L, 0L, 0L, 0L,
0L)), row.names = c(NA, -11L), class = c("data.table", "data.frame"
))
data = do.call("rbind", replicate(1000, data, simplify = FALSE))

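library(data.table)  # data.table syntax used by several expressions
library(dplyr)       # %>%, filter, across, pull, select
library(dtplyr)      # lazy_dt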

res = microbenchmark::microbenchmark(
me = data[!unique(which(data == "NaN", arr.ind=T)[,1])],

langtang = na.omit(cbind(data[, .(date,volume)], data[, lapply(.SD, as.numeric), .SDcols = 2:5])),

akrun = {
  data <- type.convert(data, as.is = TRUE)
  data[data[, Reduce(`&`, lapply(.SD, function(x)
    !is.nan(x) & is.finite(x))), .SDcols = -1]]
},

paul = data %>%
  lazy_dt() %>%
  filter(across(2:5, ~ .x != "NaN")) %>%
  as.data.table(),

Macosso = {
  data$Row <- row.names(data)
  rm_rw <- data[apply(data, 1,
    function(X) any(X == "NaN" | X == "Inf")), ] %>% pull(Row)
  data[!row.names(data) %in% rm_rw, ] %>% select(-Row)
}

)

Remove infinite values from a matrix in R

Use is.finite. I presume this is how you wish to "remove" those -Inf values:

m[!is.finite(m)] <- NA
colMeans(m, na.rm=TRUE)
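A rough NumPy analogue of this "mask, then average while skipping the masked entries" pattern, for readers on the Python side. The matrix m is hypothetical.

```python
import numpy as np

# Hypothetical matrix with one -inf entry.
m = np.array([
    [1.0, -np.inf],
    [3.0, 4.0],
    [5.0, 6.0],
])

# Replace non-finite entries with NaN (the counterpart of assigning NA).
masked = np.where(np.isfinite(m), m, np.nan)

# Column means that ignore NaN (the counterpart of colMeans(m, na.rm=TRUE)).
col_means = np.nanmean(masked, axis=0)

print(col_means)
```

The second column mean is computed from the two remaining finite values only.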

Replacing NaN/Inf/NA to 0 in a list object

We can replace the non-finite values with 0 by looping over the list elements with lapply:

lapply(dat, function(x) replace(x, !is.finite(x), 0))
#$a
#     [,1] [,2]
#[1,]    2    0
#[2,]    0    0
#
#$b
#     [,1] [,2]
#[1,]   -3    0
#[2,]    0    1
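The same "apply a replacement to every element of a named collection" pattern can be sketched in plain Python. The input dat is hypothetical (the original list is not shown in the answer), and zero_non_finite is an illustrative helper name.

```python
import math

# Hypothetical named collection of small matrices (lists of rows).
dat = {
    "a": [[2.0, math.nan], [math.inf, 0.0]],
    "b": [[-3.0, 0.0], [-math.inf, 1.0]],
}


def zero_non_finite(matrix):
    """Return a copy of the matrix with NaN and +/-inf replaced by 0.0."""
    return [[x if math.isfinite(x) else 0.0 for x in row] for row in matrix]


# Apply the replacement to every element, keeping the names.
cleaned = {name: zero_non_finite(matrix) for name, matrix in dat.items()}

print(cleaned)
```

The dictionary comprehension plays the role of lapply: it visits each named element and returns a new collection with the same names.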

