How to Replace NA Values in a Table for Selected Columns

How to replace NA values in a table for selected columns

You can do:

x[, 1:2][is.na(x[, 1:2])] <- 0

or better (IMHO), use the variable names:

x[c("a", "b")][is.na(x[c("a", "b")])] <- 0

In both cases, 1:2 or c("a", "b") can be replaced by a pre-defined vector.
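As a minimal sketch of the pre-defined vector variant, assuming a small made-up data frame (the data values and the name cols are illustrative only and do not come from the original question):

```r
# Hypothetical sample data; only columns a and b should be filled
x <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6), c = c(NA, "u", "v"))

# Pre-defined vector naming the columns to clean
cols <- c("a", "b")

# is.na() on the selected columns gives a logical matrix that indexes the NA cells
x[cols][is.na(x[cols])] <- 0

x
```

Column c keeps its NA because it is not listed in cols.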

Replace NAs in certain columns

Here is a dplyr solution with mutate_each. You can use the same selectors you use in select() to restrict it to the columns you want. Here I use ends_with.

data1 <- mutate_each(data1, funs=funs(ifelse(is.na(.),0,.)), ends_with("kb"))

Edit: recent versions of dplyr have soft-deprecated the *_each() functions in favor of across(). Here, the updated answer would be:

data1 <- mutate(data1, across(ends_with("kb"), ~ifelse(is.na(.x),0,.x)))
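To try the across() version end to end, here is a small sketch. The contents of data1 are invented for illustration (the original question's data is not shown); only the "kb" suffix comes from the answer above.

```r
library(dplyr)

# Hypothetical sample data; the column names are made up for illustration
data1 <- data.frame(
  id      = 1:3,
  size_kb = c(10, NA, 30),
  used_kb = c(NA, 2, 3),
  note    = c("x", NA, "z")
)

# Replace NA with 0 only in columns whose names end in "kb"
data1 <- mutate(data1, across(ends_with("kb"), ~ ifelse(is.na(.x), 0, .x)))

data1
```

The note column is not matched by ends_with("kb"), so its NA is left alone.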

Replace NA in selected columns with replace_na

Assuming you have other columns in the data as well but want to change just the three columns, you can do this:

library(dplyr)
library(tidyr)  # provides replace_na()

df %>% mutate_at(vars(hh_c22j, hh_r02a, hh_r02b), list(~ replace(., which(is.na(.)), 0)))

# Alternatively, using replace_na
df %>% mutate_at(vars(hh_c22j, hh_r02a, hh_r02b), list(~ replace_na(., 0)))

Just for future reference, a small reproducible sample would go a long way to get better answers!
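A small reproducible sample along those lines might look like the sketch below. The values in df and the column named other are invented; only the three hh_* column names come from the snippet above. It uses across(), which supersedes mutate_at() in current dplyr.

```r
library(dplyr)
library(tidyr)

# Invented sample data; "other" stands for any column that must stay untouched
df <- data.frame(
  hh_c22j = c(1, NA, 3),
  hh_r02a = c(NA, 2, NA),
  hh_r02b = c(4, 5, NA),
  other   = c(NA, "a", "b")
)

# Fill NA with 0 in the three selected columns only
result <- df %>%
  mutate(across(c(hh_c22j, hh_r02a, hh_r02b), ~ replace_na(.x, 0)))

result
```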

Replace only some NA values for selected rows and for only a column in R

df$type[!df$Asked & is.na(df$type)] <- "Replies" gets you to your desired table:

> type <-
+ c(NA, rep("Question",3), NA, NA, rep("Answer",4), rep(NA, 3), rep("Answer",2),
+ NA, "Question", NA, rep("Answer",2), NA,NA)
> Asked <- c(
+ T, rep(F, 9), T, rep(F, 4), T, rep(F, 4), T,F
+ )
> df <- data.frame(title = 1:22, comments = 1:22, type, Asked)
> df$type[!df$Asked & is.na(df$type)] <- "Replies"
> df
   title comments     type Asked
1      1        1     <NA>  TRUE
2      2        2 Question FALSE
3      3        3 Question FALSE
4      4        4 Question FALSE
5      5        5  Replies FALSE
6      6        6  Replies FALSE
7      7        7   Answer FALSE
8      8        8   Answer FALSE
9      9        9   Answer FALSE
10    10       10   Answer FALSE
11    11       11     <NA>  TRUE
12    12       12  Replies FALSE
13    13       13  Replies FALSE
14    14       14   Answer FALSE
15    15       15   Answer FALSE
16    16       16     <NA>  TRUE
17    17       17 Question FALSE
18    18       18  Replies FALSE
19    19       19   Answer FALSE
20    20       20   Answer FALSE
21    21       21     <NA>  TRUE
22    22       22  Replies FALSE

How to replace NA's in numerical columns with the median of those columns?

Here are several approaches. The test data frame DF is defined in (1) and used in the other approaches as well.

1) dplyr - across/coalesce

library(dplyr)

# test data
DF <- data.frame(a = c(NA, NA, 1, 2), b = 1:4, c = letters[1:4])

DF %>%
  mutate(across(where(is.numeric), ~ coalesce(., median(., na.rm = TRUE))))

giving:

    a b c
1 1.5 1 a
2 1.5 2 b
3 1.0 3 c
4 2.0 4 d

2) dplyr/tidyr - across/replace_na

library(dplyr)
library(tidyr)

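# as.double() guards against recent tidyr versions, where replace_na() casts the
# replacement to the column type and a median of 2.5 cannot be cast to integer b
DF %>%
  mutate(across(where(is.numeric), ~ replace_na(as.double(.), median(., na.rm = TRUE))))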

3) zoo - na.aggregate

library(zoo)

ok <- sapply(DF, is.numeric)
replace(DF, ok, na.aggregate(DF[ok], FUN = median))

4) Base R

na.median <- function(x) replace(x, is.na(x), median(x, na.rm = TRUE))   
ok <- sapply(DF, is.numeric)
replace(DF, ok, lapply(DF[ok], na.median))

5) Base R - S3

na.median <- function(x, ...) UseMethod("na.median")
na.median.default <- identity
na.median.numeric <- function(x, ...) {
  replace(x, is.na(x), median(x, na.rm = TRUE))
}

replace(DF, TRUE, lapply(DF, na.median))
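To see the dispatch in (5) at work, the following self-contained sketch repeats the definitions and calls na.median directly on a numeric and a character vector. The input vectors and the names num and chr are made up for illustration.

```r
na.median <- function(x, ...) UseMethod("na.median")
na.median.default <- identity
na.median.numeric <- function(x, ...) {
  replace(x, is.na(x), median(x, na.rm = TRUE))
}

num <- na.median(c(NA, NA, 1, 2))  # dispatches to na.median.numeric
chr <- na.median(c("a", NA))       # dispatches to na.median.default

num
chr
```

The numeric vector has its NAs filled with the median of the observed values, while the character vector is returned unchanged. This is why (5) can apply na.median to every column without first selecting the numeric ones.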

6) magrittr

We first make a copy of DF to avoid clobbering it, then use magrittr's %<>% operator. Although it is not recommended, you can use DF directly in the last line if you are OK with overwriting it. na.median is from (4).

library(magrittr)

DF2 <- DF
DF2[sapply(DF2, is.numeric)] %<>% lapply(na.median)

7) collapse - tfmv

tfmv, or its synonym ftransformv, provides a compact expression. This uses na.median from (4).

library(collapse)

tfmv(DF, is.numeric, na.median)

