How to Remove NAs with the tidyr::unite Function

How do I remove NAs with the tidyr::unite function?

You could use regex to remove the NAs after they are created:

library(dplyr)
library(tidyr)

df <- tibble(a = paste0("A.", rep(1, 3)),
             b = " ",
             c = c("C.1", "C.3", " "),
             d = "D.4", e = "E.5")

cols <- letters[2:4]
# Turn the blank placeholders into real NAs
df[, cols] <- gsub(" ", NA_character_, as.matrix(df[, cols]))
tidyr::unite(df, new, all_of(cols), sep = ",") %>%
  # Strip each "NA" token together with the comma that follows it
  dplyr::mutate(new = stringr::str_replace_all(new, 'NA,?', ''))

Output:

# A tibble: 3 x 3
  a     new     e
  <chr> <chr>   <chr>
1 A.1   C.1,D.4 E.5
2 A.1   C.3,D.4 E.5
3 A.1   D.4     E.5
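One caveat with the pattern 'NA,?': when the NA sits in the last united column, the comma before it is left behind (for example "C.1,D.4,NA" becomes "C.1,D.4,"). A minimal base R sketch of a cleanup that handles every position is below. The helper name drop_na_tokens and the sample strings are made up for illustration, and the sketch assumes no genuine value is exactly the text "NA".

```r
# Hypothetical helper: split each united string on the separator,
# drop the tokens that are exactly "NA", and glue the rest back together.
drop_na_tokens <- function(x, sep = ",") {
  vapply(
    strsplit(x, sep, fixed = TRUE),
    function(parts) paste(parts[parts != "NA"], collapse = sep),
    character(1)
  )
}

# Made-up strings, shaped like the output of unite(..., sep = ",")
united <- c("NA,C.1,D.4", "C.1,NA,D.4", "C.1,D.4,NA", "NA,NA,NA")
drop_na_tokens(united)
# [1] "C.1,D.4" "C.1,D.4" "C.1,D.4" ""
```

Splitting on the separator avoids having to reason about where the commas end up, at the cost of an extra pass over each string.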

How to remove missing values (NA) when uniting columns?

You have a couple of problems:

1) The NAs are not real NAs; they are the string "NA" (check with is.na(df$Parent2)).

2) Your columns are factors.
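To see the first problem concretely, here is a minimal base R sketch. The vector parent2 is made up for illustration (it is not the asker's data) and assumes the missing entries were read in as the text "NA".

```r
# Hypothetical column: one entry is the string "NA", one is a real missing value
parent2 <- c("Mother", "NA", NA)

is.na(parent2)
# [1] FALSE FALSE  TRUE

# Replace the "NA" strings with real missing values
parent2[parent2 %in% "NA"] <- NA_character_

is.na(parent2)
# [1] FALSE  TRUE  TRUE
```

Only after this conversion can an option such as na.rm = TRUE recognise the entries as missing.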

When constructing the data frame, use stringsAsFactors = FALSE:

df <- data.frame(Name, Postalcode, Parent, Parent2, Parent3, Parent4,
                 Parent5, stringsAsFactors = FALSE)

Then replace the "NA" strings with real NAs and use unite:

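library(dplyr)
df %>%
  # Convert the string "NA" to a real NA in every character column
  # (recent dplyr versions no longer accept a whole data frame in na_if)
  mutate(across(where(is.character), ~ na_if(.x, "NA"))) %>%
  tidyr::unite(Parent_full, Parent:Parent5, sep = "|", na.rm = TRUE)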

#    Name Postalcode              Parent_full
#1   Paul       4732                   Mother
#2 Edward       9045            Father|Mother
#3   Mary       3476 Mother|Father|Stepmother

If the data is already loaded, convert the factor columns to character with mutate_if first:

df %>%
  mutate_if(is.factor, as.character) %>%
  mutate(across(where(is.character), ~ na_if(.x, "NA"))) %>%
  tidyr::unite(Parent_full, Parent:Parent5, sep = "|", na.rm = TRUE)

R - Unite without NA values

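We can use unite with na.rm = TRUE:

library(tidyverse)
mtcars %>%
  rownames_to_column('rn') %>%
  # na.rm only drops NAs from character columns, so convert first
  mutate_at(vars(starts_with("NA")), as.character) %>%
  unite(Var1, NA_1, NA_2, na.rm = TRUE) %>%
  # Rows where both inputs were NA come out as "", so turn those back into NA
  mutate(Var1 = na_if(Var1, "")) %>%
  column_to_rownames('rn')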
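The snippet above relies on NA_1 and NA_2 columns that were added to mtcars in the question and are not shown here. As a self-contained sketch of the same idea, the following uses a small made-up data frame (toy and its values are assumptions for illustration only):

```r
library(tidyr)

# Made-up data: two character columns with missing values
toy <- data.frame(
  id   = 1:3,
  NA_1 = c("a", NA, NA),
  NA_2 = c("b", "c", NA),
  stringsAsFactors = FALSE
)

# na.rm = TRUE drops the missing pieces before pasting with the default sep "_"
res <- unite(toy, Var1, NA_1, NA_2, na.rm = TRUE)
res$Var1
# [1] "a_b" "c"   ""
```

The empty string in the last row is what the na_if(Var1, "") step converts back to NA.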

Or another option is coalesce instead of unite

mtcars %>%
  mutate(Var1 = str_c(coalesce(NA_1, NA_2), coalesce(NA_2, NA_1), sep = "_"))

Or another option is

mtcars %>%
  mutate_at(vars(starts_with("NA")), list(~ replace_na(., ''))) %>%
  mutate(Var1 = str_remove(na_if(str_c(NA_1, NA_2, sep = "_"), '_'), '^_|_$')) %>%
  # Drop both source columns
  select(-NA_1, -NA_2)

How to remove NAs with the conditions in R?

df_A %>%
  group_by(product_name) %>%
  # Keep rows that have an id, plus rows where both id and clicks are missing
  filter(!is.na(id) |
           is.na(id) & is.na(clicks))
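To make the condition easier to follow, here is the same logical test written in base R against a small made-up df_A (the column values are assumptions; only the column names come from the snippet above):

```r
# Made-up data with the three columns the filter refers to
df_A <- data.frame(
  product_name = c("p1", "p1", "p2"),
  id           = c(1, NA, NA),
  clicks       = c(5, NA, 7)
)

# Same condition as in the filter() call
keep <- !is.na(df_A$id) | (is.na(df_A$id) & is.na(df_A$clicks))
keep
# [1]  TRUE  TRUE FALSE

kept <- df_A[keep, ]
```

Only the third row is dropped: its id is missing but clicks is not.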

Using the unite function in R and removing duplicated values

I am not sure whether deduplicating is possible with unite; however, you can use apply row-wise:

input$ALL <- apply(input[-1], 1, function(x) toString(na.omit(unique(x))))

Or a tidyverse way could be using pmap

library(tidyverse)

input %>%
  mutate(ALL = pmap_chr(select(., -id), ~toString(unique(na.omit(c(...))))))

#   id    `2017` `2018` `2019` ALL
#   <chr> <chr>  <chr>  <chr>  <chr>
# 1 aa    tv     tv     NA     tv
# 2 ss    NA     web    web    web
# 3 dd    NA     NA     book   book
# 4 qq    web    NA     tv     web, tv

Or getting the data in long format and then joining

input %>%
  pivot_longer(cols = -id, values_drop_na = TRUE) %>%
  group_by(id) %>%
  summarise(ALL = toString(unique(value))) %>%
  # Join the collapsed values back onto the original wide data
  left_join(input, by = "id")
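None of these snippets define input. The sketch below rebuilds it from the printed output shown earlier (so the values are a reconstruction, not the asker's original code) and runs the base R apply version end to end:

```r
# Reconstructed from the printed output; check.names = FALSE keeps the year names
input <- data.frame(
  id     = c("aa", "ss", "dd", "qq"),
  `2017` = c("tv", NA, NA, "web"),
  `2018` = c("tv", "web", NA, NA),
  `2019` = c(NA, "web", "book", "tv"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Row by row: drop duplicates, drop NAs, join what is left with ", "
input$ALL <- apply(input[-1], 1, function(x) toString(na.omit(unique(x))))
input$ALL
# [1] "tv"      "web"     "book"    "web, tv"
```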

Collapsing columns and removing NAs

Using dplyr::coalesce we can do the following:

df %>%
  mutate(Comb = coalesce(w, x, y, z)) %>%
  select(A, Comb)

which gives the following output:

      A  Comb
  <dbl> <dbl>
1  0.23     1
2  0.12     2
3  0.45     2
4  0.89     3
5  0.12     4

Combine column to remove NA's

A dplyr::coalesce-based solution could be:

data %>%
  mutate(mycol = coalesce(x, y, z)) %>%
  select(a, mycol)
#   a mycol
# 1 A     1
# 2 B     2
# 3 C     3
# 4 D     4
# 5 E     5

Data

data <- data.frame('a' = c('A','B','C','D','E'),
                   'x' = c(1,2,NA,NA,NA),
                   'y' = c(NA,NA,3,NA,NA),
                   'z' = c(NA,NA,NA,4,5))
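For readers unsure what coalesce does here: for each row it returns the first value that is not NA across the vectors it is given. A rough base R equivalent is sketched below; coalesce_base is a hypothetical helper written for illustration and is not part of dplyr.

```r
# Hypothetical stand-in for dplyr::coalesce: walk the vectors left to right
# and fill each NA with the value from the next vector.
coalesce_base <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

x <- c(1, 2, NA, NA, NA)
y <- c(NA, NA, 3, NA, NA)
z <- c(NA, NA, NA, 4, 5)

coalesce_base(x, y, z)
# [1] 1 2 3 4 5
```

This only suits columns of the same type where at most the first non-missing value per row matters, which is the situation in the data above.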

Dealing with Spaces and NA's when Uniting Multiple Columns with Tidyr

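According to getAnywhere("unite_.data.frame"), unite calls do.call("paste", c(data[from], list(sep = sep))) under the hood, and paste, as far as I know, has no option to omit NAs, so that has to be implemented manually. (This describes the tidyr version current when the answer was written; the na.rm argument used in the answers above was added later.)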

Nevertheless, you can use a regular expression method as follows with gsub from base R to clean up the result column:

gsub("^\\s;\\s|;\\s{2}", "", Days$BestDays)
# [1] "Monday" "Tuesday; Wednesday"
# [3] "Tuesday; Wednesday" "Monday; Wednesday"
# [5] "Monday; Tuesday; Thursday; Friday"

The pattern has two alternatives. ^\\s;\\s handles a string that starts with a blank value: it removes the blank together with the "; " that follows it. ;\\s{2} removes a "; " separator followed by a blank value, which covers blanks in the middle and at the end of the string.
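The Days data frame is not shown, so here is a small sketch of the pattern at work on made-up strings. It assumes, as the pattern implies, that empty cells hold a single space and that the columns were united with sep = "; ".

```r
# Made-up united strings: a single space stands in for an empty cell
united <- c(
  paste("Monday", " ", " ", sep = "; "),          # blanks in the middle and at the end
  paste(" ", "Tuesday", "Wednesday", sep = "; "), # blank at the start
  paste("Monday", " ", "Wednesday", sep = "; ")   # blank in the middle
)

cleaned <- gsub("^\\s;\\s|;\\s{2}", "", united)
cleaned
# [1] "Monday"             "Tuesday; Wednesday" "Monday; Wednesday"
```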


