Combine/Merge Columns While Avoiding NA

Combine/merge columns while avoiding NA?

Here's one approach:

> transform(test3, C=rowSums(test3, na.rm=TRUE))
A B C
1 1 NA 1
2 2 NA 2
3 NA 3 3
4 4 NA 4
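The answer does not show how test3 was built, so the sketch below reconstructs it from the printed output (an assumption). It also illustrates a caveat of this approach: with na.rm=TRUE, a row made up entirely of NA sums to 0 rather than NA. The helper row_sum_or_na is hypothetical and not part of the original answer.

```r
# Assumed input: reconstructed from the printed output above.
test3 <- data.frame(A = c(1, 2, NA, 4), B = c(NA, NA, 3, NA))

# rowSums() with na.rm = TRUE skips missing values when summing.
res <- transform(test3, C = rowSums(test3, na.rm = TRUE))

# Caveat: a row where every value is NA sums to 0, not NA.
all_na <- data.frame(A = NA_real_, B = NA_real_)
zero_sum <- rowSums(all_na, na.rm = TRUE)

# Hypothetical helper: keep NA when a row has no observed value at all.
row_sum_or_na <- function(x) {
  s <- rowSums(x, na.rm = TRUE)
  s[rowSums(!is.na(x)) == 0] <- NA
  s
}
guarded <- row_sum_or_na(rbind(test3, all_na))
```

If rows with no data at all can occur, the guarded version is probably the safer choice, because a silent 0 is hard to tell apart from a real value.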

Now suppose the data.frame test3 has an additional column AA. You can use the [ operator to subset only the columns you are interested in:

> set.seed(1) # adding a new column
> test3$AA <- rnorm(4, 10, 1)
> test3 # this is what test3 looks like
A B AA
1 1 NA 9.373546
2 2 NA 10.183643
3 NA 3 9.164371
4 4 NA 11.595281
> transform(test3, C=rowSums(test3[, c("A", "B")], na.rm=TRUE))
A B AA C
1 1 NA 9.373546 1
2 2 NA 10.183643 2
3 NA 3 9.164371 3
4 4 NA 11.595281 4

Combine column to remove NA's

A solution based on dplyr::coalesce could look like this:

library(dplyr)

# coalesce() takes the first non-NA value across x, y and z for each row
data %>% mutate(mycol = coalesce(x, y, z)) %>%
  select(a, mycol)
# a mycol
# 1 A 1
# 2 B 2
# 3 C 3
# 4 D 4
# 5 E 5

Data

data <- data.frame('a' = c('A','B','C','D','E'),
'x' = c(1,2,NA,NA,NA),
'y' = c(NA,NA,3,NA,NA),
'z' = c(NA,NA,NA,4,5))
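To make clear what coalesce does here, the following is a minimal base R sketch of the same "first non-NA value per row" idea. The helper first_non_na is hypothetical and only approximates dplyr::coalesce for plain vectors of equal length; it is not the dplyr implementation.

```r
# Same data as above, repeated so the block runs on its own.
data <- data.frame(a = c("A", "B", "C", "D", "E"),
                   x = c(1, 2, NA, NA, NA),
                   y = c(NA, NA, 3, NA, NA),
                   z = c(NA, NA, NA, 4, 5))

# Hypothetical helper: walk through the vectors left to right and
# fill each still-missing position from the next vector.
first_non_na <- function(...) {
  Reduce(function(cur, nxt) ifelse(is.na(cur), nxt, cur), list(...))
}

data$mycol <- first_non_na(data$x, data$y, data$z)
result <- data[, c("a", "mycol")]
```

Earlier arguments take priority, so if two columns both hold a value in the same row, the leftmost one is kept.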

Combining more than 2 columns by removing NA's in R

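You can use apply for this, provided every row has the same number of non-NA values (otherwise apply returns a list rather than a matrix). If df is your data frame: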

df2 <- apply(df,1,function(x) x[!is.na(x)])
df3 <- data.frame(t(df2))
colnames(df3) <- colnames(df)[1:ncol(df3)]

Output:

#      col1 col2
# 1 13
# 10 18
# 7 15
# 4 16
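When rows can hold different numbers of non-NA values, one possible variation is to pad the shorter rows with NA. This is a sketch under assumptions: the data frame below and the helper compact_rows are invented for illustration, and all columns are assumed to share one type.

```r
# Hypothetical input: the third row has only one non-NA value.
df <- data.frame(col1 = c(1, NA, 7),
                 col2 = c(NA, 10, NA),
                 col3 = c(13, 18, NA))

# Hypothetical helper: shift the non-NA values of each row to the left.
compact_rows <- function(df) {
  vals <- lapply(seq_len(nrow(df)), function(i) {
    r <- unlist(df[i, ], use.names = FALSE)
    r[!is.na(r)]
  })
  width <- max(lengths(vals))
  # Indexing past the end of a vector yields NA, which pads short rows.
  out <- do.call(rbind, lapply(vals, function(v) v[seq_len(width)]))
  out <- as.data.frame(out)
  names(out) <- names(df)[seq_len(width)]
  out
}

res <- compact_rows(df)
```

Here res keeps two columns, named col1 and col2, and the third row ends with an NA in col2.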

How to combine multiple character columns into one columns and remove NA without knowing column numbers

Here is a base R method

input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse=" "))
input$ALL
#[1] "tv" "web" "book" "web tv"
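The answer does not show input, so the following sketch uses a made-up data frame (column names id, c1, c2, c3 are assumptions) that happens to reproduce the output above. Because input[-1] selects every column except the first, the number of character columns does not need to be known in advance.

```r
# Hypothetical input: an id column followed by any number of character columns.
input <- data.frame(id = 1:4,
                    c1 = c("tv", NA, NA, "web"),
                    c2 = c(NA, "web", NA, "tv"),
                    c3 = c(NA, NA, "book", NA),
                    stringsAsFactors = FALSE)

# For each row: drop the NAs, then paste what is left with a space.
input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse = " "))
```

A row with no values at all would give an empty string "" rather than NA.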

Merge two columns containing NA values in complementing rows

We can try using the coalesce function from the dplyr package:

library(dplyr)

df$merged <- coalesce(df$x, df$y)     # x where present, otherwise y
df$flag <- ifelse(is.na(df$y), 0, 1)  # 1 where y is present, 0 otherwise
df

x y merged flag
1 1 NA 1 0
2 NA 2 2 1
3 NA 3 3 1
4 4 NA 4 0
5 5 NA 5 0
6 NA 6 6 1

Combining columns, while ignoring duplicates and NAs

If 'df1' is the output, we can remove the 'NA' that follows a '-' with sub:

df1 %>% 
mutate(Var3 = sub("-NA", "", Var3))
# A tibble: 8 x 4
# id Var1 Var2 Var3
# <chr> <chr> <chr> <chr>
#1 A A1 A1 A1
#2 B F2 A2 A2-F2
#3 C <NA> A3 A3
#4 D A4-E9 A4 A4-E9
#5 E E5 A5 A5-E5
#6 F <NA> <NA> NA
#7 G B2-R4 A3-B2 A3-B2-R4
#8 H B3-B4 E1-G5 B3-B4-E1-G5

We can also do this slightly differently with the tidyverse: gather the data into 'long' format, split the 'value' column with separate_rows, group by 'id', summarise 'Var3' by pasting the sorted unique elements of 'value', replace any empty string with NA, and left_join the result with the original dataset 'df':

library(tidyverse)
gather(df, key, value, -id) %>%
separate_rows(value) %>%
group_by(id) %>%
summarise(Var3 = paste(sort(unique(value)), collapse='-')) %>%
mutate(Var3 = replace(Var3, Var3=='', NA)) %>%
left_join(df, .)
# id Var1 Var2 Var3
#1 A A1 A1 A1
#2 B F2 A2 A2-F2
#3 C <NA> A3 A3
#4 D A4-E9 A4 A4-E9
#5 E E5 A5 A5-E5
#6 F <NA> <NA> <NA>
#7 G B2-R4 A3-B2 A3-B2-R4
#8 H B3-B4 E1-G5 B3-B4-E1-G5

NOTE: The %>% pipe spreads even simple code across multiple lines, but if required, all of those statements can be put on a single line and called a one-liner.


Here is a one-liner

library(data.table)
setDT(df)[, Var3 := paste(sort(unique(unlist(strsplit(unlist(.SD),"-")))), collapse="-"), id]
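For readers without data.table, the same split / unique / sort / paste idea can be sketched in base R. The data frame below is a shortened, assumed version of the one printed earlier, and combine_row is a hypothetical helper. Unlike the one-liner above, this sketch returns NA (not an empty string) when a row has no values.

```r
# Assumed input: a subset of the rows shown in the output above.
df <- data.frame(id   = c("A", "B", "C", "F", "G"),
                 Var1 = c("A1", "F2", NA, NA, "B2-R4"),
                 Var2 = c("A1", "A2", "A3", NA, "A3-B2"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split on "-", drop NAs and duplicates, sort, re-join.
combine_row <- function(...) {
  parts <- unlist(strsplit(c(...), "-", fixed = TRUE))
  parts <- sort(unique(parts[!is.na(parts)]))
  if (length(parts) == 0) NA_character_ else paste(parts, collapse = "-")
}

# mapply() names its result after the first argument, so strip the names.
df$Var3 <- unname(mapply(combine_row, df$Var1, df$Var2))
```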

How to join (merge) columns in a data frame, replacing NA values

The package tidyr has the function unite which does the trick:

#Sample Data
#dput(d)
d<-structure(list(A = c(1.4, -1.17, -0.85, -0.74, 0.58, 1.29), B = c("Fria Moderada",
"Fria Debil", NA, NA, NA, NA), C = c(NA, NA, NA, NA, "Calida Debil",
"Calida Moderada"), D = c(NA, NA, "Neutro", "Neutro", NA, NA)), .Names = c("A",
"B", "C", "D"), class = "data.frame", row.names = c(NA, -6L))

library(tidyr)
d[is.na(d)] <- "" # replace the NAs with empty strings
unite(d, newcol, c(B, C, D), sep="")
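As a possible alternative, reasonably recent versions of tidyr give unite an na.rm argument, which drops missing values while uniting and so avoids overwriting the NAs in d first. This is a sketch assuming such a tidyr version is installed and that the columns are character (not factor); the data is a shortened, assumed version of d.

```r
library(tidyr)

# Shortened, assumed version of the sample data above.
d <- data.frame(A = c(1.4, -0.85, 0.58),
                B = c("Fria Moderada", NA, NA),
                C = c(NA, NA, "Calida Debil"),
                D = c(NA, "Neutro", NA),
                stringsAsFactors = FALSE)

# na.rm = TRUE removes the NAs before the values are pasted together.
res <- unite(d, newcol, B, C, D, sep = "", na.rm = TRUE)
```

The original d is left untouched, so its NAs remain available for other steps.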

