Combine/Merge Columns While Avoiding NA

Combine/merge columns while avoiding NA?

Here's one approach:

> transform(test3, C=rowSums(test3, na.rm=TRUE))
   A  B C
1  1 NA 1
2  2 NA 2
3 NA  3 3
4  4 NA 4
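
For reference, the output above is consistent with test3 having been built as in the sketch below; the construction is an assumption inferred from the printed result, since the original post does not show it. The sketch also illustrates two caveats of the rowSums approach, using a hypothetical data frame called edge: a row with values in both columns gets their sum rather than one of the values, and a row that is entirely NA becomes 0 rather than NA.

```r
# Assumed construction of test3, inferred from the printed output above
test3 <- data.frame(A = c(1, 2, NA, 4), B = c(NA, NA, 3, NA))
res <- transform(test3, C = rowSums(test3, na.rm = TRUE))

# Hypothetical extra rows that show the caveats
edge <- data.frame(A = c(1, NA), B = c(5, NA))
edge_sums <- rowSums(edge, na.rm = TRUE)  # row 1 is summed, row 2 is all NA
```

So rowSums is a good fit only when each row has exactly one non-NA value among the columns being combined.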

Now suppose test3 has an additional column AA. You can use the [ operator to subset just the columns you want to sum:

> set.seed(1) # make the random draw reproducible
> test3$AA <- rnorm(4, 10, 1) # adding a new column
> test3  # this is what test3 looks like
   A  B        AA
1  1 NA  9.373546
2  2 NA 10.183643
3 NA  3  9.164371
4  4 NA 11.595281
> transform(test3, C=rowSums(test3[, c("A", "B")], na.rm=TRUE))
   A  B        AA C
1  1 NA  9.373546 1
2  2 NA 10.183643 2
3 NA  3  9.164371 3
4  4 NA 11.595281 4

Combine columns to remove NA's

One solution uses dplyr::coalesce, which returns the first non-NA value across its arguments at each position:

library(dplyr)

data %>% mutate(mycol = coalesce(x,y,z)) %>%
         select(a, mycol)
#   a mycol
# 1 A     1
# 2 B     2
# 3 C     3
# 4 D     4
# 5 E     5

Data

data <- data.frame('a' = c('A','B','C','D','E'),
                 'x' = c(1,2,NA,NA,NA),
                 'y' = c(NA,NA,3,NA,NA),
                 'z' = c(NA,NA,NA,4,5))
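
If dplyr is not available, the same "first non-NA value wins" logic can be sketched in base R. The helper name coalesce_base is hypothetical and not part of any package; it is a minimal stand-in for plain vectors of equal length and type, not a full replacement for dplyr::coalesce.

```r
# Hypothetical helper: a base R stand-in for dplyr::coalesce on plain vectors
coalesce_base <- function(...) {
  # Walk through the vectors left to right, filling NAs from the next vector
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

data <- data.frame(a = c("A", "B", "C", "D", "E"),
                   x = c(1, 2, NA, NA, NA),
                   y = c(NA, NA, 3, NA, NA),
                   z = c(NA, NA, NA, 4, 5))

data$mycol <- coalesce_base(data$x, data$y, data$z)
result <- data[, c("a", "mycol")]
```

As with coalesce, argument order matters: if two columns both hold a value in the same row, the leftmost one is kept.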

Combining more than 2 columns by removing NA's in R

You can use apply for this. If df is your data frame:

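df2 <- apply(df,1,function(x) x[!is.na(x)])   # keep the non-NA values of each row
df3 <- data.frame(t(df2))                     # apply returns one column per row, so transpose
colnames(df3) <- colnames(df)[1:ncol(df3)]    # reuse the leading column names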

Output:

#      col1 col2
#         1   13
#        10   18
#         7   15
#         4   16
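
One caveat with the apply approach above: it relies on every row having the same number of non-NA values. When the counts differ, apply returns a list instead of a matrix and the transpose step no longer works as intended. The following is a minimal sketch of one way to handle uneven rows by padding on the right with NA. The helper name shift_left and the sample data are hypothetical, and the sketch assumes all columns are numeric and that at least one value is non-NA.

```r
# Hypothetical helper: move non-NA values to the left, pad the rest with NA
shift_left <- function(df) {
  # Non-NA values of each row, in column order
  vals <- lapply(seq_len(nrow(df)), function(i) {
    x <- unlist(df[i, ], use.names = FALSE)
    x[!is.na(x)]
  })
  width <- max(lengths(vals))
  # Extend shorter rows with NA so that every row has `width` entries
  padded <- lapply(vals, function(x) { length(x) <- width; x })
  out <- as.data.frame(do.call(rbind, padded))
  names(out) <- names(df)[seq_len(width)]
  out
}

# Hypothetical data: the third row has only one non-NA value
df <- data.frame(col1 = c(1, NA, 7),
                 col2 = c(NA, 10, NA),
                 col3 = c(13, 18, NA))
df3 <- shift_left(df)
```

Here df3 has two columns, and the short third row ends in NA instead of breaking the reshape.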

How to combine multiple character columns into one column and remove NA without knowing column numbers

Here is a base R method. It pastes together the non-NA values of every column except the first, row by row:

input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse=" "))
input$ALL
#[1] "tv"     "web"    "book"   "web tv"
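
To make the snippet above reproducible, here is a sketch with a hypothetical input data frame; the original question's data is not shown, so the column names and values below are assumptions chosen to give the same result.

```r
# Hypothetical input: an id column followed by an arbitrary number of character columns
input <- data.frame(id = 1:4,
                    a = c(NA, "web", NA, "web"),
                    b = c("tv", NA, NA, "tv"),
                    c = c(NA, NA, "book", NA),
                    stringsAsFactors = FALSE)

# input[-1] drops the id column, so the number of remaining columns need not be known
input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse = " "))

# A row whose values are all NA collapses to an empty string, not NA
all_na <- paste(na.omit(c(NA_character_, NA_character_)), collapse = " ")
```

If an empty string is not wanted for all-NA rows, it can be converted back to NA afterwards.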

Merge two columns containing NA values in complementing rows

We can try using the coalesce function from the dplyr package:

library(dplyr)

df$merged <- coalesce(df$x, df$y)
df$flag <- ifelse(is.na(df$y), 0, 1)  # 1 where y is not NA, 0 otherwise
df

   x  y merged flag
1  1 NA      1    0
2 NA  2      2    1
3 NA  3      3    1
4  4 NA      4    0
5  5 NA      5    0
6 NA  6      6    1
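
Because coalesce keeps x whenever both columns hold a value, it can be worth checking that the rows really are complementary before merging. The sketch below does this in base R; the data frame is reconstructed from the printed output above, and the variable names both and neither are illustrative only.

```r
# Data reconstructed from the printed output above
df <- data.frame(x = c(1, NA, NA, 4, 5, NA),
                 y = c(NA, 2, 3, NA, NA, 6))

# Rows where both columns hold a value: a merge would silently keep x there
both <- !is.na(df$x) & !is.na(df$y)
# Rows where neither column holds a value: the merged value stays NA there
neither <- is.na(df$x) & is.na(df$y)

if (any(both)) warning("some rows have values in both x and y")

# Base R equivalent of coalesce(df$x, df$y) for two columns
df$merged <- ifelse(is.na(df$x), df$y, df$x)
```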

Combining columns, while ignoring duplicates and NAs

If 'df1' is the output, we can remove the 'NA' that follows a '-' with sub:

df1 %>% 
    mutate(Var3 = sub("-NA", "", Var3))
# A tibble: 8 x 4
#     id  Var1  Var2        Var3
#  <chr> <chr> <chr>       <chr>
#1     A    A1    A1          A1
#2     B    F2    A2       A2-F2
#3     C  <NA>    A3          A3
#4     D A4-E9    A4       A4-E9
#5     E    E5    A5       A5-E5
#6     F  <NA>  <NA>          NA
#7     G B2-R4 A3-B2    A3-B2-R4
#8     H B3-B4 E1-G5 B3-B4-E1-G5

We can also do this slightly differently with the tidyverse: gather the data into 'long' format, split the 'value' column with separate_rows, group by 'id', summarise by pasting the sorted unique elements of 'value' into 'Var3', and then left_join the result with the original dataset 'df':

library(tidyverse)
gather(df, key, value, -id) %>%
       separate_rows(value)  %>%
       group_by(id) %>% 
       summarise(Var3 = paste(sort(unique(value)), collapse='-')) %>% 
       mutate(Var3 = replace(Var3, Var3=='', NA)) %>% 
       left_join(df, .)
#   id  Var1  Var2        Var3
#1  A    A1    A1          A1
#2  B    F2    A2       A2-F2
#3  C  <NA>    A3          A3
#4  D A4-E9    A4       A4-E9
#5  E    E5    A5       A5-E5
#6  F  <NA>  <NA>        <NA>
#7  G B2-R4 A3-B2    A3-B2-R4
#8  H B3-B4 E1-G5 B3-B4-E1-G5

NOTE: The %>% pipe spreads even simple code over multiple lines. If required, all of those statements can be put on a single line and called a one-liner.

Here is a one-liner with data.table:

library(data.table)
setDT(df)[, Var3 := paste(sort(unique(unlist(strsplit(unlist(.SD),"-")))), collapse="-"), id]
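
The same split, de-duplicate, sort and paste idea can also be sketched without any packages. The helper name combine_unique is hypothetical, and the input data frame is reconstructed from the printed output above rather than taken from the original question.

```r
# Input reconstructed from the printed output above
df <- data.frame(id   = LETTERS[1:8],
                 Var1 = c("A1", "F2", NA, "A4-E9", "E5", NA, "B2-R4", "B3-B4"),
                 Var2 = c("A1", "A2", "A3", "A4", "A5", NA, "A3-B2", "E1-G5"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split on "-", drop NAs and duplicates, sort, re-join
combine_unique <- function(...) {
  parts <- unlist(strsplit(c(...), "-", fixed = TRUE))
  parts <- sort(unique(parts[!is.na(parts)]))
  if (length(parts) == 0) NA_character_ else paste(parts, collapse = "-")
}

# Apply the helper row by row across the two columns
df$Var3 <- mapply(combine_unique, df$Var1, df$Var2, USE.NAMES = FALSE)
```

Returning NA_character_ for rows with no values avoids the empty-string cleanup step that the pipeline version needs.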

How to join (merge) columns in a data frame, replacing NA values

The package tidyr has the function unite which does the trick:

#Sample Data
#dput(d)    
d<-structure(list(A = c(1.4, -1.17, -0.85, -0.74, 0.58, 1.29), B = c("Fria Moderada", 
          "Fria Debil", NA, NA, NA, NA), C = c(NA, NA, NA, NA, "Calida Debil", 
          "Calida Moderada"), D = c(NA, NA, "Neutro", "Neutro", NA, NA)), .Names = c("A", 
          "B", "C", "D"), class = "data.frame", row.names = c(NA, -6L))

library(tidyr)
d[is.na(d)] <- ""  # replace the NAs with empty strings
unite(d, newcol, c(B, C, D), sep="")
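
As an alternative sketch, and assuming tidyr 1.0.0 or later, unite accepts an na.rm argument that drops missing values before pasting, so the NAs in d do not have to be overwritten first. The data below is the same sample, rebuilt with data.frame for readability.

```r
library(tidyr)

d <- data.frame(A = c(1.4, -1.17, -0.85, -0.74, 0.58, 1.29),
                B = c("Fria Moderada", "Fria Debil", NA, NA, NA, NA),
                C = c(NA, NA, NA, NA, "Calida Debil", "Calida Moderada"),
                D = c(NA, NA, "Neutro", "Neutro", NA, NA),
                stringsAsFactors = FALSE)

# na.rm = TRUE removes missing values before uniting; d itself is left unchanged
res <- unite(d, "newcol", B, C, D, sep = "", na.rm = TRUE)
```

This leaves the NAs in the original data frame intact, which matters if d is used again later.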
