How to Use Map from Purrr with Dplyr::Mutate to Create Multiple New Columns Based on Column Pairs

How to use map from purrr with dplyr::mutate to create multiple new columns based on column pairs

Here is one option with purrr. We get the unique prefix of the names of the dataset ('nm1'), use map (from purrr) to loop through the unique names, select the column that matches the prefix value of 'nm1', add the rows using reduce and the bind the columns (bind_cols) with the original dataset

library(tidyverse)
nm1 <- names(df) %>% 
          substr(1, 1) %>%
          unique 
nm1 %>% 
     map(~ df %>% 
            select(matches(.x)) %>%
            reduce(`+`)) %>%
            set_names(paste0("sum_", nm1)) %>%
     bind_cols(df, .)
#    a1 b1 c1 a2 b2 c2 sum_a sum_b sum_c
#1  1  4 10  9  3 15    10     7    25
#2  2  5 11 10  4 16    12     9    27
#3  3  6 12 11  5 17    14    11    29
#4  4  7 13 12  6 18    16    13    31
#5  5  8 14 13  7 19    18    15    33

Apply function to two columns at a time with purrr

We can use pivot_longer to reshape the data, creating a column for each level of ko. Compute the sum, then pivot_wider to get back to your original format:

library(tidyverse)

df %>% 
  mutate(idx = row_number()) %>% 
  pivot_longer(-idx, names_sep = '_', names_to = c('group', 'ko')) %>% 
  pivot_wider(names_from = group, values_from = value) %>% 
  mutate(sum = l2fc + ctrl) %>% 
  pivot_wider(names_from = ko, values_from = c(l2fc, ctrl, sum))

     idx l2fc_ko1 l2fc_ko2 ctrl_ko1 ctrl_ko2 sum_ko1 sum_ko2
   <int>    <dbl>    <dbl>    <dbl>    <dbl>   <dbl>   <dbl>
 1     1  -1.04    -0.710    -0.288   -1.65   -1.33   -2.36 
 2     2   0.0338   0.400    -0.850    0.319  -0.816   0.719
 3     3   2.08     0.723     0.325    0.314   2.40    1.04 
 4     4   0.740   -0.411    -0.307    1.77    0.433   1.36 
 5     5   0.347   -1.57     -0.153    0.657   0.195  -0.915
 6     6  -0.998   -0.145     0.265   -1.95   -0.733  -2.09 
 7     7   2.05    -0.0876   -0.909   -0.190   1.14   -0.278
 8     8   0.0735  -0.134    -2.04    -0.832  -1.96   -0.966
 9     9   1.52     2.37      1.53    -0.596   3.05    1.78 
10    10   1.42    -0.753    -1.61     1.84   -0.194   1.09

How can I use map* and mutate to convert a list into a set of additional columns?

Using !! and := lets you dynamically name columns. Then, we can reduce the list output of map() with reduce(), which left_joins() all the dataframes in the list using the dataset title and id columns.

df_2 <- 
  map(get_concept_list(df),
      ~ mutate(df,
               !!.x := get_concept_info(df, .x))) %>% 
  reduce(left_join, by = c("dataset_title", "dataset_id"))

df_2

# A tibble: 1 x 6
  dataset_title                                               dataset_id           time         gender          c_age       measures
  <chr>                                                       <chr>      <list<df[,2]>> <list<df[,2]>> <list<df[,2]>> <list<df[,2]>>
1 Population estimates - local authority based by single year NM_2002_1        [28 x 2]        [3 x 2]      [121 x 2]        [2 x 2]

How to create multiple columns from combination of pairs of another column?

We may use combn on the levels of the 'marital'

library(dplyr)
library(stringr)
data2 <- combn(levels(data$marital), 2, FUN = \(x) 
   data %>%  
     transmute(!! str_c(x, collapse="_") := 
        case_when(as.character(marital) %in% x~ as.character(marital))), 
          simplify = FALSE) %>%
    bind_cols(data, .)

-output

> head(data2, 2)
# A tibble: 2 × 24
   year marital    age race  rincome        partyid relig denom tvhours `No answer_Neve… `No answer_Sepa… `No answer_Divo… `No answer_Wido… `No answer_Marr… `Never married_…
  <int> <fct>    <int> <fct> <fct>          <fct>   <fct> <fct>   <int> <chr>            <chr>            <chr>            <chr>            <chr>            <chr>           
1  2000 Divorced    48 White $8000 to 9999  Not st… Prot… Bapt…      NA <NA>             <NA>             Divorced         <NA>             <NA>             <NA>            
2  2000 Divorced    25 White Not applicable Not st… None  Not …       1 <NA>             <NA>             Divorced         <NA>             <NA>             <NA>            
# … with 9 more variables: Never married_Divorced <chr>, Never married_Widowed <chr>, Never married_Married <chr>, Separated_Divorced <chr>,
#   Separated_Widowed <chr>, Separated_Married <chr>, Divorced_Widowed <chr>, Divorced_Married <chr>, Widowed_Married <chr>

purrr approach for creating new columns through function with two arguments

You should not use df %>% map2(...) if you are passing .x, .y separately to map2.
is.na(.x) is not correct since .x is character values (like "x", "y" and "z"). I have used df[[.x]] to subset the values.
Since we are not using df %>% ... so row_number() would not work, hence changed it to seq_along.

Here is an approach with map2_dfc to create new columns and we use bind_cols to bind it to original dataframe.

library(dplyr)
library(purrr)

bind_cols(df, map2_dfc(.x = c("x", "y", "z"),
                       .y = sampling,
                       .f = ~tibble(!!paste0(.x, "_random") := 
           if_else(is.na(df[[.x]]), NA_integer_,
as.integer(seq_along(df[[.x]]) %in% sample(which(!is.na(df[[.x]])), size = .y))))))

#            x          y           z x_random y_random z_random
#1          NA 0.02358698 0.222022714       NA        0        1
#2  0.15099912         NA 0.878007560        0       NA        0
#3  0.20228598 0.92222805          NA        0        0       NA
#4  0.10955137 0.68713928 0.485866574        1        1        1
#5  0.57361508 0.56205208 0.367087414        1        1        0
#6  0.30534642 0.75997029 0.006055428        0        0        1
#7  0.76949447 0.78142772 0.279323093        0        0        0
#8  0.07178739 0.73181961 0.187739444        0        0        1
#9  0.52645525 0.48321814 0.213029355        0        1        0
#10 0.30858707 0.20973381 0.450931534        0        0        0

Using purrr::possibly to catch dynamic error messages

You could tweak purrr::possibly() from its original code to return instead of message the error.

Original code:

## > possibly
## function (.f, otherwise, quiet = TRUE) 
## {
##     .f <- as_mapper(.f)
##     force(otherwise)
##     function(...) {
##         tryCatch(.f(...), error = function(e) {
##             if (!quiet) 
##                 message("Error: ", e$message) ## <--- tweak
##             otherwise
##         }, interrupt = function(e) {
##             stop("Terminated by user", call. = FALSE)
##         })
##     }
## }

tweaked function:

possibly2 <- function (.f, otherwise, quiet = TRUE) {
    .f <- as_mapper(.f)
    force(otherwise)
    function(...) {
        tryCatch(.f(...), error = function(e) {
            if (!quiet) 
                return(e$message) ## <-- tweaked
            otherwise
        }, interrupt = function(e) {
            stop("Terminated by user", call. = FALSE)
        })
    }
}

Example:

safer_foo <- possibly2(.f = foo, otherwise = "error",
                       quiet = FALSE ## don't forget to "unquiet"
                       )

## all other objects / code as in your example

Output:

## > output_list
## $A
## # A tibble: 1 x 1
##     sum
##   <int>
## 1    55
## 
## $B
## [1] "No column with name B found."
## 
## $C
## [1] "All values of column C are missing."

edit

Actually, possibly2 carries over code which is no longer needed. Omitting
the unwanted static arguments otherwise and quiet, and skipping the handler for user interrupts, the required code shrinks down to:

possibly2 <- function (.f) {
    .f <- as_mapper(.f)
    function(...) {
        tryCatch(.f(...), error = function(e)  e$message)
    }
}

How to Use Map from Purrr with Dplyr::Mutate to Create Multiple New Columns Based on Column Pairs