How to use map from purrr with dplyr::mutate to create multiple new columns based on column pairs
Here is one option with purrr
. We get the unique
prefix of the names
of the dataset ('nm1'), use map
(from purrr
) to loop through the unique names, select
the column that matches
the prefix value of 'nm1', add the rows using reduce
and the bind the columns (bind_cols
) with the original dataset
library(tidyverse)
nm1 <- names(df) %>%
substr(1, 1) %>%
unique
nm1 %>%
map(~ df %>%
select(matches(.x)) %>%
reduce(`+`)) %>%
set_names(paste0("sum_", nm1)) %>%
bind_cols(df, .)
# a1 b1 c1 a2 b2 c2 sum_a sum_b sum_c
#1 1 4 10 9 3 15 10 7 25
#2 2 5 11 10 4 16 12 9 27
#3 3 6 12 11 5 17 14 11 29
#4 4 7 13 12 6 18 16 13 31
#5 5 8 14 13 7 19 18 15 33
Apply function to two columns at a time with purrr
We can use pivot_longer
to reshape the data, creating a column for each level of ko
. Compute the sum, then pivot_wider
to get back to your original format:
library(tidyverse)
df %>%
mutate(idx = row_number()) %>%
pivot_longer(-idx, names_sep = '_', names_to = c('group', 'ko')) %>%
pivot_wider(names_from = group, values_from = value) %>%
mutate(sum = l2fc + ctrl) %>%
pivot_wider(names_from = ko, values_from = c(l2fc, ctrl, sum))
idx l2fc_ko1 l2fc_ko2 ctrl_ko1 ctrl_ko2 sum_ko1 sum_ko2
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 -1.04 -0.710 -0.288 -1.65 -1.33 -2.36
2 2 0.0338 0.400 -0.850 0.319 -0.816 0.719
3 3 2.08 0.723 0.325 0.314 2.40 1.04
4 4 0.740 -0.411 -0.307 1.77 0.433 1.36
5 5 0.347 -1.57 -0.153 0.657 0.195 -0.915
6 6 -0.998 -0.145 0.265 -1.95 -0.733 -2.09
7 7 2.05 -0.0876 -0.909 -0.190 1.14 -0.278
8 8 0.0735 -0.134 -2.04 -0.832 -1.96 -0.966
9 9 1.52 2.37 1.53 -0.596 3.05 1.78
10 10 1.42 -0.753 -1.61 1.84 -0.194 1.09
How can I use map* and mutate to convert a list into a set of additional columns?
Using !!
and :=
lets you dynamically name columns. Then, we can reduce the list output of map()
with reduce()
, which left_joins() all the dataframes in the list using the dataset title and id columns.
df_2 <-
map(get_concept_list(df),
~ mutate(df,
!!.x := get_concept_info(df, .x))) %>%
reduce(left_join, by = c("dataset_title", "dataset_id"))
df_2
# A tibble: 1 x 6
dataset_title dataset_id time gender c_age measures
<chr> <chr> <list<df[,2]>> <list<df[,2]>> <list<df[,2]>> <list<df[,2]>>
1 Population estimates - local authority based by single year NM_2002_1 [28 x 2] [3 x 2] [121 x 2] [2 x 2]
How to create multiple columns from combination of pairs of another column?
We may use combn
on the levels
of the 'marital'
library(dplyr)
library(stringr)
data2 <- combn(levels(data$marital), 2, FUN = \(x)
data %>%
transmute(!! str_c(x, collapse="_") :=
case_when(as.character(marital) %in% x~ as.character(marital))),
simplify = FALSE) %>%
bind_cols(data, .)
-output
> head(data2, 2)
# A tibble: 2 × 24
year marital age race rincome partyid relig denom tvhours `No answer_Neve… `No answer_Sepa… `No answer_Divo… `No answer_Wido… `No answer_Marr… `Never married_…
<int> <fct> <int> <fct> <fct> <fct> <fct> <fct> <int> <chr> <chr> <chr> <chr> <chr> <chr>
1 2000 Divorced 48 White $8000 to 9999 Not st… Prot… Bapt… NA <NA> <NA> Divorced <NA> <NA> <NA>
2 2000 Divorced 25 White Not applicable Not st… None Not … 1 <NA> <NA> Divorced <NA> <NA> <NA>
# … with 9 more variables: Never married_Divorced <chr>, Never married_Widowed <chr>, Never married_Married <chr>, Separated_Divorced <chr>,
# Separated_Widowed <chr>, Separated_Married <chr>, Divorced_Widowed <chr>, Divorced_Married <chr>, Widowed_Married <chr>
purrr approach for creating new columns through function with two arguments
- You should not use
df %>% map2(...)
if you are passing.x
,.y
separately tomap2
. is.na(.x)
is not correct since.x
is character values (like"x"
,"y"
and"z"
). I have useddf[[.x]]
to subset the values.- Since we are not using
df %>% ...
sorow_number()
would not work, hence changed it toseq_along
.
Here is an approach with map2_dfc
to create new columns and we use bind_cols
to bind it to original dataframe.
library(dplyr)
library(purrr)
bind_cols(df, map2_dfc(.x = c("x", "y", "z"),
.y = sampling,
.f = ~tibble(!!paste0(.x, "_random") :=
if_else(is.na(df[[.x]]), NA_integer_,
as.integer(seq_along(df[[.x]]) %in% sample(which(!is.na(df[[.x]])), size = .y))))))
# x y z x_random y_random z_random
#1 NA 0.02358698 0.222022714 NA 0 1
#2 0.15099912 NA 0.878007560 0 NA 0
#3 0.20228598 0.92222805 NA 0 0 NA
#4 0.10955137 0.68713928 0.485866574 1 1 1
#5 0.57361508 0.56205208 0.367087414 1 1 0
#6 0.30534642 0.75997029 0.006055428 0 0 1
#7 0.76949447 0.78142772 0.279323093 0 0 0
#8 0.07178739 0.73181961 0.187739444 0 0 1
#9 0.52645525 0.48321814 0.213029355 0 1 0
#10 0.30858707 0.20973381 0.450931534 0 0 0
Using purrr::possibly to catch dynamic error messages
You could tweak purrr::possibly()
from its original code to return instead of message the error.
Original code:
## > possibly
## function (.f, otherwise, quiet = TRUE)
## {
## .f <- as_mapper(.f)
## force(otherwise)
## function(...) {
## tryCatch(.f(...), error = function(e) {
## if (!quiet)
## message("Error: ", e$message) ## <--- tweak
## otherwise
## }, interrupt = function(e) {
## stop("Terminated by user", call. = FALSE)
## })
## }
## }
tweaked function:
possibly2 <- function (.f, otherwise, quiet = TRUE) {
.f <- as_mapper(.f)
force(otherwise)
function(...) {
tryCatch(.f(...), error = function(e) {
if (!quiet)
return(e$message) ## <-- tweaked
otherwise
}, interrupt = function(e) {
stop("Terminated by user", call. = FALSE)
})
}
}
Example:
safer_foo <- possibly2(.f = foo, otherwise = "error",
quiet = FALSE ## don't forget to "unquiet"
)
## all other objects / code as in your example
Output:
## > output_list
## $A
## # A tibble: 1 x 1
## sum
## <int>
## 1 55
##
## $B
## [1] "No column with name B found."
##
## $C
## [1] "All values of column C are missing."
edit
Actually, possibly2
carries over code which is no longer needed. Omitting
the unwanted static arguments otherwise
and quiet
, and skipping the handler for user interrupts, the required code shrinks down to:
possibly2 <- function (.f) {
.f <- as_mapper(.f)
function(...) {
tryCatch(.f(...), error = function(e) e$message)
}
}
Related Topics
Dynamically Add Column Names to Data.Table When Aggregating
Plotting Pca Biplot with Ggplot2
Controlling Order of Facet_Grid/Facet_Wrap in Ggplot2
Converting a List of Data Frames into Individual Data Frames in R
How to Convert Integer into Categorical Data in R
How to Create a Grouped Boxplot in R
Add a Column with Count of Nas and Mean
Adding New Column with Diff() Function When There Is One Less Row in R
R Matrix to Rownames Colnames Values
Possible to Show Console Messages (Written with 'Message') in a Shiny Ui
Checking If Date Is Between Two Dates in R
How to Test If List Element Exists
Replace All Values in a Matrix <0.1 with 0
R: How to Get the Week Number of the Month
How to Implement a Cleanup Routine in R Shiny
Typeof Returns Integer for Something That Is Clearly a Factor
Repeat Vector When Its Length Is Not a Multiple of Desired Total Length