Using pivot_longer with Multiple Paired Columns in the Wide Dataset

You want to use .value in the names_to argument:

library(dplyr)
library(tidyr)

input %>%
  pivot_longer(
    -event,
    names_to = c(".value", "item"),
    names_sep = "_"
  ) %>%
  select(-item)

# A tibble: 4 x 3
  event url   name
  <int> <fct> <fct>
1     1 g1    dc
2     1 g2    sf
3     2 g3    nyc
4     2 g4    la

From this article on pivoting:

Note the special name .value: this tells pivot_longer() that that part of the column name specifies the “value” being measured (which will become a variable in the output).
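
To make the answer above reproducible, here is a minimal sketch. The `input` tibble is an assumption reconstructed from the printed output: the column names `url_1`, `name_1`, `url_2` and `name_2` are hypothetical, and the original data apparently used factors rather than character columns.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: one row per event, two url/name pairs per row
input <- tibble(
  event  = 1:2,
  url_1  = c("g1", "g3"),
  name_1 = c("dc", "nyc"),
  url_2  = c("g2", "g4"),
  name_2 = c("sf", "la")
)

long <- input %>%
  pivot_longer(
    -event,
    # "url_1" splits into .value = "url" (an output column) and item = "1"
    names_to = c(".value", "item"),
    names_sep = "_"
  ) %>%
  select(-item)

print(long)
```

Each pair of columns sharing a suffix ends up in one output row, so the two wide rows become four long rows.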

Using pivot_longer with multiple column classes

We could use names_pattern after rearranging the substrings in the column names so that the digits come last:

library(dplyr)
library(tidyr)
library(stringr)

df_wide %>%
  # rename the columns by rearranging the digits at the end
  # "_(\\d+)(_.*)" - captures the digits (\\d+) after the _
  # and the rest of the characters (_.*)
  # replace with the backreference (\\2, \\1) of captured groups rearranged
  rename_with(~ str_replace(., "_(\\d+)(_.*)", "\\2_\\1"), -resp_id) %>%
  pivot_longer(cols = -resp_id, names_to = c(".value", "question_number"),
               names_pattern = "(.*)_(\\d+$)")

Output:

# A tibble: 6 × 4
  resp_id question_number question_info              question_answer
    <dbl> <chr>           <chr>                                <dbl>
1       1 1               "What is your eye color?"                1
2       1 2               "What is your hair color?"               2
3       2 1               "Are you over 6 ft tall?"                1
4       2 2               ""                                      NA
5       3 1               "What is your hair color?"               0
6       3 2               "Are you under 40?"                      1
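
The two steps can be inspected separately. The sketch below assumes the wide columns were named like `question_1_info` and `question_1_answer`; those names and the small `df_wide` are hypothetical, chosen only so that the regex in the answer applies.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical wide data: the question number sits in the middle of each name
df_wide <- tibble(
  resp_id           = c(1, 2),
  question_1_info   = c("What is your eye color?", "Are you over 6 ft tall?"),
  question_1_answer = c(1, 1),
  question_2_info   = c("What is your hair color?", ""),
  question_2_answer = c(2, NA)
)

# Step 1: move the digits to the end, e.g. question_1_info -> question_info_1
renamed <- df_wide %>%
  rename_with(~ str_replace(., "_(\\d+)(_.*)", "\\2_\\1"), -resp_id)
print(names(renamed))

# Step 2: everything before the trailing digits becomes an output column
long <- renamed %>%
  pivot_longer(cols = -resp_id,
               names_to = c(".value", "question_number"),
               names_pattern = "(.*)_(\\d+$)")
print(long)
```

Renaming first keeps the names_pattern simple, because the part that should become `.value` is then one contiguous prefix.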

How do we transform a dataset in R using pivot_longer with multiple columns

Probably not the most elegant solution, but I was able to solve my own problem using the steps below:

library(dplyr)
library(tidyr)  # for replace_na()

# initial visits
a <- df %>%
  select(person, initial_event_date, type_initial) %>%
  mutate(visit_type = 'initial')

# prior visits, only for rows flagged 'Y'
b <- df %>%
  filter(visit_prior == 'Y') %>%
  select(person, initial_event_date, prior_visit_type, day_cnt_prior) %>%
  mutate(visit_type = 'visit_prior',
         day_cnt_prior = as.integer(day_cnt_prior))

# visits after the initial event, only for rows flagged 'Y'
c <- df %>%
  filter(visit_after == 'Y') %>%
  select(person, initial_event_date, visit_after_type, day_cnt_after) %>%
  mutate(visit_type = 'visit_after',
         day_cnt_after = as.integer(day_cnt_after))

# stack the pieces; coalesce() takes the first non-NA value across columns
bind_rows(a, b, c) %>%
  arrange(person) %>%
  mutate(visit_reason = dplyr::coalesce(type_initial, prior_visit_type, visit_after_type),
         day_cnt = dplyr::coalesce(day_cnt_after, day_cnt_prior)) %>%
  select(person, initial_event_date, visit_type, visit_reason, day_cnt) %>%
  replace_na(list(day_cnt = 0))
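
The approach relies on bind_rows() filling columns that are missing from a piece with NA, so that coalesce() can then collapse them into one column. A minimal sketch of just that mechanism, using two hypothetical one-row pieces (the values "checkup" and "lab" are invented for illustration):

```r
library(dplyr)

# Hypothetical pieces whose "reason" lives in differently named columns
a <- tibble(person = 1, type_initial = "checkup")
b <- tibble(person = 1, prior_visit_type = "lab")

# bind_rows() keeps the union of columns and pads the gaps with NA
stacked <- bind_rows(a, b)

# coalesce() returns, row by row, the first value that is not NA
stacked <- stacked %>%
  mutate(visit_reason = coalesce(type_initial, prior_visit_type))

print(stacked)
```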

Is there a way to pivot_longer to multiple value columns in R?

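We don't need multiple calls if we pass names_to a vector with two elements. The special name .value means that the matching part of each column name (here the prefix, e.g. mean or se) becomes the name of an output column holding the values, while 'group' becomes a column holding the suffix of each column name. Here, names_sep is "\\." so that the names are split at the literal dot.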

library(tidyr)
pivot_longer(df, cols = -ids, names_to = c(".value", "group"),
             names_sep = "\\.")

Output:

# A tibble: 4 × 4
  ids      group   mean    se
  <chr>    <chr>  <int> <int>
1 protein1 group1   982     3
2 protein1 group2   657     7
3 protein2 group1   663     9
4 protein2 group2   215     1

NOTE: the values differ from those in the question because the input data were created with sample() without a set.seed() call.
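
For a reproducible version, the input can be written out with fixed values instead of sample(). The `df` below is an assumption rebuilt from the printed output; the column names `mean.group1`, `se.group1` and so on are inferred from the names_sep used in the answer.

```r
library(tidyr)

# Hypothetical input with fixed values taken from the output shown above
df <- data.frame(
  ids         = c("protein1", "protein2"),
  mean.group1 = c(982L, 663L),
  mean.group2 = c(657L, 215L),
  se.group1   = c(3L, 9L),
  se.group2   = c(7L, 1L)
)

# "mean.group1" splits at the dot into .value = "mean" and group = "group1"
long <- pivot_longer(df, cols = -ids,
                     names_to = c(".value", "group"),
                     names_sep = "\\.")
print(long)
```

The dot has to be escaped because names_sep is interpreted as a regular expression when it is a string.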

pivot_longer for multiple sets having the same names_to

Edit: Added values_drop_na = TRUE thanks to TarJae's comment.

You could use

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(-c(ID, State),
               names_to = c("Time", ".value"),
               names_pattern = "(Time\\d)(.*)",
               values_drop_na = TRUE)

This returns

# A tibble: 8 x 5
  ID    State Time    Day Month
  <chr> <chr> <chr> <dbl> <dbl>
1 id-1  MD    Time1     1     1
2 id-1  MD    Time2     9    12
3 id-1  MD    Time3     7     1
4 id-2  MD    Time1    12     4
5 id-2  MD    Time2    21     4
6 id-2  MD    Time3    14     2
7 id-3  VA    Time1    30     5
8 id-3  VA    Time2    13     5
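
The effect of values_drop_na can be checked on a small input. The `df` below is an assumption rebuilt from the printed rows: the column names `Time1Day`, `Time1Month` and so on are inferred from the names_pattern, and the missing Time3 values for id-3 are a guess at why that row is absent.

```r
library(dplyr)
library(tidyr)

# Hypothetical input; id-3 has no Time3 measurements
df <- tibble(
  ID         = c("id-1", "id-2", "id-3"),
  State      = c("MD", "MD", "VA"),
  Time1Day   = c(1, 12, 30), Time1Month = c(1, 4, 5),
  Time2Day   = c(9, 21, 13), Time2Month = c(12, 4, 5),
  Time3Day   = c(7, 14, NA), Time3Month = c(1, 2, NA)
)

# Without dropping: one row per ID and Time, including the all-NA one
with_na <- df %>%
  pivot_longer(-c(ID, State),
               names_to = c("Time", ".value"),
               names_pattern = "(Time\\d)(.*)")

# With dropping: rows whose value columns are all NA are removed
without_na <- df %>%
  pivot_longer(-c(ID, State),
               names_to = c("Time", ".value"),
               names_pattern = "(Time\\d)(.*)",
               values_drop_na = TRUE)

print(nrow(with_na))
print(nrow(without_na))
```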

Gathering wide columns into multiple long columns using pivot_longer

I have found the answer to my question:

pivot_longer transforms the wide columns starting with 'hf', 'ac', 'cs' and 'se' into long format, with one output column per prefix.

Arguments:

  • names_to = c(".value", "level"): .value means that the first part of each original column name identifies which output column the cell values belong to
  • these values are pivoted into long format and added in new columns such as "hf" and "ac"
  • the column "level" holds the original column endings (e.g. the numbers 1-6) pivoted to long format
  • names_pattern = "(.*)_(.*)": a regex with two capture groups that breaks each column name at the character "_"; the first group feeds .value and the second feeds "level"

df3 <- df %>%
  tidyr::pivot_longer(
    cols = c(
      starts_with("hf"),
      starts_with("ac"),
      starts_with("cs"),
      starts_with("se")
    ),
    names_to = c(".value", "level"),
    names_pattern = "(.*)_(.*)"
  )
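
A small runnable sketch of the same call, restricted to two prefixes. The data frame and its column names (`id`, `hf_1`, `hf_2`, `ac_1`, `ac_2`) are hypothetical and only illustrate how the pattern splits the names.

```r
library(tidyr)

# Hypothetical wide data: two measures (hf, ac) at two levels
df <- data.frame(
  id   = 1:2,
  hf_1 = c(0.1, 0.2),
  hf_2 = c(0.3, 0.4),
  ac_1 = c(10, 20),
  ac_2 = c(30, 40)
)

df3 <- pivot_longer(
  df,
  cols = c(starts_with("hf"), starts_with("ac")),
  # "hf_1" -> .value = "hf" (output column), level = "1"
  names_to = c(".value", "level"),
  names_pattern = "(.*)_(.*)"
)
print(df3)
```

Because the first `(.*)` is greedy, a name containing several underscores would be split at the last one, which is worth keeping in mind if the real prefixes contain "_".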
