pivot_longer with multiple classes causes error ( No common type )
We can specify the values_ptype
in this case (as the value columns differ in types)
library(ggplot2)
library(tidyr)
library(dplyr)
small_diamonds %>%
pivot_longer( - row_num,
names_to = "key",
values_to = "val", values_ptypes = list(val = 'character'))
# A tibble: 161,820 x 3
# row_num key val
# <int> <chr> <chr>
# 1 1 cut Ideal
# 2 1 color E
# 3 1 price 326
# 4 2 cut Premium
# 5 2 color E
# 6 2 price 326
# 7 3 cut Good
# 8 3 color E
# 9 3 price 327
#10 4 cut Premium
# … with 161,810 more rows
pivot_longer error: impossible to combine different classes?
This might be more straightforward in base R, where you may simply t
ranspose the data frame. Because it coerces into a matrix just convert back as.data.frame
.
as.data.frame(t(df))
# V1
# a a
# b 1
# col3 c
# col4 d
Use unname
to get cleaner row names.
as.data.frame(unname(t(df)))
# V1
# 1 a
# 2 1
# 3 c
# 4 d
If you want the names as an extra variable, do this:
data.frame(V1=names(df), V2=unname(t(df)))
# V1 V2
# 1 a a
# 2 b 1
# 3 col3 c
# 4 col4 d
Pivot_longer error (cannot combine due to different classes)
Perhaps this will help:
library(tidyverse)
df <- read.table(text = " Input Rtype Rcost Rsolutions Btime Bcost
1 12-proc. typea 36 614425 40 36
2 15-proc. typeb 51 534037 50 51
3 18-proc typec 62 1843820 66 66
4 20-proc typea 68 1645581 104400 73
5 20-proc(l) typeb 64 1658509 14400 65
6 21-proc typec 78 3923623 453600 82",
header = TRUE,sep = "")
dfm <- pivot_longer(df, -c(Input, Rtype), names_to="variable", values_to="value")
dfm
#> # A tibble: 24 × 4
#> Input Rtype variable value
#> <chr> <chr> <chr> <int>
#> 1 12-proc. typea Rcost 36
#> 2 12-proc. typea Rsolutions 614425
#> 3 12-proc. typea Btime 40
#> 4 12-proc. typea Bcost 36
#> 5 15-proc. typeb Rcost 51
#> 6 15-proc. typeb Rsolutions 534037
#> 7 15-proc. typeb Btime 50
#> 8 15-proc. typeb Bcost 51
#> 9 18-proc typec Rcost 62
#> 10 18-proc typec Rsolutions 1843820
#> # … with 14 more rows
# If you have all 18 factor levels in your data:
ggplot(dfm, aes(x = interaction(Input, Rtype, sep = " "),
y = value, fill = variable)) +
geom_bar(stat = "identity") +
scale_y_log10(labels = scales::dollar,
name = "log10(cost)") +
scale_fill_viridis_d(end = 0.8) +
theme(axis.title.x = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0.5))
# If you don't have all 18 factor levels in your data:
all_combinations <- expand_grid(Input = dfm$Input,
Rtype = dfm$Rtype) %>%
distinct()
dfm_expanded <- left_join(all_combinations, dfm) %>%
replace_na(list("0"))
#> Joining, by = c("Input", "Rtype")
ggplot(dfm_expanded, aes(x = interaction(Input, Rtype, sep = " "),
y = value, fill = variable)) +
geom_bar(stat = "identity") +
scale_y_log10(labels = scales::dollar,
name = "log10(cost)") +
scale_fill_viridis_d(end = 0.8) +
theme(axis.title.x = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0.5))
#> Warning: Removed 12 rows containing missing values (position_stack).
Created on 2022-04-04 by the reprex package (v2.0.1)
How to pivot a data set longer when columns have no common type
The columns you select in pivot_longer
have no common type, i.e. cat1
and cat2
are numerics and cat3
is character. You can convert them all to characters in advance, or use the argument values_ptypes
to specify the type.
df1 %>%
pivot_longer(cat1:cat3,
names_to = 'variables', values_to = 'Values',
values_drop_na = TRUE,
values_ptypes = list(Values = character()))
# # A tibble: 9 x 5
# column_label val1 val2 variables Values
# <dbl> <dbl> <dbl> <chr> <chr>
# 1 2 0.989 9.89 cat1 0
# 2 2 0.622 6.22 cat1 1
# 3 3 0.619 6.19 cat2 0
# 4 3 0.119 1.19 cat2 1
# 5 10 0.407 4.07 cat3 BABY BOOMERS
# 6 10 0.800 8.00 cat3 GEN Z
# 7 10 0.305 3.05 cat3 GENERATION X
# 8 10 0.158 1.58 cat3 MILLENNIALS
# 9 10 0.439 4.39 cat3 SILENT GENERATION
pivot_longer: values_ptypes: can't convert integer to character
pivot_longer
needs the columns to be reshaped to have the same type. Here 'Type1' and 'Type2' are different in class
. Change it to a single class by converting to character
in values_transform
. According to ?pivot_longer
names_ptypes, values_ptypes - A list of column name-prototype pairs. A prototype (or ptype for short) is a zero-length vector (like integer() or numeric()) that defines the type, class, and attributes of a vector. Use these arguments if you want to confirm that the created columns are the types that you expect. Note that if you want to change (instead of confirm) the types of specific columns, you should use names_transform or values_transform instead.
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(
cols = contains("Type"),
names_to = "key",
values_to = "val",
values_transform = list(val = as.character))
-output
# A tibble: 10 × 4
Value Median key val
<int> <dbl> <chr> <chr>
1 1 1 Type1 A
2 1 1 Type2 1
3 2 1.5 Type1 A
4 2 1.5 Type2 2
5 1 1.5 Type1 A
6 1 1.5 Type2 2
7 NA NA Type1 AB
8 NA NA Type2 1
9 NA NA Type1 AB
10 NA NA Type2 1
pivot_longer
calls pivot_longer_spec
and within the function the line below generates the error
Browse[2]>
debug: out <- vec_c(!!!val_cols, .ptype = val_type)
Browse[2]>
Error: Can't convert <integer> to <character>.
Run `rlang::last_error()` to see where the error occurred.
Using pivot_longer with multiple column classes
We could use names_pattern
after rearranging the substring in column names
library(dplyr)
library(tidyr)
library(stringr)
df_wide %>%
# rename the columns by rearranging the digits at the end
# "_(\\d+)(_.*)" - captures the digits (\\d+) after the _
# and the rest of the characters (_.*)
# replace with the backreference (\\2, \\1) of captured groups rearranged
rename_with(~ str_replace(., "_(\\d+)(_.*)", "\\2_\\1"), -resp_id) %>%
pivot_longer(cols = -resp_id, names_to = c( ".value", "question_number"),
names_pattern = "(.*)_(\\d+$)")
-output
# A tibble: 6 × 4
resp_id question_number question_info question_answer
<dbl> <chr> <chr> <dbl>
1 1 1 "What is your eye color?" 1
2 1 2 "What is your hair color?" 2
3 2 1 "Are you over 6 ft tall?" 1
4 2 2 "" NA
5 3 1 "What is your hair color?" 0
6 3 2 "Are you under 40?" 1
pivot_longer multiple variables of different kinds
In this case one has to use names_to
combined with names_pattern
:
library(dplyr)
library(tidyr)
> head(x,3)
case X1990 flag.1990 X2000 flag.2000
1 1 0.2772497942 a 0.1751129 c
2 2 0.0005183129 b 0.4407503 d
3 3 0.5106083730 a 0.9071830 c
> x %>%
pivot_longer(cols = -case,
names_to = c(".value", "year"),
names_pattern = "([^\\.]*)\\.*(\\d{4})")
# A tibble: 20 x 4
case year X flag
<int> <chr> <dbl> <chr>
1 1 1990 0.277 a
2 1 2000 0.175 c
3 2 1990 0.000518 b
4 2 2000 0.441 d
5 3 1990 0.511 a
6 3 2000 0.907 c
7 4 1990 0.0140 b
8 4 2000 0.851 d
9 5 1990 0.0647 a
10 5 2000 0.734 c
11 6 1990 0.955 b
12 6 2000 0.574 d
13 7 1990 0.0865 a
14 7 2000 0.482 c
15 8 1990 0.290 b
16 8 2000 0.331 d
17 9 1990 0.881 a
18 9 2000 0.158 c
19 10 1990 0.123 b
20 10 2000 0.480 d
How to adjust specific error in pivot_longer() in R
You may try this inner_join
method -
library(dplyr)
df1 %>%
inner_join(med, by = c('Code', 'Week')) %>%
mutate(across(DR1:DR04, ~.x + get(paste0(cur_column(), '_PV')),
.names = '{col}_{col}_PV')) %>%
select(date1:Week, DR1_DR1_PV:DR04_DR04_PV)
# date1 date2 Code Week DR1_DR1_PV DR01_DR01_PV DR02_DR02_PV DR03_DR03_PV DR04_DR04_PV
#1 2021-06-28 2021-04-02 ABC Friday 11 11 11 11 11
#2 2021-06-28 2021-04-02 CDE Friday 17 17 17 17 17
#3 2021-06-28 2021-04-08 ABC Thursday 14 14 14 14 14
#4 2021-06-28 2021-04-08 CDE Thursday 13 13 13 13 13
Another method which would work for dynamic columns would be -
df3 <- df1 %>% inner_join(med, by = c('Code', 'Week'))
cols <- grep('DR', names(df1), value = TRUE)
new_cols <- paste(cols, cols, 'PV', sep = '_')
df3[new_cols] <- df1[cols] + df3[paste0(cols, '_PV')]
df3 %>% select(date1:Week, all_of(new_cols))
Error in `pivot_longer()` argument when updating `tibble` versions
While I don't know why the change occurred, I would guess that, at heart, this error has to do with your dat_2
tibble not being tidy. It's repetitive to include the difference between "mfi" and "dil" both as its own column antigen_dil
and as two separate columns mfi
and dil
.
Depending on what your data mean, two formats that work easily with pivot_longer
are:
dat_1 %>%
pivot_longer(
cols = msp3_mfi:pf_aarp_dil,
names_to = c('antigen', '.value'),
names_pattern = '(.+)_(.+)'
)
#> # A tibble: 4 x 5
#> original_id timepoint antigen mfi dil
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 id_005 C_0 msp3 10.5 400
#> 2 id_005 C_0 pf_aarp 22.2 400
#> 3 id_005 D10 msp3 8.5 400
#> 4 id_005 D10 pf_aarp 10.2 400
or
dat_1 %>%
pivot_longer(
cols = msp3_mfi:pf_aarp_dil,
names_to = c('antigen', 'antigen_dil'),
names_pattern = '(.+)_(.+)'
)
#> # A tibble: 8 x 5
#> original_id timepoint antigen antigen_dil value
#> <chr> <chr> <chr> <chr> <dbl>
#> 1 id_005 C_0 msp3 mfi 10.5
#> 2 id_005 C_0 msp3 dil 400
#> 3 id_005 C_0 pf_aarp mfi 22.2
#> 4 id_005 C_0 pf_aarp dil 400
#> 5 id_005 D10 msp3 mfi 8.5
#> 6 id_005 D10 msp3 dil 400
#> 7 id_005 D10 pf_aarp mfi 10.2
#> 8 id_005 D10 pf_aarp dil 400
If you really need the tibble in the format you described, you can use:
dat_1 %>%
pivot_longer(
cols = msp3_mfi:pf_aarp_dil,
names_to = c('antigen', 'antigen_dil'),
names_pattern = '(.+)_(.+)'
) %>%
mutate(
mfi = if_else(antigen_dil == "mfi", value, NA_real_),
dil = if_else(antigen_dil == "dil", value, NA_real_)
) %>%
select(-value)
#> # A tibble: 8 x 6
#> original_id timepoint antigen antigen_dil mfi dil
#> <chr> <chr> <chr> <chr> <dbl> <dbl>
#> 1 id_005 C_0 msp3 mfi 10.5 NA
#> 2 id_005 C_0 msp3 dil NA 400
#> 3 id_005 C_0 pf_aarp mfi 22.2 NA
#> 4 id_005 C_0 pf_aarp dil NA 400
#> 5 id_005 D10 msp3 mfi 8.5 NA
#> 6 id_005 D10 msp3 dil NA 400
#> 7 id_005 D10 pf_aarp mfi 10.2 NA
#> 8 id_005 D10 pf_aarp dil NA 400
The above snippets use the following to create dat_1
:
library(tidyverse)
dat_1 <-
tribble(
~original_id, ~timepoint, ~msp3_mfi, ~msp3_dil, ~pf_aarp_mfi, ~pf_aarp_dil,
"id_005", "C_0", 10.5, 400, 22.2, 400,
"id_005", "D10", 8.5, 400, 10.25, 400
)
dat_1
#> # A tibble: 2 x 6
#> original_id timepoint msp3_mfi msp3_dil pf_aarp_mfi pf_aarp_dil
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 id_005 C_0 10.5 400 22.2 400
#> 2 id_005 D10 8.5 400 10.2 400
Related Topics
Finding Overlap in Ranges with R
An Na in Subsetting a Data.Frame Does Something Unexpected
Using Un-Exported Function from Another R Package
Make Sequential Numeric Column Names Prefixed with a Letter
Calculate Number of Days Between Two Dates in R
Efficient Calculation of Matrix Cumulative Standard Deviation in R
Xpath and Namespace Specification for Xml Documents with an Explicit Default Namespace
How to Scrape/Automatically Download PDF Files from a Document Search Web Interface in R
How to Use the 'Sweep' Function
R Command Line Passing a Filename to Script in Arguments (Windows)
Add Values to a Reactive Table in Shiny
How Achieve Identical Facet Sizes and Scales in Several Multi-Facet Ggplot2 Graphics
R: Ggplot Stacked Bar Chart with Counts on Y Axis But Percentage as Label
How to Generalize Outer to N Dimensions
Mutate Multiple Variable to Create Multiple New Variables
Time Series Plot Gets Offset by 2 Hours If Scale_X_Datetime Is Used
How to Use Cast or Another Function to Create a Binary Table in R