Convert R Dataframe from Long to Wide Format, but with Unequal Group Sizes, for Use with qcc


You can create a sequence column ('.id') with getanID from splitstackshape, then use dcast from data.table to convert from long to wide format. getanID returns a data.table, and older versions of splitstackshape attach data.table when loaded; if dcast is not found, load data.table as well. When this answer was written, dcast for data.table objects required the development version of data.table; current releases include it.

library(splitstackshape)
library(data.table)  # provides dcast; older splitstackshape versions attach it automatically
dcast(getanID(df1, 'time'), time ~ .id, value.var = 'measure')
#       time          1           2          3         4         5
#1: 2001 Q1  0.1468068  0.53593193  0.5609797        NA        NA
#2: 2001 Q2 -1.4810269  0.18150972         NA        NA        NA
#3: 2001 Q3  1.7201815 -0.08480855 -2.2320888 -1.152691 0.5797502

Update

As @snoram mentioned in the comments, the rowid function from data.table makes it possible to do this with data.table alone:

library(data.table)
dcast(setDT(df1), time ~ rowid(time), value.var = "measure")
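
To make the rowid approach concrete, here is a minimal sketch on made-up data. The values and the layout of df1 are assumptions, loosely modelled on the output above. The last step converts the result to a numeric matrix with one row per group, padded with NA, which is the kind of layout qcc is generally understood to accept for subgroups of unequal size; the qcc call itself is left out.

```r
library(data.table)

# Hypothetical long-format data: three quarters with 3, 2 and 5 readings.
df1 <- data.frame(
  time    = rep(c("2001 Q1", "2001 Q2", "2001 Q3"), times = c(3, 2, 5)),
  measure = c(0.15, 0.54, 0.56, -1.48, 0.18, 1.72, -0.08, -2.23, -1.15, 0.58)
)

# rowid(time) numbers the rows within each group: 1, 2, 3, 1, 2, 1, ..., 5.
wide <- dcast(as.data.table(df1), time ~ rowid(time), value.var = "measure")

# Drop the grouping column to get a plain numeric matrix, one row per group;
# groups with fewer observations are padded with NA on the right.
m <- as.matrix(wide[, !"time"])
rownames(m) <- wide$time
m
```

Using as.data.table() instead of setDT() leaves the original data frame untouched, which is a design choice rather than a requirement.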

Combine long-format data frames with different length and convert to wide format

Using data.table

library(data.table)
dcast(setDT(fd), id ~ paste0('x.time', time), value.var = 'x')

Output

   id x.time1 x.time2 x.time3 x.time4 x.time5
1:  1       0       0       0       0       0
2:  2      NA      NA      NA      NA       1
3:  3      NA      NA       0      NA      NA
4:  4      NA       0       0      NA      NA
5:  5       0      NA      NA      NA      NA
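
The answer starts from an already combined frame fd. As a sketch of the combining step, assuming two long-format frames with the same columns (the frames a and b and their values are invented for illustration), rbindlist can stack them before dcast spreads time across columns:

```r
library(data.table)

# Two hypothetical long-format frames covering different ids and time points.
a <- data.frame(id = c(1, 1, 2), time = c(1, 2, 5), x = c(0, 0, 1))
b <- data.frame(id = c(3, 4, 4), time = c(3, 2, 3), x = c(0, 0, 0))

# Stack the frames, then spread time across columns.
# id/time pairs that never occur are filled with NA.
fd <- rbindlist(list(a, b))
wide <- dcast(fd, id ~ paste0("x.time", time), value.var = "x")
wide
```

Only time points that actually occur get a column, so there is no x.time4 in this toy result.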

Cannot accurately convert from long format to wide in r

Some ID/date/test combinations occur more than once, so we need a sequence column to tell the duplicates apart before reshaping:

library(dplyr)
library(tidyr)
data_ige %>%
  group_by(ID, date, test) %>%
  mutate(rn = row_number()) %>%   # numbers repeated ID/date/test rows 1, 2, ...
  ungroup() %>%
  spread(test, value) %>%
  # spread() is superseded; pivot_wider() is the current equivalent:
  # pivot_wider(names_from = test, values_from = value) %>%
  select(-rn)
# A tibble: 8 x 9
#  ID     date `1`   `3`   `4`   `5`   `6`   `7`   `8`
#  <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A      2008 0.035 NA    NA    NA    NA    NA    NA
#2 A      2011 2.75  NA    NA    NA    NA    NA    NA
#3 B      2011 9.99  3.65  0.68  0.02  0.17  0.5   NA
#4 C      2008 0     NA    NA    NA    NA    NA    NA
#5 C      2011 NA    NA    NA    NA    NA    NA    0.09
#6 D      2008 0     0     0     0     0     0.59  0
#7 D      2011 0     0.49  0.2   0.08  0.16  0.5   0.13
#8 D      2011 9.99  NA    NA    NA    NA    NA    NA

data

data_ige <- structure(list(ID = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L), .Label = c("A", "B", "C", "D"), class = "factor"), date = c(2008,
2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2008, 2011, 2008,
2008, 2008, 2008, 2008, 2008, 2008, 2011, 2011, 2011, 2011, 2011,
2011, 2011), test = c(1, 1, 1, 3, 4, 5, 6, 7, 8, 1, 1, 1, 3,
4, 5, 6, 7, 8, 1, 3, 4, 5, 6, 7, 8), value = c(0.035, 2.75, 9.99,
3.65, 0.68, 0.02, 0.17, 0.5, 0.09, 0, 0, 0, 0, 0, 0, 0, 0.59,
0, 9.99, 0.49, 0.2, 0.08, 0.16, 0.5, 0.13)),
class = "data.frame", row.names = c(NA,
-25L))
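
For a smaller illustration of why the sequence column matters, the following sketch uses a four-row frame (named long here, with invented values) in which ID "D" has two results for test 1 in 2011. It uses pivot_wider rather than spread. Without rn, the two duplicate values would collide in a single cell.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: ID "D" has two results for test 1 in 2011.
long <- tibble(
  ID    = c("A", "D", "D", "D"),
  date  = c(2008, 2011, 2011, 2011),
  test  = c(1, 1, 3, 1),
  value = c(0.035, 0, 0.49, 9.99)
)

wide <- long %>%
  group_by(ID, date, test) %>%
  mutate(rn = row_number()) %>%   # 1 for the first occurrence, 2 for the repeat
  ungroup() %>%
  pivot_wider(names_from = test, values_from = value) %>%
  select(-rn)                     # rn has done its job; drop it from the result
wide
```

The repeated measurement ends up in its own row, with NA for the tests that have no second value.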

How can you turn to a long, tidy format a dataframe with unequal number of columns?

You don't need to pivot here; just bind the rows for each set of columns separately. You can do it manually:

library(tidyverse)

bind_rows(
  df[, 1:3],
  df[, c(1, 4:5)],
  df[, c(1, 6:7)]
)

Then filter out the rows with NA values. If you have more column pairs, you can instead use purrr::map_dfr over a numeric vector of column indices to select each set of columns automatically and bind them together. Then use dplyr::filter(across(...)) to keep only the rows where both x and y are non-missing (newer dplyr versions prefer if_all() for this).

map_dfr(
  seq(2, 6, 2),
  ~ df[, c(1, .x, .x + 1)]
) %>%
  filter(across(c(x, y), ~ !is.na(.x))) %>%
  arrange(id, y, x)
#> # A tibble: 6 × 3
#>   id        x y
#>   <chr> <dbl> <chr>
#> 1 T1        4 A
#> 2 T1        7 A
#> 3 T2        5 B
#> 4 T2        8 B
#> 5 T2        4 F
#> 6 T3        6 C

I added the final dplyr::arrange() call to match your output; adjust it to however you actually want to order your data.
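
Since the original df is not shown, here is a self-contained sketch of the same idea under stated assumptions: the column names x1, y1, x2, ... and the values are invented, and each column pair is renamed to a common x/y before binding so that the rows stack into the same columns.

```r
library(dplyr)
library(purrr)

# Hypothetical wide frame: an id plus repeated (x, y) pairs,
# with NA where a pair is absent.
df <- tibble(
  id = c("T1", "T2", "T3"),
  x1 = c(4, 5, 6),   y1 = c("A", "B", "C"),
  x2 = c(7, 8, NA),  y2 = c("A", "B", NA),
  x3 = c(NA, 4, NA), y3 = c(NA, "F", NA)
)

long <- map(seq(2, 6, by = 2), function(i) {
  # Take id plus one (x, y) pair and give the pair common names.
  set_names(df[, c(1, i, i + 1)], c("id", "x", "y"))
}) %>%
  bind_rows() %>%
  filter(!is.na(x), !is.na(y)) %>%   # keep rows where both values are present
  arrange(id, y, x)
long
```

Renaming inside the loop is one way to handle distinct column names; if the pairs already share names, as the output above suggests, that step is unnecessary.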

Reshape long to wide with dates - R

We can use dcast from data.table, with rowid(id) numbering the dates within each id:

library(data.table)
dcast(setDT(df), id~paste0("date.", rowid(id)), value.var = "date")
#   id     date.1     date.2
#1:  1 2015-01-03 2012-03-04
#2:  2 2016-07-21 2016-09-08

Or using tidyverse

library(dplyr)
library(tidyr)
df %>%
  group_by(id) %>%
  mutate(i1 = paste0("date.", row_number())) %>%
  spread(i1, date)
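
The same reshaping can be written with pivot_wider, the current replacement for spread. This is a sketch that rebuilds the input from the output shown above (the construction of df is an assumption), and it checks that the Date class survives the reshape.

```r
library(dplyr)
library(tidyr)

# Input reconstructed from the output above: two dates per id.
df <- tibble(
  id   = c(1, 1, 2, 2),
  date = as.Date(c("2015-01-03", "2012-03-04", "2016-07-21", "2016-09-08"))
)

wide <- df %>%
  group_by(id) %>%
  mutate(i1 = paste0("date.", row_number())) %>%  # date.1, date.2 within each id
  ungroup() %>%
  pivot_wider(names_from = i1, values_from = date)
wide

# The date columns keep their Date class.
class(wide$date.1)
```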

dcast for huge dataframe [R]

The easy solution in this case turned out to be switching back to the older reshape package, which means using cast instead of dcast. Arun's comments are very useful, provided one can actually update.