How can I spread repeated measures of multiple variables into wide format?
Edit: I'm updating this answer since pivot_wider has been around for a while now and addresses the issue in this question and comments. You can now do
pivot_wider(
dat,
id_cols = 'Person',
names_from = 'Time',
values_from = c('Score1', 'Score2', 'Score3'),
names_glue = '{Time}.{.value}'
)
to get the desired result.
The original answer was
dat %>%
gather(temp, score, starts_with("Score")) %>%
unite(temp1, Time, temp, sep = ".") %>%
spread(temp1, score)
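For readers without the original dat at hand, here is a minimal self-contained sketch of the pivot_wider call. The data frame is hypothetical (invented names and scores, only the column layout follows the answer above), so treat it as an illustration rather than the asker's data.

```r
library(tidyr)

# Hypothetical long-format data: two people, each measured at two time points
dat <- data.frame(
  Person = c("greg", "greg", "sally", "sally"),
  Time   = c("Pre", "Post", "Pre", "Post"),
  Score1 = c(88, 78, 76, 78),
  Score2 = c(84, 62, 75, 69),
  Score3 = c(90, 71, 80, 85)
)

# One row per Person; one column per Time/score combination,
# named as "<Time>.<score column>" by names_glue
wide <- pivot_wider(
  dat,
  id_cols     = Person,
  names_from  = Time,
  values_from = c(Score1, Score2, Score3),
  names_glue  = "{Time}.{.value}"
)

print(wide)
```

The result should have one row per person and six score columns such as Pre.Score1 and Post.Score3.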
Converting many variables from long to wide in R
You can create a unique row identifier for every family, id and time combination and then use pivot_wider.
library(dplyr)
D %>%
group_by(family, id, time) %>%
mutate(row = row_number()) %>%
tidyr::pivot_wider(names_from = time, values_from = c(x, y))
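To see why the row identifier matters, here is a small sketch with a hypothetical D (the values are made up) in which each family/id/time combination occurs twice. Without the row column, pivot_wider would have to pack the duplicates into list-columns; with it, each duplicate gets its own output row.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: every (family, id, time) combination appears twice
D <- data.frame(
  family = c(1, 1, 1, 1),
  id     = c(1, 1, 1, 1),
  time   = c(1, 2, 1, 2),
  x      = c(10, 11, 12, 13),
  y      = c(20, 21, 22, 23)
)

wide <- D %>%
  group_by(family, id, time) %>%
  mutate(row = row_number()) %>%   # 1, 2, ... within each combination
  ungroup() %>%
  pivot_wider(names_from = time, values_from = c(x, y))

print(wide)
```

The ungroup() call is an addition of this sketch; it only drops the grouping before reshaping.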
Convert wide data to long format for repeated measures/mixed models
Using tidyr:
tidyr::pivot_longer(wide, -Participant, names_to = "Item", values_to = "Accuracy")
# # A tibble: 9 x 3
# Participant Item Accuracy
# <chr> <chr> <int>
# 1 P1 Banana 0
# 2 P1 Apple 0
# 3 P1 Orange 0
# 4 P2 Banana 1
# 5 P2 Apple 1
# 6 P2 Orange 1
# 7 P3 Banana 0
# 8 P3 Apple 0
# 9 P3 Orange 0
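The input wide is not shown above. A sketch that rebuilds a compatible data frame from the printed output (so the construction of wide is an assumption, not the asker's original code) and reruns the call:

```r
library(tidyr)

# Wide data reconstructed from the printed result: one column per item
wide <- data.frame(
  Participant = c("P1", "P2", "P3"),
  Banana      = c(0L, 1L, 0L),
  Apple       = c(0L, 1L, 0L),
  Orange      = c(0L, 1L, 0L)
)

# Every column except Participant becomes an Item/Accuracy pair
long <- pivot_longer(wide, -Participant,
                     names_to = "Item", values_to = "Accuracy")

print(long)
```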
NA values and extra rows when spreading repeated measures of multiple variables into wide format?
You are almost there. The lat and long values go into different rows because their IndYear is different. As you only keep the first value of IndYear for each IndIDII in the final data.frame, adding IndYear = first(IndYear) to the mutate call will give you the desired result.
library(dplyr)
library(tidyr)
Dat %>%
group_by(IndIDII) %>%
mutate(YearNum = row_number(), IndYear = first(IndYear)) %>%
gather(Group, LatLong, c(WintLat, WintLong)) %>%
unite(GroupNew, YearNum, Group, sep = "-") %>%
spread(GroupNew, LatLong) %>%
as.data.frame()
# IndIDII IndYear 1-WintLat 1-WintLong 2-WintLat 2-WintLong 3-WintLat 3-WintLong 4-WintLat 4-WintLong
# 1 BHS_265 BHS_265-2015 47.61025 -112.7210 47.59884 -112.7089 NA NA NA NA
# 2 BHS_377 BHS_377-2015 43.34744 -109.4821 43.35559 -109.4445 43.35195 -109.4566 43.34765 -109.4892
# 3 BHS_770 BHS_770-2016 42.97379 -109.0400 42.97129 -109.0367 42.97244 -109.0509 NA NA
wide to long multiple measures each time
This is pretty close and changing the names of columns should be within your skillset:
reshape(DF,
varying=c(work= c(3, 7), play= c(4,8), talk= c(5,9), total= c(6,10) ),
direction="long")
EDIT: Adding a version that is almost an exact solution:
reshape(DF, varying=list(work= c(3, 7), play= c(4,8), talk= c(5,9), total= c(6,10) ),
v.names=c("Work", "Play", "Talk", "Total"),
# that was needed after changed 'varying' arg to a list to allow 'times'
direction="long",
times=1:2, # substitutes number for T1 and T2
timevar="times") # to name the time col
Reshaping by ID number into wide format
Not so hard with tidyverse...
library(dplyr)
df<-data.frame(ID=c(100,101,101,101,102,103),
DEGREE=c("BA","BA","MS","PHD","BA","BA"),
YEAR=c(1980,1990, 1992, 1996, 2000, 2004),
stringsAsFactors=FALSE)
df1 <- df %>% select(-3) %>% group_by(ID) %>% mutate(i=row_number()) %>%
as.data.frame() %>%
reshape(direction="wide",idvar="ID",v.names="DEGREE",timevar="i",sep="_")
df1[is.na(df1)] <- ""
df2 <- df %>% select(-2) %>% group_by(ID) %>% mutate(i=row_number()) %>%
as.data.frame() %>%
reshape(direction="wide",idvar="ID",v.names="YEAR",timevar="i",sep="_")
df2[is.na(df2)] <- ""
inner_join(df1,df2,"ID")
# ID DEGREE_1 DEGREE_2 DEGREE_3 YEAR_1 YEAR_2 YEAR_3
#1 100 BA 1980
#2 101 BA MS PHD 1990 1992 1996
#3 102 BA 2000
#4 103 BA 2004
wide to long, multiple variables R
You can use pivot_longer:
tidyr::pivot_longer(wide,
cols = large_firm:large_hi,
names_to = c('firm_size', '.value'),
names_sep = '_')
# id sex age group firm_size firm hi
# <dbl> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
#1 1 2 25 non_unique large 1 1
#2 1 2 25 non_unique small 3 6
#3 2 1 33 unique large 6 2
#4 2 1 33 unique small 1 4
To get the exact output shown, you can use:
library(dplyr)
tidyr::pivot_longer(wide,
cols = large_firm:large_hi,
names_to = c('firm_size', '.value'),
names_sep = '_') %>%
mutate(hi_group = firm_size,
firm_size = paste(firm_size, 'firm', sep = '_')) %>%
rename(firm_preference_score = firm, hi_score = hi)
grouping table by multiple factors and spreading it from long format to wide - the data.table way in R
In data.table, we can group by multiple columns and reshape the result with dcast.
library(data.table)
# mtcars is a plain data.frame, so convert it before using data.table syntax
dcast(as.data.table(mtcars)[, .N, .(carb, cyl, gear)], carb+cyl~gear, value.var = "N")
# carb cyl 3 4 5
#1: 1 4 1 4 NA
#2: 1 6 2 NA NA
#3: 2 4 NA 4 2
#4: 2 8 4 NA NA
#5: 3 8 3 NA NA
#6: 4 6 NA 4 NA
#7: 4 8 5 NA 1
#8: 6 6 NA NA 1
#9: 8 8 NA NA 1
You may use the fill argument in dcast to replace NAs with 0 or any other number.
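A minimal sketch of the fill variant, assuming mtcars is first converted to a data.table (the intermediate names dt, counts and wide are only for illustration):

```r
library(data.table)

# mtcars ships as a data.frame; convert it so data.table syntax applies
dt <- as.data.table(mtcars)

# Count rows per carb/cyl/gear combination
counts <- dt[, .N, by = .(carb, cyl, gear)]

# Spread gear into columns; combinations with no rows become 0 instead of NA
wide <- dcast(counts, carb + cyl ~ gear, value.var = "N", fill = 0)

print(wide)
```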