Reshaping Time Series Data from Wide to Tall Format (For Plotting)

Reshaping time series data from wide to tall format (for plotting)

you can also use melt() from the 'reshape' library (I think it's easier to use than reshape() itself) - that'll save you the extra step of having to add the time column back in...

> library(reshape)
> m <- melt(dat,id="t",variable_name="symbol")
> names(m) <- sub("value","price",names(m))
> head(m)
t symbol price
1 2009-01-01 X -1.14945096
2 2009-01-02 X -0.07619870
3 2009-01-03 X 0.01547395
4 2009-01-04 X -0.31493143
5 2009-01-05 X 1.26985167
6 2009-01-06 X 1.31492397
> class(m$t)
[1] "Date"
> library(lattice)
> xyplot( price ~ t | symbol, data=m ,type ="l", layout = c(1,3) )

For this particular task, however, I would consider using the 'zoo' library, which would not require you to reshape the data frame:

> library(zoo)                                                                  
> zobj <- zoo(dat[,-1],dat[,1])
> plot(zobj,col=rainbow(ncol(zobj)))

R developers/contributors (Gabor and Hadley in this case) have blessed us with many great choices. (and can't forget Deepayan for the lattice package)

Convert /reshape a dataset from 'wide to long' format and convert the time column into time format for time-series analysis

I made a few small edits to your sample dataframe based on my comment above:

import pandas as pd 

df = pd.DataFrame({'level':['A','B','C','D','E'],
'Time_30':[1993.05,1999.45, 2001.11, 2007.39, 2219.77],
'Time_60':[2123.15,2299.59, 2339.19, 2443.37, 2553.15],
'Time_90':[2323.56,2495.99,2499.13, 2548.71, 2656.0],
'Time_120':[2355.52,2491.19,2519.92,2611.81, 2753.11],
'Time_150':[2425.31,2599.51, 2539.9, 2713.77, 2893.58],
'Time_180':[2443.35,2609.92, 2632.49, 2774.03, 2901.25]} )

First, manipulate the Time_* column names to be integer values:

timecols = [int(c.replace("Time_","")) for c in df.columns if c != 'level']
df.columns = ['level'] + timecols

After that you can pd.melt() like you were thinking, yielding a datarame with all those "series" you mentioned above concatenated together:

df1 = df.melt(id_vars=['level'], value_vars=timecols, var_name='time', value_name='count').sort_values(['level','time']).reset_index(drop=True)

print(df1.head(10))
level time count
0 A 30 1993.05
1 A 60 2123.15
2 A 90 2323.56
3 A 120 2355.52
4 A 150 2425.31
5 A 180 2443.35
6 B 30 1999.45
7 B 60 2299.59
8 B 90 2495.99
9 B 120 2491.19

If you want to loop over the levels, select them with:

for level in df1['level'].unique():
tmp = df1[df1['level']==level]

or

for level in df1['level'].unique():
tmp = df1[df1['level']==level].copy()

...if you intend to modify/add data to the tmp dataframe.

As for making timestamps, you could do:

df1['time'] = pd.to_timedelta(df1['time'], unit='min')

...like you were attempting, but it depends on how you're using it. If you just want strings that look like "00:30:00", etc, you can try something like:

df1['time'] = pd.to_timedelta(df1['time'], unit='min').apply(lambda x:str(x)[-8:])

Anyway, hope that gets you on track for what you need.

Reshaping data.frame from wide to long format

reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:

reshape(d, 
direction = "long",
varying = list(names(d)[3:7]),
v.names = "Value",
idvar = c("Code", "Country"),
timevar = "Year",
times = 1950:1954)

Reshape data table from wide to long with transpose

A better approach is to use the new pivot_londer and pivot_wider functions from the tidyr package.

Easier convention to use and has convenient text manipulation options built in. In this case removing the "X." that was added to column names.

df <- read.table(header=TRUE, text="Mill   Acid `1_day`  `3_days` `1_week` `2_weeks` `4_weeks` `2_months` `3_months` `6-7_months`
Gävle 0 10.5 12.0 10.9 10.7 10.6 10.1 10 9.81
Gävle 0.5 8.79 10 9.29 9.08 9.39 9.13 9.14 8.86
Gävle 0.75 8.05 8.95 8.33 8.26 8.24 8.22 8.25 7.44
Gävle 1 6.7 7.82 7.77 8.02 8.19 7.79 7.97 6.99
Gävle 1.25 6.52 7.43 7.33 7.11 7.72 7.88 7.91 6.96
Gävle 1.5 6.41 7.25 7.28 6.92 7.63 7.01 7.64 6.7
Obbola 0 10.5 12.0 10.9 10.7 10.6 10.1 10 9.81
Obbola 0.5 8.79 10 9.29 9.08 9.39 9.13 9.14 8.86
Obbola 0.75 8.05 8.95 8.33 8.26 8.24 8.22 8.25 7.44
Obbola 1 6.7 7.82 7.77 8.02 8.19 7.79 7.97 6.99
Obbola 1.25 6.52 7.43 7.33 7.11 7.72 7.88 7.91 6.96
Obbola 1.5 6.41 7.25 7.28 6.92 7.63 7.01 7.64 6.7 ")

library(tidyr)

longdf <- df %>% pivot_longer(-c("Mill", "Acid"), names_to="Time", values_to = "value", names_prefix="X.")

answer <-longdf %>% pivot_wider(id_cols= c("Time", "Acid" ), names_from = "Mill" )

Reshape long form panel data to wide stacked time series

This is a perfect time to try out pivot_table

    user_id  order_date    values
0 11039591 2017-01-01 3277.466
1 11039591 2017-01-02 587.553
2 13629086 2017-01-03 501.882
3 13629086 2017-01-02 1352.546
4 6084613 2017-01-01 441.151

df.pivot_table(index='user_id',columns='order_date',values='values')

Output

order_date  2017-01-01  2017-01-02  2017-01-03
user_id
6084613 441.151 NaN NaN
11039591 3277.466 587.553 NaN
13629086 NaN 1352.546 501.882

How do you convert a wide time series data frame into a long data frame?

you have to specify the id.vars as second argument in the melt function and it works:

require(reshape)
df <- data.frame(date = as.Date(c("1992-03-23", "1992-03-24")),
item_1 = c(8.63, 7.98),
item_2 = c(7.609, 7.634),
item_3 = c(1.6546, 1.6533))

melt(df, "date")

you will get:

        date variable  value
1 1992-03-23 item_1 8.6300
2 1992-03-24 item_1 7.9800
3 1992-03-23 item_2 7.6090
4 1992-03-24 item_2 7.6340
5 1992-03-23 item_3 1.6546
6 1992-03-24 item_3 1.6533

hope this helps

Reshaping the HRS data from wide to long format and creating a time variable

An option using data.table that makes use of the measure/patterns would be to use melt. In the example, the column names have common patterns as 'weight', 'height' which we specify it in measure parameter to convert it to 'long' format, then extract the numeric part with sub to create 'Ind'

library(data.table)
melt(setDT(df), measure = patterns("weight", "height"), value.name = c("W", "H"),
variable.name = "time")[,
Ind := as.integer(sub("\\D+", "", data))][order(Ind)][, .(time, Ind, W, H)]
# time Ind W H
# 1: 1 1 56 151
# 2: 2 1 57 154
# 3: 3 1 56 NA
# 4: 4 1 56 153
# 5: 5 1 55 152
# 6: 1 2 76 163
# 7: 2 2 75 164
# 8: 3 2 76 165
# 9: 4 2 73 162
#10: 5 2 77 161
#11: 1 3 87 173
#12: 2 3 88 NA
#13: 3 3 87 NA
#14: 4 3 85 172
#15: 5 3 84 171
#16: 1 4 64 153
#17: 2 4 66 154
#18: 3 4 65 152
#19: 4 4 63 154
#20: 5 4 65 154

Difficulty with Long to Wide to Long Format in R

An option is to convert to matrix and then melt with the na.rm = TRUE option

library(reshape2)
melt(as.matrix(r.data_lower), na.rm = TRUE)

Reshaping data in R (wide - Long)

We can extract the patterns from the column names using sub, split the sequence of that vector with 'nm1', use that as measure in melt to convert from 'wide' to 'long' format.

library(data.table)
nm1 <- sub("\\d+[[:alpha:]]+$", '', names(df1)[-(1:2)])
lst <- split(seq_along(nm1)+2, nm1)
melt(setDT(df1), measure = lst,
value.name= names(lst), variable.name= 'time')[order(ID)]
# ID Group time M1a M1r M2a M2r
#1: 1 1 1 0.2 200 0.3 300
#2: 1 1 2 0.4 400 0.6 600
#3: 2 1 1 0.3 300 0.4 400
#4: 2 1 2 0.6 600 0.6 600

data

df1 <- structure(list(ID = 1:2, Group = c(1L, 1L),
M1a2hB = c(0.2, 0.3
), M1a3hB = c(0.4, 0.6), M2a2hB = c(0.3, 0.4),
M2a3hB = c(0.6,
0.6), M1r2hB = c(200L, 300L), M1r3hB = c(400L, 600L),
M2r2hB = c(300L,
400L), M2r3hB = c(600L, 600L)), .Names = c("ID", "Group", "M1a2hB",
"M1a3hB", "M2a2hB", "M2a3hB", "M1r2hB", "M1r3hB",
"M2r2hB", "M2r3hB"
), class = "data.frame", row.names = c(NA, -2L))


Related Topics



Leave a reply



Submit