How to Calculate Returns from a Vector of Prices

How to calculate returns from a vector of prices?

Using your sample data, I think you mean the following:

> a <- c(10.25, 11.26, 14, 13.56)
> diff(a)/a[-length(a)]
[1] 0.09853659 0.24333925 -0.03142857

diff returns the vector of lagged differences and a[-length(a)] drops the last element of a.
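As a quick check on the expression above, the same simple returns can be written as the ratio of consecutive prices minus 1. This is only a sketch; the names r_diff and r_ratio are chosen here for illustration.

```r
a <- c(10.25, 11.26, 14, 13.56)

# Simple returns: (p[t] - p[t-1]) / p[t-1]
r_diff <- diff(a) / a[-length(a)]

# Equivalent form: ratio of consecutive prices minus 1.
# a[-1] drops the first element, a[-length(a)] drops the last.
r_ratio <- a[-1] / a[-length(a)] - 1

print(r_diff)
print(all.equal(r_diff, r_ratio))
```

Either form returns one fewer element than the price vector, because the first price has no predecessor.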

SQL: how to calculate the return vector from a time series of prices?

In Oracle, you could use the lag analytic function:

select 
price / (lag(price) over (order by i))
, ...
from PriceHistory

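Here, lag(price) over (order by i) returns the price of the previous row, in a set ordered by the i column. The ratio is the gross return; subtract 1 to get the simple return. For the first row there is no previous price, so lag returns NULL and the result is NULL.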
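For readers who think in R rather than SQL, the query can be mirrored roughly as follows. This is a sketch, not a translation of any particular schema: the PriceHistory data frame and its values are hypothetical (the prices reuse the sample vector from the first answer), and prev_price stands in for lag(price).

```r
# Hypothetical table: i is the ordering column, price the observed price.
PriceHistory <- data.frame(
  i     = c(3, 1, 2, 4),
  price = c(14, 10.25, 11.26, 13.56)
)

# Equivalent of "order by i"
ph <- PriceHistory[order(PriceHistory$i), ]

# Equivalent of lag(price): the previous row's price, NA for the first row
ph$prev_price <- c(NA, head(ph$price, -1))

# price / lag(price) is the gross return; subtracting 1 gives the simple return
ph$gross_return  <- ph$price / ph$prev_price
ph$simple_return <- ph$gross_return - 1

print(ph)
```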

Calculate Returns over Period of Time

I suggest switching to a time series class, like xts or zoo. But if you just want to get it done and learn more later, you can do it pretty easily with a data frame. Note that I have to pad the return vector with NAs to make it line up correctly, and that a hold of 20 really buys on day 1 and sells on day 1 + 20:

> library(xts) 
> set.seed(2001)
> n <- 50
> hold <- 20
> price <- rep(55, n)
> walk <- rnorm(n)
> for (i in 2:n) price[i] <- price[i-1] + walk[i]
> data <- data.frame(date=as.Date("2001-05-25") + seq(n), price=price)
> data <- transform(data, return=c(diff(log(price), lag=hold), rep(NA, hold)))
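To see why the NAs go at the end, here is a small sketch with made-up prices and a shorter holding period. The values and the name fwd are illustrative only.

```r
# Hypothetical prices and a holding period of 2 observations
price <- c(100, 102, 101, 105, 110, 108)
hold  <- 2

# diff(log(price), lag = hold)[i] equals log(price[i + hold]) - log(price[i]),
# so it has `hold` fewer elements than price. Padding at the END aligns each
# return with the date on which the position is opened (forward-looking).
fwd <- c(diff(log(price), lag = hold), rep(NA, hold))

# The first entry is the log return from buying at price[1], selling at price[3]
print(all.equal(fwd[1], log(price[3] / price[1])))
print(fwd)
```

Padding at the front instead, c(rep(NA, hold), diff(...)), would align each return with the date on which the position is closed.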

If you're ready for xts or zoo (this should work in either), then I suggest using rollapply to get the forward look. This assumes you want the forward-looking return, which makes it a lot easier to form portfolios today and see how they perform in the future:

> data.xts <- xts(data[, -1], data[, 1])
> f <- function(x) log(tail(x, 1)) - log(head(x, 1))
> data.xts$returns.xts <- rollapply(data.xts$price, FUN=f, width=hold+1, align="left", fill=NA)

The two approaches give the same result:

> head(data.xts, hold+2)
price return returns.xts
[1,] 55.00000 0.026746496 0.026746496
[2,] 54.22219 0.029114744 0.029114744
[3,] 53.19811 0.047663206 0.047663206
[4,] 53.50088 0.046470723 0.046470723
[5,] 53.85202 0.041843116 0.041843116
[6,] 54.75061 0.018464467 0.018464467
[7,] 55.52704 -0.001105607 -0.001105607
[8,] 56.15930 -0.024183803 -0.024183803
[9,] 56.61779 -0.010757559 -0.010757559
[10,] 55.51042 0.005494771 0.005494771
[11,] 55.17217 0.044864991 0.044864991
[12,] 56.07005 0.025411005 0.025411005
[13,] 55.47287 0.052408720 0.052408720
[14,] 56.10754 0.034089602 0.034089602
[15,] 56.35584 0.075726190 0.075726190
[16,] 56.40290 0.072824657 0.072824657
[17,] 56.05761 0.070589032 0.070589032
[18,] 55.93916 0.069936575 0.069936575
[19,] 56.50367 0.081570964 0.081570964
[20,] 56.12105 0.116041931 0.116041931
[21,] 56.49091 0.095520517 0.095520517
[22,] 55.82406 0.137245367 0.137245367

How to calculate return in data.table?

You are getting that output because Prices[, names(Prices) != "Date"] returns a logical vector:

> Prices[, names(Prices) != "Date"]
[1] FALSE TRUE TRUE TRUE

And because you can do calculations with logicals, you can also use diff on a logical vector. FALSE is then treated as a 0 and TRUE as a 1. So basically you were doing diff(c(0,1,1,1)).
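The coercion can be reproduced on its own, without data.table. A minimal sketch, using the logical vector printed above:

```r
flags <- c(FALSE, TRUE, TRUE, TRUE)

# Logicals are coerced to numbers in arithmetic: FALSE -> 0, TRUE -> 1
print(as.numeric(flags))   # 0 1 1 1

# diff() therefore runs without error, but on the flags, not on any prices
print(diff(flags))         # 1 0 0
```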


A possible solution for what you want:

cols <- setdiff(names(Prices),"Date")

# option 1:
Prices[, paste0(cols,"_return") := lapply(.SD, function(x) (x - shift(x, fill = NA))/shift(x, fill = NA)), .SDcols = cols][]

# option 2:
Prices[, paste0(cols,"_return") := lapply(.SD, function(x) c(NA,diff(x))/shift(x, fill = NA)), .SDcols = cols][]

which gives:

> Prices
Date C1 C2 C3 C1_return C2_return C3_return
1: 1985-01-31 NA 47 NA NA NA NA
2: 1985-02-28 NA 45 NA NA -0.04255319 NA
3: 1985-03-29 130 56 NA NA 0.24444444 NA
4: 1985-04-30 140 67 NA 0.07692308 0.19642857 NA
5: 1985-05-31 150 48 93 0.07142857 -0.28358209 NA
6: 1985-06-28 160 79 96 0.06666667 0.64583333 0.03225806
7: 1985-07-31 160 56 94 0.00000000 -0.29113924 -0.02083333
8: 1985-08-30 160 77 93 0.00000000 0.37500000 -0.01063830
9: 1985-09-30 160 66 93 0.00000000 -0.14285714 0.00000000
10: 1985-10-31 160 44 93 0.00000000 -0.33333333 0.00000000
11: 1985-11-29 160 55 93 0.00000000 0.25000000 0.00000000
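The leading NA values in the return columns follow from how NA propagates through arithmetic: a return needs both the current and the previous price. The base R sketch below mirrors option 2 for column C1 only; c(NA, head(x, -1)) is used here as a stand-in for shift(x), which is an assumption made to keep the example free of package dependencies.

```r
# First six values of column C1 from the data used in this answer
C1 <- c(NA, NA, 130, 140, 150, 160)

# Previous observation; plays the role of shift(C1) in the data.table code
prev <- c(NA, head(C1, -1))

# Same formula as option 2: c(NA, diff(x)) / shift(x)
C1_return <- c(NA, diff(C1)) / prev

# Row 3 is still NA because its previous price (row 2) is missing
print(C1_return)
```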

If you want to create a new data.table, you could use one of the following two options:

# option 1:
Returns <- Prices[, c(list(Date = Date), lapply(.SD, function(x) (x - shift(x, fill = NA))/shift(x, fill = NA))), .SDcols = cols]

# option 2:
Returns <- copy(Prices)
Returns[, (cols) := lapply(.SD, function(x) (x - shift(x, fill = NA))/shift(x, fill = NA)), .SDcols = cols]

Data used:

Prices <- fread("Date        C1  C2  C3
31.01.1985 NA 47 NA
28.02.1985 NA 45 NA
29.03.1985 130 56 NA
30.04.1985 140 67 NA
31.05.1985 150 48 93
28.06.1985 160 79 96
31.07.1985 160 56 94
30.08.1985 160 77 93
30.09.1985 160 66 93
31.10.1985 160 44 93
29.11.1985 160 55 93")[, Date := as.Date(Date, "%d.%m.%Y")]

Calculate returns on a daily basis in R

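Try this, which applies diff(x)/head(x,-1) to every price column and binds the result to the dates, dropping the first date because it has no return: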

cbind(mydf[-1,1],apply(mydf[,-1],2,function(x) diff(x)/head(x,-1)))

Output:

       Date         AMZN          GOOG          WFM         MSFT
1: 5/1/2016 -0.005023646 0.0009975062 0.011421761 0.004562044
2: 6/1/2016 -0.001798631 0.0014004928 -0.006537949 -0.018165305
3: 7/1/2016 -0.039057964 -0.0231704098 -0.027819324 -0.034782628
4: 8/1/2016 -0.001463983 -0.0164099778 -0.016000000 0.003066973
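If mydf is a plain data.frame rather than a data.table, cbind on a character date vector and a numeric matrix would coerce everything to character. A hedged alternative is to rebuild a data frame explicitly. The tickers AAA and BBB and all values below are made up for illustration.

```r
# Hypothetical prices; first column is the date, the rest are price columns
mydf <- data.frame(
  Date = c("4/1/2016", "5/1/2016", "6/1/2016"),
  AAA  = c(100, 110, 99),
  BBB  = c(50, 50, 55)
)

# Simple return of each price column: (p[t] - p[t-1]) / p[t-1]
rets <- lapply(mydf[-1], function(x) diff(x) / head(x, -1))

# Rebuild a data frame, dropping the first date (it has no return);
# the numeric columns stay numeric
returns <- data.frame(Date = mydf$Date[-1], rets)
print(returns)
```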

Prices returns calculation in a df with many tickers with dplyr

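In dplyr, we can use lag to get the previous price within each ticker. Note that this divides the price change by the current price; the conventional simple return divides by lag(Prices) instead.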

library(dplyr)
df %>%
group_by(Tickers) %>%
mutate(returns = (Prices - lag(Prices))/Prices)

# AsofDate Tickers Prices returns
# <date> <fct> <dbl> <dbl>
# 1 2018-01-01 Ticker1 1 NA
# 2 2018-01-02 Ticker1 2 0.5
# 3 2018-01-03 Ticker1 7 0.714
# 4 2018-01-04 Ticker1 4 -0.75
# 5 2018-01-05 Ticker1 2 -1
# 6 2018-01-01 Ticker2 6 NA
# 7 2018-01-02 Ticker2 5 -0.2
# 8 2018-01-03 Ticker2 7 0.286
# 9 2018-01-04 Ticker2 9 0.222
#10 2018-01-05 Ticker2 12 0.25
#11 2018-01-01 Ticker3 11 NA
#12 2018-01-02 Ticker3 11 0
#13 2018-01-03 Ticker3 16 0.312
#14 2018-01-04 Ticker3 14 -0.143
#15 2018-01-05 Ticker3 15 0.0667

In base R, we can use ave with diff

df$returns <- with(df, ave(Prices, Tickers,FUN = function(x) c(NA,diff(x)))/Prices)
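Both snippets in this answer divide the price change by the current price. If the conventional simple return is wanted (change divided by the previous price, as in the other answers on this page), a base R sketch could look like the following. The data are the first two tickers from the output above, and the rows are assumed to be sorted by date within each ticker.

```r
# Prices for Ticker1 and Ticker2, copied from the dplyr output above
df <- data.frame(
  Tickers = rep(c("Ticker1", "Ticker2"), each = 5),
  Prices  = c(1, 2, 7, 4, 2, 6, 5, 7, 9, 12)
)

# Within each ticker: (p[t] - p[t-1]) / p[t-1], with NA for the first row.
# ave() applies FUN per group and returns a vector in the original row order.
df$returns <- with(df, ave(Prices, Tickers,
                           FUN = function(x) c(NA, diff(x) / head(x, -1))))
print(df)
```

With dplyr, the corresponding change would be to divide by lag(Prices) inside mutate.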

