How to calculate returns from a vector of prices?
Using your sample data, I think you mean the following:
> a <- c(10.25, 11.26, 14, 13.56)
> diff(a)/a[-length(a)]
[1] 0.09853659 0.24333925 -0.03142857
diff returns the vector of lagged differences, and a[-length(a)] drops the last element of a.
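As a quick sanity check, here is a minimal sketch using the same vector. It shows that the diff-based expression agrees with the ratio form, and how it relates to log returns (which a later answer on this page uses). The variable names simple, ratio_form and log_ret are just illustrative.

```r
a <- c(10.25, 11.26, 14, 13.56)

# Simple return: (p[t] - p[t-1]) / p[t-1]
simple <- diff(a) / a[-length(a)]

# Same thing written as a price ratio minus one
ratio_form <- a[-1] / a[-length(a)] - 1

# Log return: log(p[t]) - log(p[t-1])
log_ret <- diff(log(a))

# The two simple-return forms agree, and log returns equal log(1 + simple)
all.equal(simple, ratio_form)
all.equal(log_ret, log(1 + simple))
```

Both forms return one element fewer than the price vector, since the first price has no predecessor.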
SQL: how to calculate the return vector with a time series of prices?
In Oracle, you could use the lag analytic function:
select
price / (lag(price) over (order by i))
, ...
from PriceHistory
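Here, lag(price) over (order by i) returns the price of the previous row, in a set ordered by the i column. The expression therefore yields the price ratio (the gross return); subtract 1 from it if you want the simple return. The first row has no previous row, so its result is NULL.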
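For comparison with the R answers on this page, the same lag-and-divide logic can be sketched in base R. This is only an illustration: the data frame below is hypothetical (it reuses the table and column names from the query, with made-up rows stored out of order), and c(NA, head(x, -1)) stands in for the SQL lag.

```r
# Hypothetical stand-in for the PriceHistory table, deliberately unsorted
PriceHistory <- data.frame(i = c(3, 1, 4, 2),
                           price = c(14, 10.25, 13.56, 11.26))

# "order by i": the lag is only meaningful once the rows are ordered
ordered <- PriceHistory[order(PriceHistory$i), ]

# Previous row's price; the first row has none, so it gets NA
prev_price <- c(NA, head(ordered$price, -1))

ordered$ratio <- ordered$price / prev_price  # gross return, as in the query
ordered$ret   <- ordered$ratio - 1           # simple return
ordered
```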
Calculate Returns over Period of Time
I suggest switching to a time series class, like xts or zoo. But if you just want to get it done, and learn more later, you can do it pretty easily as a data frame. Note that I have to pad the return vector with NAs to make it line up correctly, and that a hold of 20 really buys on day 1 and sells on day 1 + 20:
> library(xts)
> set.seed(2001)
> n <- 50
> hold <- 20
> price <- rep(55, n)
> walk <- rnorm(n)
> for (i in 2:n) price[i] <- price[i-1] + walk[i]
> data <- data.frame(date=as.Date("2001-05-25") + seq(n), price=price)
> data <- transform(data, return=c(diff(log(price), lag=hold), rep(NA, hold)))
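To make the alignment concrete, here is a small sketch with a short, made-up price vector and hold = 2 (both are illustrative values, not taken from the example above). Entry i of the padded vector is the log return from buying at i and selling at i + hold.

```r
price <- c(100, 101, 103, 102, 105, 107)  # hypothetical prices
hold  <- 2

# diff() with lag = hold is hold elements shorter than price,
# so pad the tail with NA to keep one entry per row
fwd <- c(diff(log(price), lag = hold), rep(NA, hold))

# Entry 1 is the return from buying at 1 and selling at 1 + hold
all.equal(fwd[1], log(price[1 + hold] / price[1]))

# The last `hold` rows have no future price to sell at
tail(fwd, hold)
```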
If you're ready for xts or zoo (this should work in either), then I suggest using rollapply to get the forward look (assuming you want the forward-looking return, which makes it a lot easier to form portfolios today and see how they work out in the future):
> data.xts <- xts(data[, -1], data[, 1])
> f <- function(x) log(tail(x, 1)) - log(head(x, 1))
> data.xts$returns.xts <- rollapply(data.xts$price, FUN=f, width=hold+1, align="left", fill=NA)
The two approaches give the same result:
> head(data.xts, hold+2)
price return returns.xts
[1,] 55.00000 0.026746496 0.026746496
[2,] 54.22219 0.029114744 0.029114744
[3,] 53.19811 0.047663206 0.047663206
[4,] 53.50088 0.046470723 0.046470723
[5,] 53.85202 0.041843116 0.041843116
[6,] 54.75061 0.018464467 0.018464467
[7,] 55.52704 -0.001105607 -0.001105607
[8,] 56.15930 -0.024183803 -0.024183803
[9,] 56.61779 -0.010757559 -0.010757559
[10,] 55.51042 0.005494771 0.005494771
[11,] 55.17217 0.044864991 0.044864991
[12,] 56.07005 0.025411005 0.025411005
[13,] 55.47287 0.052408720 0.052408720
[14,] 56.10754 0.034089602 0.034089602
[15,] 56.35584 0.075726190 0.075726190
[16,] 56.40290 0.072824657 0.072824657
[17,] 56.05761 0.070589032 0.070589032
[18,] 55.93916 0.069936575 0.069936575
[19,] 56.50367 0.081570964 0.081570964
[20,] 56.12105 0.116041931 0.116041931
[21,] 56.49091 0.095520517 0.095520517
[22,] 55.82406 0.137245367 0.137245367
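If you want to see why the columns match without installing anything, the left-aligned rolling window can be imitated in base R. This is only a sketch of what the window computation amounts to, not how rollapply is implemented; the short price vector and hold = 2 are made up.

```r
price <- c(100, 101, 103, 102, 105, 107)  # hypothetical prices
hold  <- 2
f <- function(x) log(tail(x, 1)) - log(head(x, 1))

# Left-aligned windows of width hold + 1: window i covers i .. i + hold
starts <- seq_len(length(price) - hold)
rolled <- c(sapply(starts, function(i) f(price[i:(i + hold)])),
            rep(NA, hold))

# The lagged-difference version from the data frame approach
lagged <- c(diff(log(price), lag = hold), rep(NA, hold))

all.equal(rolled, lagged)
```

Since f only looks at the first and last element of each window, it reduces to a lagged difference of log prices.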
How to calculate return in data.table?
You are getting that output because Prices[, names(Prices) != "Date"] returns a logical vector:
> Prices[, names(Prices) != "Date"]
[1] FALSE TRUE TRUE TRUE
And because you can do calculations with logicals, you can also use diff on a logical vector. FALSE is then treated as 0 and TRUE as 1. So basically you were doing diff(c(0,1,1,1)).
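A small base R sketch of that coercion, using a plain character vector in place of names(Prices) (the vector cols_all is just for illustration):

```r
cols_all <- c("Date", "C1", "C2", "C3")

# Comparing names against "Date" gives a logical vector
keep <- cols_all != "Date"
keep

# Arithmetic coerces FALSE to 0 and TRUE to 1
as.integer(keep)

# So diff() happily works on it: 1 - 0, 1 - 1, 1 - 1
diff(keep)
```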
A possible solution for what you want:
cols <- setdiff(names(Prices),"Date")
# option 1:
Prices[, paste0(cols,"_return") := lapply(.SD, function(x) (x - shift(x, fill = NA))/shift(x, fill = NA)), .SDcols = cols][]
# option 2:
Prices[, paste0(cols,"_return") := lapply(.SD, function(x) c(NA,diff(x))/shift(x, fill = NA)), .SDcols = cols][]
which gives:
> Prices
Date C1 C2 C3 C1_return C2_return C3_return
1: 1985-01-31 NA 47 NA NA NA NA
2: 1985-02-28 NA 45 NA NA -0.04255319 NA
3: 1985-03-29 130 56 NA NA 0.24444444 NA
4: 1985-04-30 140 67 NA 0.07692308 0.19642857 NA
5: 1985-05-31 150 48 93 0.07142857 -0.28358209 NA
6: 1985-06-28 160 79 96 0.06666667 0.64583333 0.03225806
7: 1985-07-31 160 56 94 0.00000000 -0.29113924 -0.02083333
8: 1985-08-30 160 77 93 0.00000000 0.37500000 -0.01063830
9: 1985-09-30 160 66 93 0.00000000 -0.14285714 0.00000000
10: 1985-10-31 160 44 93 0.00000000 -0.33333333 0.00000000
11: 1985-11-29 160 55 93 0.00000000 0.25000000 0.00000000
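To see that the two options compute the same thing, and how the leading NAs carry through, here is a base R sketch on the first rows of column C1. I'm assuming c(NA, head(x, -1)) as a stand-in for shift(x, fill = NA), which lags by one position by default.

```r
x <- c(NA, NA, 130, 140, 150, 160)  # first six values of C1

# Base R stand-in for shift(x, fill = NA): previous value, NA in front
prev <- c(NA, head(x, -1))

opt1 <- (x - prev) / prev        # option 1
opt2 <- c(NA, diff(x)) / prev    # option 2

all.equal(opt1, opt2)
opt1
```

The first usable return appears one row after the first non-missing price, because that row is the first one with a non-missing predecessor.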
If you want to create a new data.table, you could use one of the following two options:
# option 1:
Returns <- Prices[, c(list(Date = Date), lapply(.SD, function(x) (x - shift(x, fill = NA))/shift(x, fill = NA))), .SDcols = cols]
# option 2:
Returns <- copy(Prices)
Returns[, (cols) := lapply(.SD, function(x) (x - shift(x, fill = NA))/shift(x, fill = NA)), .SDcols = cols]
Used data:
Prices <- fread("Date C1 C2 C3
31.01.1985 NA 47 NA
28.02.1985 NA 45 NA
29.03.1985 130 56 NA
30.04.1985 140 67 NA
31.05.1985 150 48 93
28.06.1985 160 79 96
31.07.1985 160 56 94
30.08.1985 160 77 93
30.09.1985 160 66 93
31.10.1985 160 44 93
29.11.1985 160 55 93")[, Date := as.Date(Date, "%d.%m.%Y")]
Calculate returns on a daily basis in R
Try this:
cbind(mydf[-1,1],apply(mydf[,-1],2,function(x) diff(x)/head(x,-1)))
Output:
Date AMZN GOOG WFM MSFT
1: 5/1/2016 -0.005023646 0.0009975062 0.011421761 0.004562044
2: 6/1/2016 -0.001798631 0.0014004928 -0.006537949 -0.018165305
3: 7/1/2016 -0.039057964 -0.0231704098 -0.027819324 -0.034782628
4: 8/1/2016 -0.001463983 -0.0164099778 -0.016000000 0.003066973
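The output above looks like a data.table print, so the one-liner presumably assumes mydf is a data.table. With a plain data.frame, mydf[-1,1] drops to a vector and cbind would build a character matrix, so a variant along these lines may be safer. The data below is made up (the real mydf is not shown), and the ticker names AAA and BBB are hypothetical.

```r
# Hypothetical prices; the real mydf from the question is not shown here
mydf <- data.frame(
  Date = c("4/1/2016", "5/1/2016", "6/1/2016"),
  AAA  = c(100, 110, 99),
  BBB  = c(50, 45, 54)
)

# Simple return per price column: difference divided by the previous price
rets <- sapply(mydf[-1], function(x) diff(x) / head(x, -1))

# Drop the first date, since the first row has no previous price
result <- data.frame(Date = mydf$Date[-1], rets)
result
```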
Prices returns calculation in a df with many tickers with dplyr
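In dplyr, we can use lag to get the previous Prices within each ticker. Note that the expression below divides by the current price; for the conventional simple return, divide by lag(Prices) instead.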
library(dplyr)
df %>%
group_by(Tickers) %>%
mutate(returns = (Prices - lag(Prices))/Prices)
# AsofDate Tickers Prices returns
# <date> <fct> <dbl> <dbl>
# 1 2018-01-01 Ticker1 1 NA
# 2 2018-01-02 Ticker1 2 0.5
# 3 2018-01-03 Ticker1 7 0.714
# 4 2018-01-04 Ticker1 4 -0.75
# 5 2018-01-05 Ticker1 2 -1
# 6 2018-01-01 Ticker2 6 NA
# 7 2018-01-02 Ticker2 5 -0.2
# 8 2018-01-03 Ticker2 7 0.286
# 9 2018-01-04 Ticker2 9 0.222
#10 2018-01-05 Ticker2 12 0.25
#11 2018-01-01 Ticker3 11 NA
#12 2018-01-02 Ticker3 11 0
#13 2018-01-03 Ticker3 16 0.312
#14 2018-01-04 Ticker3 14 -0.143
#15 2018-01-05 Ticker3 15 0.0667
In base R, we can use ave with diff:
df$returns <- with(df, ave(Prices, Tickers,FUN = function(x) c(NA,diff(x)))/Prices)
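If what you need is the conventional simple return (change relative to the previous price rather than the current one), a base R sketch could look like this. The data frame is rebuilt from the Tickers and Prices columns printed above; the column name returns_prev is made up for the illustration.

```r
df <- data.frame(
  Tickers = rep(c("Ticker1", "Ticker2", "Ticker3"), each = 5),
  Prices  = c(1, 2, 7, 4, 2,  6, 5, 7, 9, 12,  11, 11, 16, 14, 15)
)

# Previous price within each ticker; the first row of each group has none
prev <- ave(df$Prices, df$Tickers, FUN = function(x) c(NA, head(x, -1)))

# Change relative to the previous price
df$returns_prev <- (df$Prices - prev) / prev
df
```

Grouping matters here: without it, the first row of Ticker2 would be compared against the last price of Ticker1.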