Average Stock History Table

Average stock history table

The special difficulty of this task: you cannot just pick data points inside your time range, but have to consider the latest data point before the time range and the earliest data point after the time range additionally. This varies for every row and each data point may or may not exist. Requires a sophisticated query and makes it hard to use indexes.

You could use range types and operators (Postgres 9.2+) to simplify calculations:

WITH input(a,b) AS (SELECT '2013-01-01'::date  -- your time frame here
, '2013-01-15'::date) -- inclusive borders
SELECT store_id, product_id
, sum(upper(days) - lower(days)) AS days_in_range
, round(sum(value * (upper(days) - lower(days)))::numeric
/ (SELECT b-a+1 FROM input), 2) AS your_result
, round(sum(value * (upper(days) - lower(days)))::numeric
/ sum(upper(days) - lower(days)), 2) AS my_result
FROM (
SELECT store_id, product_id, value, s.day_range * x.day_range AS days
FROM (
SELECT store_id, product_id, value
, daterange (day, lead(day, 1, now()::date)
OVER (PARTITION BY store_id, product_id ORDER BY day)) AS day_range
FROM stock
) s
JOIN (
SELECT daterange(a, b+1) AS day_range
FROM input
) x ON s.day_range && x.day_range
) sub
GROUP BY 1,2
ORDER BY 1,2;

Note, I use the column name day instead of date. I never use basic type names as column names.

In the subquery sub I fetch the day from the next row for each item with the window function lead(), using the built-in option to provide "today" as default where there is no next row.

With this I form a daterange and match it against the input with the overlap operator &&, computing the resulting date range with the intersection operator *.

All ranges here are with exclusive upper border. That's why I add one day to the input range. This way we can simply subtract lower(range) from upper(range) to get the number of days.

I assume that "yesterday" is the latest day with reliable data. "Today" can still change in a real life application. Consequently, I use "today" (now()::date) as exclusive upper border for open ranges.

I provide two results:

  • your_result agrees with your displayed results.

    You divide by the number of days in your date range unconditionally. For instance, if an item is only listed for the last day, you get a very low (misleading!) "average".

  • my_result computes the same or higher numbers.

    I divide by the actual number of days an item is listed. For instance, if an item is only listed for the last day, I return the listed value as average.

To make sense of the difference I added the number of days the item was listed: days_in_range

SQL Fiddle.

Index and performance

For this kind of data, old rows typically don't change. This would make an excellent case for a materialized view:

CREATE MATERIALIZED VIEW mv_stock AS
SELECT store_id, product_id, value
, daterange (day, lead(day, 1, now()::date) OVER (PARTITION BY store_id, product_id
ORDER BY day)) AS day_range
FROM stock;

Then you can add a GiST index which supports the relevant operator &&:

CREATE INDEX mv_stock_range_idx ON mv_stock USING gist (day_range);

Big test case

I ran a more realistic test with 200k rows. The query using the MV was about 6 times as fast, which in turn was ~ 10x as fast as @Joop's query. Performance heavily depends on data distribution. An MV helps most with big tables and high frequency of entries. Also, if the table has columns that are not relevant to this query, a MV can be smaller. A question of cost vs. gain.

I've put all solutions posted so far (and adapted) in a big fiddle to play with:

SQL Fiddle with big test case.

SQL Fiddle with only 40k rows - to avoid timeout on sqlfiddle.com

Calculating and updating table with simple moving average of closing stock prices in MYSQL

There are two ways of doing this:

  1. Create an update query that updates every row in your table
  2. Create a stored procedure that does the job

I personally prefer option 2:

delimiter $$
create procedure movingAvg()
begin
declare mv double;
declare t date;
declare done int default false;
declare cur_t cursor for
select distinct day from symbol
order by day;
declare cur_mv cursor for
select avg(close) from symbol
where day between date_add(t, interval -19 day) and t;
-- Here you define the interval of your MV.
-- If you want a 20-day MV, then the interval is between t-19 and t
declare continue handler for not found set done=true;

open cur_t;
loop_day: loop
fetch cur_t into t;
if not done then
open cur_mv;
fetch cur_mv into mv;
close cur_mv;
update SYMBOL
set moving_average = mv
where day=t;
else
leave loop_day;
end if;
end loop loop_day;
close cur_t;
end;
delimiter ;

How to get average stock price from an array?

=ARRAYFORMULA(AVERAGE.WEIGHTED(
QUERY(IF(C4:C12-G4:G12<>0, D4:D12, ), "where Col1 is not null", 0),
QUERY(IF(C4:C12-G4:G12<>0, C4:C12-G4:G12, ), "where Col1 is not null", 0)))

Sample Image

Get average stock price in Google Sheets from a list

use:

=AVERAGE.WEIGHTED(FILTER(E3:E6; B3:B6="buy"); FILTER(D3:D6; B3:B6="buy"))

Sample Image



update 1:

=INDEX(AVERAGE.WEIGHTED(FILTER(E3:E6; B3:B6="buy"); FILTER(D3:D6; B3:B6="buy")-
{SEQUENCE(COUNTIF(B3:B6; "buy")-1; 1; 0; 0); FILTER(D3:D6; B3:B6="sell")}))

Sample Image



update 2:

=ARRAYFORMULA(AVERAGE.WEIGHTED(FILTER(E3:E6; B3:B6="buy"); IF(0<FILTER(D3:D6; B3:B6="buy")+
IF(0>MMULT(TRANSPOSE((SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))>=
TRANSPOSE(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))))*
QUERY({FILTER(D3:D6; B3:B6="buy");-SUM(FILTER(D3:D6; B3:B6="sell"))}; "offset 1"; 0));
SIGN(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))));
MMULT(TRANSPOSE((SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))>=
TRANSPOSE(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))))*
QUERY({FILTER(D3:D6; B3:B6="buy");-SUM(FILTER(D3:D6; B3:B6="sell"))}; "offset 1"; 0));
SIGN(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy"))))); 0); FILTER(D3:D6; B3:B6="buy")+
IF(0>MMULT(TRANSPOSE((SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))>=
TRANSPOSE(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))))*
QUERY({FILTER(D3:D6; B3:B6="buy");-SUM(FILTER(D3:D6; B3:B6="sell"))}; "offset 1"; 0));
SIGN(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))));
MMULT(TRANSPOSE((SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))>=
TRANSPOSE(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy")))))*
QUERY({FILTER(D3:D6; B3:B6="buy");-SUM(FILTER(D3:D6; B3:B6="sell"))}; "offset 1"; 0));
SIGN(SEQUENCE(COUNTA(FILTER(D3:D6; B3:B6="buy"))))); 0); 0)))

Sample Image

How do I refer to the historical data of different stocks in Pinescript?

You should use the security function for this together with the History reference operator

The first parameter is the ticker, so you can write the stock's name you want to get the data from there.

Then you can request previous day's close price like below:

//@version=5
indicator("My script", overlay=true)
apple_prev_day_close = request.security("BATS:AAPL", "D", close[1])
plot(apple_prev_day_close)


Related Topics



Leave a reply



Submit