Simulate Lag Function in MySQL

Simulate lag function in MySQL

This is my favorite MySQL hack.

This is how you emulate the lag function:

SET @quot=-1;
select time,company,@quot lag_quote, @quot:=quote curr_quote
from stocks order by company,time;
  • lag_quote holds the value of previous row's quote. For the first row @quot is -1.
  • curr_quote holds the value of current row's quote.

Notes:

  1. order by clause is important here just like it is in a regular
    window function.
  2. You might also want to use lag for company just to be sure that you are computing difference in quotes of the same company.
  3. You can also implement row counters in the same way @cnt:=@cnt+1

The nice thing about this scheme is that is computationally very lean compared to some other approaches like using aggregate functions, stored procedures or processing data in application server.

EDIT:

Now coming to your question of getting result in the format you mentioned:

SET @quot=0,@latest=0,company='';
select B.* from (
select A.time,A.change,IF(@comp<>A.company,1,0) as LATEST,@comp:=A.company as company from (
select time,company,quote-@quot as change, @quot:=quote curr_quote
from stocks order by company,time) A
order by company,time desc) B where B.LATEST=1;

The nesting is not co-related so not as bad (computationally) as it looks (syntactically) :)

Let me know if you need any help with this.

Simulate LAG with GROUP BY in MySQL 5.6

You could use two correlated subqueries:

select
t.*,
(
select date
from mytable t1
where t1.itemid = t.itemid and t1.date < t.date
order by t.date desc
limit 1
) previous_date,
(
select info
from mytable t1
where t1.itemid = t.itemid and t1.date < t.date
order by t.date desc
limit 1
) previous_info
from mytable t

However this does not scale well when you need to recover more columns from the previous record. In that case, we can self-join with a not exists condition to filter on the previous record:

select
t.*,
tlag.date previous_date,
tlag.info previous_info
from mytable t
left join mytable tlag
on tlag.itemid = t.itemid
and tlag.date < t.date
and not exists (
select 1
from mytable t1
where t1.itemid = t.itemid and t1.date < t.date and t1.date > tlag.date
)

For performance of both queries, consider the following an index on (item_id, date). You might want to add info to the index, like: (item_id, date, info), especially with the first query, so both subqueries are covered by the index.

Simulating lag function on MySQL

Ok, so after some more digging, and the suggestion by @Marc B, I finally got to where I wanted. In case anyone else needs help, here's my answer:

set @price = (select preconormal from precos where codigowine='10088' order by timestamp asc limit 1);

On the line above I'm setting @price to the first record instead of zero, to avoid getting a lag_price of 65.60 on the first row.

Below is the actual result:

select codigowine, preconormal - @price as lag_price, @price:=preconormal curr_price from precos WHERE codigowine='10088' order by timestamp;

Here's the fiddle working.

how to do lag operation in mysql

You can emulate it with user variables:

select uid, operation, previous_operation from (
select
y.*
, @prev AS previous_Operation
, @prev := Operation
from
your_table y
, (select @prev:=NULL) vars
order by uid
) subquery_alias
  • see it working in an sqlfiddle live

Here you initialize your variable(s). It's the same as writing SET @prev:=NULL; before writing your query.

, (select @prev:=NULL) vars

Then the order of these statements in the select clause is important:

, @prev AS previous_Operation
, @prev := Operation

The first just displays the variables value, the second assigns the value of the current row to the variable.

It's also important to have an ORDER BY clause, as the output is otherwise not deterministic.

All this is put into a subquery just out of aesthetic reasons,... to filter out this

, @prev := Operation

column.

Simulating LAG in MySQL during an update

I don't think you can set variables in an update statement.
Here's my reasoning-
Given this

drop table if exists t;

create table t (ID int, TestDate date, PerformanceStatus int, previousperformancestatus int);
insert into t values
(1 , '2016-03-15' , 0, null),
(1 , '2016-04-01' , 2, null),
(1 , '2016-05-05' , 1, null),
(1 , '2016-06-07' , 1, null),
(2 , '2016-03-15' , 0, null),
(2 , '2016-04-01' , 2, null),
(2 , '2016-05-05' , 1, null),
(2 , '2016-06-07' , 3, null),
(2 , '2016-08-23' , 1, null)
;

This code fails

MariaDB [sandbox]> SET @lag = 0;
Query OK, 0 rows affected (0.00 sec)

MariaDB [sandbox]> UPDATE T
-> SET previousPerformanceStatus = @lag ,
-> @lag:=PerformanceStatus
-> ORDER BY ID, TestDate;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '@lag:=PerformanceStatus
ORDER BY ID, TestDate' at line 3

Commenting out @lag:=PerformanceStatus
This code runs

MariaDB [sandbox]> SET @lag = 0;
Query OK, 0 rows affected (0.00 sec)

MariaDB [sandbox]> UPDATE T
-> SET previousPerformanceStatus = @lag
-> #,@lag:=PerformanceStatus
-> ORDER BY ID, TestDate;
Query OK, 0 rows affected (0.01 sec)
Rows matched: 9 Changed: 0 Warnings: 0

Since code at least runs without error and the manual https://dev.mysql.com/doc/refman/5.7/en/update.html states "The SET clause indicates which columns to modify " my take on this is that you cannot set variables in an update statement so simulating lag is not possible using this method.

Translate Lead/Lag function to MySQL

You could express the lead() and lag() function via subquery

select user_id, month(date_time) as visit_month,
month(date_time)-(select month(date_time) from tbl_activity
where user_id = a.user_id and
month(date_time) < month(a.date_time)
order by month(date_time) desc LIMIT 1) as time_diff -- This could be re-express via lag() function
from tbl_activity a
group by user_id, month(date_time);

In above you would need to just specify < / > to express the lead() and lag() functionality in subquery and don't forget to use order by clause

EDIT (Lead) :

select user_id, month(date_time) as visit_month,
(select month(date_time) from tbl_activity
where user_id = a.user_id and
month(date_time) > month(a.date_time)
order by month(date_time) LIMIT 1) as lead -- This could be re-express via lead() function
from tbl_activity a
group by user_id, month(date_time);

Mysql - Simulate lag with multiple previous rows (not just one)

I believe it is as simple as this, the subquery inside select counts the number of purchases within past 30 days:

SELECT t.*, (
SELECT COUNT(*)
FROM t AS x
WHERE userId = t.userId
AND purchaseDate < t.purchaseDate
AND purchaseDate >= t.purchaseDate - INTERVAL 30 DAY
) AS purchases_within_30days
FROM t

MySQL lag/lead function?

This is better done on application level, but just for fun, here it is on database level:

select `user`, `start`, `stop`, diff from (
select
t.*
, if(@prev_user = `user`, (`stop` - @prev) * -1, 0) as diff
, @prev := `start`
, @prev_user := `user`
from
t
, (select @prev := null, @prev_user := null) var_init
order by `user`, `start` desc
) sq
order by `user`, `start`
  • see it working live in an sqlfiddle

Note, that there are no lag/lead functions in MySQL. All you can do, is to use variables. The SELECT clause gets processed one line at a time. So you can assign the value of the current row in the last lines of the SELECT clause and therefore use this variable as the value of "the previous row" in the first lines of the SELECT clause.

Also note, that the ORDER BY is very important. A table is not sorted. Data in a relational DBMS is not sorted unless you specify an order with the ORDER BY clause.

  • read more about using variables in queries here

EDIT:

Change it to

UPDATE inactivitytmp
JOIN (
SELECT
inactivitytmp.*
, if(@prev_user_id = `user_id`, (`end_ts` - @prev) * -1, 0) as diff2
, @prev := `start_ts`
, @prev_user_id := `user_id`
FROM
inactivitytmp
, (SELECT @prev := null, @prev_user_id := null) var_init
ORDER BY `user_id`, `start_ts` DESC
) query_alias
ON inactivitytmp.user_id=query_alias.user_id AND inactivitytmp.start_ts=q uery_alias.start_ts AND inactivitytmp.end_ts=query_alias.end_ts
SET inactivitytmp.diff=query_alias.diff2;

Update Value based on previous row in MySQL

You can use a JOIN and window functions in an UPDATE. One method is:

update t join
(select t.*, lag(number) over (order by cid) as prev_number
from t
) tt
on tt.cid = t.cid
set rum = (number - prev_number) * 100 / prev_number;


Related Topics



Leave a reply



Submit