Date Difference Between Consecutive Rows

Date Difference between consecutive rows

SELECT  T1.ID,
        T1.AccountNumber,
        T1.Date,
        MIN(T2.Date) AS Date2,
        DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
FROM    YourTable T1
        LEFT JOIN YourTable T2
            ON T1.AccountNumber = T2.AccountNumber
            AND T2.Date > T1.Date
GROUP BY T1.ID, T1.AccountNumber, T1.Date;

or

SELECT  ID,
        AccountNumber,
        Date,
        NextDate,
        DATEDIFF("D", Date, NextDate) AS DaysDiff
FROM    (   SELECT  ID,
                    AccountNumber,
                    Date,
                    (   SELECT  MIN(Date)
                        FROM    YourTable T2
                        WHERE   T2.AccountNumber = T1.AccountNumber
                        AND     T2.Date > T1.Date
                    ) AS NextDate
            FROM    YourTable T1
        ) AS T;
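To try the correlated-subquery version without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and because SQLite has no DATEDIFF, the day difference is computed with julianday() instead; treat that substitution as an assumption of this sketch, not part of the original answer.

```python
import sqlite3

# Hypothetical sample data; table and column names mirror the queries above.
rows = [
    (1, "A", "2024-01-01"),
    (2, "A", "2024-01-05"),
    (3, "A", "2024-01-12"),
    (4, "B", "2024-02-01"),
    (5, "B", "2024-02-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER, AccountNumber TEXT, Date TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)

# SQLite has no DATEDIFF, so the day difference uses julianday() here.
query = """
SELECT ID, AccountNumber, Date, NextDate,
       CAST(julianday(NextDate) - julianday(Date) AS INTEGER) AS DaysDiff
FROM (
    SELECT ID, AccountNumber, Date,
           (SELECT MIN(T2.Date)
            FROM YourTable T2
            WHERE T2.AccountNumber = T1.AccountNumber
              AND T2.Date > T1.Date) AS NextDate
    FROM YourTable T1
) AS T
ORDER BY ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The last row of each account has no later date, so NextDate and DaysDiff come back as NULL (None in Python).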

SQL: Difference between consecutive rows

You can use lag() to get the date of the previous order by the same customer:

select o.*,
datediff(
order_date,
lag(order_date) over(partition by member_id order by order_date, order_id)
) days_diff
from orders o

When there are two rows for the same date, the smallest order_id is considered first. Also note that I fixed your datediff() syntax: in Hive, the function just takes two dates, and no unit.

I just don't get the logic you want to compute num_orders.
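To make the lag() logic and the order_id tie-breaker concrete, here is a rough plain-Python equivalent. The orders and the helper name days_since_previous are hypothetical; the sketch only mimics what the window function does and is not how Hive evaluates it.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

# Hypothetical orders: (order_id, member_id, order_date).
orders = [
    (3, "m1", date(2024, 3, 10)),
    (1, "m1", date(2024, 3, 1)),
    (2, "m1", date(2024, 3, 1)),
    (4, "m2", date(2024, 3, 5)),
]


def days_since_previous(orders):
    """Return {order_id: days since the member's previous order, or None}."""
    result = {}
    # Sort by member, then date, then order_id as the tie-breaker.
    ordered = sorted(orders, key=lambda o: (o[1], o[2], o[0]))
    for _, group in groupby(ordered, key=itemgetter(1)):
        previous_date = None  # lag() yields NULL on the first row of a partition
        for order_id, _member, order_date in group:
            result[order_id] = (
                None if previous_date is None else (order_date - previous_date).days
            )
            previous_date = order_date
    return result


days_diff = days_since_previous(orders)
print(days_diff)
```

Orders 1 and 2 share a date, so order 1 is treated as the earlier one and order 2 gets a difference of 0 days.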

Sql function find date difference between two consecutive rows per group

Based on @Akina's hint, I was able to get the required output with the following query.
Note: my sessionDate column holds both the date and the timestamp.

select
    t.PatientID,
    CAST(t.SessionTimestamp as Date) as SessionDate,
    -- LAG returns the previous row's timestamp (use LEAD for the next one)
    LAG(t.SessionTimestamp) over(partition by t.PatientRecordId order by t.PatientID, t.SessionTimestamp) as Previous_Date
from mytable t
order by
    t.PatientID,
    t.SessionTimestamp

Calculate Time Difference Between Two Consecutive Rows

How about this:

select recordid, transdate,
cast( (transdate - lag(transdate) over (order by transdate)) as time) as diff
from t;

In other words, you can subtract two datetime values and cast the result as a time. You can then format the result however you like.
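The same subtract-then-format idea, sketched in plain Python with made-up timestamps. The helper format_hms is hypothetical. One difference worth noting: a time value cannot represent 24 hours or more, whereas this helper simply keeps counting hours.

```python
from datetime import datetime

# Hypothetical transaction timestamps, already ordered by time.
transdates = [
    datetime(2024, 1, 1, 9, 0, 0),
    datetime(2024, 1, 1, 9, 12, 30),
    datetime(2024, 1, 1, 11, 0, 5),
]


def format_hms(delta):
    """Format a non-negative timedelta as HH:MM:SS."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# The first row has no predecessor, so its diff is None (NULL in SQL).
diffs = [None] + [
    format_hms(current - previous)
    for previous, current in zip(transdates, transdates[1:])
]
print(diffs)
```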

Calculate Date difference between two consecutive rows

You can use the LAG and LEAD window functions for this:

SELECT ID
FROM (
SELECT ID, [DateTime],
DATEDIFF(mi, LAG([DateTime]) OVER (ORDER BY ID), [DateTime]) AS prev_diff,
DATEDIFF(mi, [DateTime], LEAD([DateTime]) OVER (ORDER BY ID)) AS next_diff
FROM mytable) AS t
WHERE prev_diff >= 5 OR next_diff >= 5

Output:

ID
==
5
6

Note: The above query assumes that row order is defined by the ID field. You can substitute any other field that specifies the order in your table.
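Here is a small plain-Python sketch of the same LAG/LEAD filter, in case it helps to see the logic step by step. The rows are hypothetical (not the data from the question), and the timestamps fall on whole minutes so that elapsed minutes match what DATEDIFF(mi, ...) would report.

```python
from datetime import datetime

# Hypothetical rows (ID, DateTime), already ordered by ID.
rows = [
    (1, datetime(2024, 1, 1, 10, 0)),
    (2, datetime(2024, 1, 1, 10, 1)),
    (3, datetime(2024, 1, 1, 10, 3)),
    (4, datetime(2024, 1, 1, 10, 9)),
    (5, datetime(2024, 1, 1, 10, 10)),
]


def minutes_between(start, end):
    """Whole minutes from start to end; None if either side is missing."""
    if start is None or end is None:
        return None
    return int((end - start).total_seconds() // 60)


def ids_near_gap(rows, threshold=5):
    """IDs whose gap to the previous or next row is at least `threshold` minutes."""
    times = [t for _, t in rows]
    prev_times = [None] + times[:-1]  # plays the role of LAG
    next_times = times[1:] + [None]   # plays the role of LEAD
    selected = []
    for (row_id, current), prev, nxt in zip(rows, prev_times, next_times):
        prev_diff = minutes_between(prev, current)
        next_diff = minutes_between(current, nxt)
        # A None diff never satisfies the comparison, like NULL >= 5 in SQL.
        if (prev_diff is not None and prev_diff >= threshold) or (
            next_diff is not None and next_diff >= threshold
        ):
            selected.append(row_id)
    return selected


gap_ids = ids_near_gap(rows)
print(gap_ids)
```

With this sample, rows 3 and 4 sit on either side of the only 6-minute gap, so those are the two IDs selected.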

Getting date difference between consecutive rows in the same group

The code would look something like:

select t.*,
datediff(second, lag(time) over (partition by group order by id), time)
from t;

This returns the difference as a number of seconds, but you seem to know how to convert that to a time representation. You also seem to know that group is not acceptable as a column name, because it is a SQL keyword.

Based on the question, you have put group in the order by clause of the lag(), not the partition by.
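A quick way to see the effect of partition by is to run the query against an in-memory SQLite database from Python. This is only a sketch under a few assumptions: the column is renamed to grp (a stand-in for the reserved word group), the data is invented, seconds are computed with strftime('%s', ...) because SQLite has no datediff, and the bundled SQLite must be 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "grp" stands in for the reserved word group; the data is hypothetical.
conn.execute("CREATE TABLE t (id INTEGER, grp TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "a", "2024-01-01 10:00:00"),
        (2, "b", "2024-01-01 10:00:20"),
        (3, "a", "2024-01-01 10:00:45"),
        (4, "b", "2024-01-01 10:01:30"),
    ],
)

# strftime('%s', ...) returns Unix seconds as text, so cast before subtracting.
query = """
SELECT id, grp,
       CAST(strftime('%s', time) AS INTEGER)
       - CAST(strftime('%s', LAG(time) OVER (PARTITION BY grp ORDER BY id)) AS INTEGER)
       AS seconds_diff
FROM t
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each group's first row has no predecessor inside its own partition, so its difference is NULL even though an earlier row from another group exists.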

calculate time differences between consecutive rows using pandas?

Try this example:

import pandas as pd
import io

s = io.StringIO('''
dates,nums
2017-02-01T00:00:01,1
2017-02-01T00:00:01,2
2017-02-01T00:00:06,3
2017-02-01T00:00:07,4
2017-02-01T00:00:10,5
''')

df = pd.read_csv(s)

At this point the frame looks like this (nums carries no meaning; it is only there as a secondary column of "something"):

                 dates  nums
0  2017-02-01T00:00:01     1
1  2017-02-01T00:00:01     2
2  2017-02-01T00:00:06     3
3  2017-02-01T00:00:07     4
4  2017-02-01T00:00:10     5

Carrying on:

# format as datetime
df['dates'] = pd.to_datetime(df['dates'])

# shift the dates up and into a new column
df['dates_shift'] = df['dates'].shift(-1)

# work out the diff
df['time_diff'] = (df['dates_shift'] - df['dates']) / pd.Timedelta(seconds=1)

# remove the temp column
del df['dates_shift']

# see what you've got
print(df)

                dates  nums  time_diff
0 2017-02-01 00:00:01     1        0.0
1 2017-02-01 00:00:01     2        5.0
2 2017-02-01 00:00:06     3        1.0
3 2017-02-01 00:00:07     4        3.0
4 2017-02-01 00:00:10     5        NaN

To get absolute values, change this line from the code above:

df['time_diff'] = (df['dates_shift'] - df['dates']) / pd.Timedelta(seconds=1)

To:

df['time_diff'] = (df['dates_shift'] - df['dates']).abs() / pd.Timedelta(seconds=1)

Date difference between non-consecutive rows with random time differences and group between rows in T-SQL

If you want one row per category, you can use lag() and lead():

select userid, category, datetimestart,
lead(datetimestart) over (partition by userid order by datetimestart) as datetimeend
from (select t.*,
lag(category) over (partition by userid order by datetimestart) as prev_category
from t
) t
where prev_category is null or prev_category <> category;

You can get the difference using whatever method you prefer. Because this avoids aggregation, this should be the fastest method.

And, if you want to remove the END row, then use the above as a subquery or CTE and filter in the outer query.
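The two steps (keep only rows where the category changes, then take the next kept row's start as the end) can be sketched in plain Python as follows. The events, the helper name category_spans, and the assumption that the final marker row has the category "END" are all hypothetical.

```python
from datetime import datetime

# Hypothetical events: (userid, category, datetimestart).
events = [
    (1, "A", datetime(2024, 1, 1, 9, 0)),
    (1, "A", datetime(2024, 1, 1, 9, 7)),
    (1, "B", datetime(2024, 1, 1, 9, 20)),
    (1, "B", datetime(2024, 1, 1, 9, 31)),
    (1, "END", datetime(2024, 1, 1, 10, 0)),
]


def category_spans(events):
    """Collapse consecutive same-category rows per user into (start, end) spans."""
    ordered = sorted(events, key=lambda e: (e[0], e[2]))
    # Step 1 (the lag filter): keep rows whose category differs from the previous row.
    starts = []
    prev_key = None
    for userid, category, start in ordered:
        if (userid, category) != prev_key:
            starts.append((userid, category, start))
        prev_key = (userid, category)
    # Step 2 (the lead): each span ends where the same user's next span starts.
    spans = []
    for i, (userid, category, start) in enumerate(starts):
        nxt = starts[i + 1] if i + 1 < len(starts) else None
        end = nxt[2] if nxt is not None and nxt[0] == userid else None
        spans.append((userid, category, start, end))
    return spans


spans = category_spans(events)
# Drop the END marker row in an outer filter and compute durations in minutes.
durations = [
    (category, (end - start).total_seconds() / 60)
    for _, category, start, end in spans
    if category != "END"
]
print(durations)
```

Filtering out the END row after the spans are built mirrors the advice to wrap the query in a subquery or CTE: the END row is still needed to supply the end time of the span before it.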


