Merge adjacent rows in SQL?
This article provides quite a few possible solutions to your question
http://www.sqlmag.com/blog/puzzled-by-t-sql-blog-15/tsql/solutions-to-packing-date-and-time-intervals-puzzle-136851
This one seems like the most straight forward:
WITH StartTimes AS
(
SELECT DISTINCT username, starttime
FROM dbo.Sessions AS S1
WHERE NOT EXISTS
(SELECT * FROM dbo.Sessions AS S2
WHERE S2.username = S1.username
AND S2.starttime < S1.starttime
AND S2.endtime >= S1.starttime)
),
EndTimes AS
(
SELECT DISTINCT username, endtime
FROM dbo.Sessions AS S1
WHERE NOT EXISTS
(SELECT * FROM dbo.Sessions AS S2
WHERE S2.username = S1.username
AND S2.endtime > S1.endtime
AND S2.starttime <= S1.endtime)
)
SELECT username, starttime,
(SELECT MIN(endtime) FROM EndTimes AS E
WHERE E.username = S.username
AND endtime >= starttime) AS endtime
FROM StartTimes AS S;
Merge two consecutive rows into a column
Provided mytable
contains only valid pairs of in / out events
select EmployeeCode,
max(case EntryType when 'IN' then TIME end ) IN_TIME,
max(case EntryType when 'OUT' then TIME end ) OUT_TIME
from (
select EmployeeCode, EntryType, TIME,
row_number() over(partition by EmployeeCode, EntryType order by TIME) rn
from mytable
)t
group by EmployeeCode, rn
order by EmployeeCode, rn
Otherwise a kind of clean-up is required first.
Merge groups of consecutive rows in T-SQL and sum values from each group
This is a gaps-and-islands problem, as GMB points out. Because you want to keep the first row, I'm going to suggest a lag()
approach instead of the difference of row numbers:
SELECT SessionId, TransactionType, DateTimeEnd,DateStart, sumRank
FROM (SELECT t.*,
SUM(Rank) OVER (PARTITION BY SessionId, TransactionType, grp) as sumRank
FROM (SELECT t.*,
SUM(CASE WHEN prev_st_id = prev_id THEN 0 ELSE 1 END) OVER (ORDER BY id) as grp
FROM (SELECT t.*,
LAG(id) OVER (PARTITION BY SessionId, TransactionType ORDER BY id) as prev_st_id,
LAG(id) OVER (PARTITION BY SessionId ORDER BY id) as prev_id
FROM Transactions t
) t
) t
) t
WHERE prev_st_id <> prev_id OR prev_st_id IS NULL;
What does this do?
- Innermost subquery calculates the lag of the id both overall and by session/transaction type. This uses
id
because it seems more stable than date/times (there are duplicate date/time values in one of the columns). - When the ids are different, then a new island is identified. The cumulative sum identifies the groups.
- This
grp
is then used to calculate values over the entire group, using window functions. - The outer query then just filters down to the first row in each group.
Here is a db<>fiddle.
Merge consecutive duplicate rows in SQL Server
You can use row_number()
s & find the groups to do aggregation :
select item, price, min(validfrom) as validfrm, max(validto) as valdto
from (select t.*,
row_number() over(partition by item order by validfrom desc) as seq1,
row_number() over(partition by item, price order by validfrom desc) as seq2
from table t
) t
group by item, price, (seq1 - seq2)
order by item, validfrm desc;
Microsoft SQL Server 2012 merge data of exactly two adjacent rows into one row
For creating the table and inserting some values
create table mastertable
(
id int,
Name varchar(20),
ParentID int,
Value varchar(50)
)
insert into mastertable values(86, 'Description1',83,'021541-sds2-45515')
insert into mastertable values(87, 'value1',83, '$247.56')
insert into mastertable values(90, 'Description3',83, '021541-sds2-45567')
insert into mastertable values(91,'value3',83, '$66.72')
insert into mastertable values(94, 'Description5',83, '021541-sd72-47715')
insert into mastertable values(95,'value5',83, '$57.11')
Now, try this query:
with m1 as
(
SELECT *, RN = ROW_NUMBER() OVER (order by id) FROM mastertable
),
m2 as
(
SELECT *, RN = ROW_NUMBER() OVER (order by id) FROM mastertable
)
SELECT m1.id, m1.name, m1.parentid, m1.value as value1, m2.value as value2
FROM m1 left join m2
ON m1.RN = m2.RN-1
WHERE m1.RN%2 != 0
EDIT:
For m2, instead of SELECT *, you could also use:
m2 as
(
SELECT id,value, RN = ROW_NUMBER() OVER (order by id) FROM mastertable
)
Combining consecutive rows based on a type column
Gordon beat me to the answer. Yes, this does fit into an Islands and Gaps methodology. I think a LEAD() would suit this issue quite well. But I also tried a second query with a ROW_NUMBER(), which produced a slightly shorter execution plan. Not sure which would be more efficient at scale. That would require a bit more testing.
NOTE 1: I also added an assumed SessionID and UserID to my query. The additional columns may change your final results.
NOTE 2: SQL Fiddle reports that the ROW_NUMBER version runs much faster with fewer "same time" entries, but LEAD version is faster with many "same time" entries.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE foo ( ID int IDENTITY, sessionID int, userid int, AuditType varchar(50), [DateTime] datetime ) ;
INSERT INTO foo ( sessionID, userID, AuditType, [DateTime] )
VALUES
(1,1,'Viewed Record','2017-01-03 11:16:33.000')
, (1,1,'Saved Form','2017-01-04 09:51:36.837')
, (2,2,'Viewed Record','2017-01-04 09:52:00.000')
, (1,1,'Saved Form','2017-01-04 09:52:40.837')
, (1,1,'Saved Form','2017-01-04 09:52:44.837')
, (2,2,'Saved Form','2017-01-04 09:52:45.000')
, (2,2,'Saved Form','2017-01-04 09:52:46.000')
, (2,2,'Saved Form','2017-01-04 09:52:47.000')
, (2,2,'Saved Form','2017-01-04 09:52:48.000')
, (1,1,'Saved Form','2017-01-04 09:52:49.837')
, (1,1,'Saved Form','2017-01-04 09:52:54.837')
, (2,2,'Exported Record','2017-01-04 09:53:00.000')
, (1,1,'Saved Form','2017-01-04 09:54:59.837')
, (1,1,'Exported Record','2017-01-04 09:55:59.837')
, (2,1,'Viewed Record','2017-01-04 10:00:00.000')
, (2,1,'Saved Form','2017-01-04 10:02:00.000')
, (2,1,'Saved Form','2017-01-04 10:04:00.000')
, (2,1,'Saved Form','2017-01-04 10:06:00.000')
, (2,1,'Exported Record','2017-01-04 10:10:00.000')
;
Query 1 (LEAD()):
SELECT s1.sessionID
, s1.userID
, s1.AuditType
, s1.[DateTime]
FROM (
SELECT foo.*
, LEAD(foo.AuditType) OVER ( ORDER BY foo.userID, foo.sessionID, foo.[DateTime] ) AS next_type
FROM foo
) s1
WHERE s1.next_type IS NULL OR s1.next_type <> s1.AuditType
ORDER BY s1.sessionID, s1.userID, s1.[DateTime]
Results:
| sessionID | userID | AuditType | DateTime |
|-----------|--------|-----------------|--------------------------|
| 1 | 1 | Viewed Record | 2017-01-03T11:16:33Z |
| 1 | 1 | Saved Form | 2017-01-04T09:54:59.837Z |
| 1 | 1 | Exported Record | 2017-01-04T09:55:59.837Z |
| 2 | 1 | Viewed Record | 2017-01-04T10:00:00Z |
| 2 | 1 | Saved Form | 2017-01-04T10:06:00Z |
| 2 | 1 | Exported Record | 2017-01-04T10:10:00Z |
| 2 | 2 | Viewed Record | 2017-01-04T09:52:00Z |
| 2 | 2 | Saved Form | 2017-01-04T09:52:48Z |
| 2 | 2 | Exported Record | 2017-01-04T09:53:00Z |
Query 2 (ROW_NUMBER()):
SELECT s1.*
FROM (
SELECT foo.*
, ROW_NUMBER() OVER ( PARTITION BY foo.userID, foo.sessionID, foo.AuditType ORDER BY foo.userID, foo.sessionID, foo.[DateTime] DESC ) AS rn
FROM foo
) s1
WHERE rn = 1
ORDER BY s1.sessionID, s1.userID, s1.[DateTime]
Results:
| ID | sessionID | userid | AuditType | DateTime | rn |
|----|-----------|--------|-----------------|--------------------------|----|
| 1 | 1 | 1 | Viewed Record | 2017-01-03T11:16:33Z | 1 |
| 13 | 1 | 1 | Saved Form | 2017-01-04T09:54:59.837Z | 1 |
| 14 | 1 | 1 | Exported Record | 2017-01-04T09:55:59.837Z | 1 |
| 15 | 2 | 1 | Viewed Record | 2017-01-04T10:00:00Z | 1 |
| 18 | 2 | 1 | Saved Form | 2017-01-04T10:06:00Z | 1 |
| 19 | 2 | 1 | Exported Record | 2017-01-04T10:10:00Z | 1 |
| 3 | 2 | 2 | Viewed Record | 2017-01-04T09:52:00Z | 1 |
| 9 | 2 | 2 | Saved Form | 2017-01-04T09:52:48Z | 1 |
| 12 | 2 | 2 | Exported Record | 2017-01-04T09:53:00Z | 1 |
They should both show:
1,1,'Viewed Record','2017-01-03 11:16:33.000'
1,1,'Saved Form','2017-01-04 09:54:59.837'
1,1,'Exported Record','2017-01-04 09:55:59.837'
2,1,'Viewed Record','2017-01-04 10:00:00.000'
2,1,'Saved Form','2017-01-04 10:06:00.000'
2,1,'Exported Record','2017-01-04 10:10:00.000'
2,2,'Viewed Record','2017-01-04 09:52:00.000'
2,2,'Saved Form','2017-01-04 09:52:48.000'
2,2,'Exported Record','2017-01-04 09:53:00.000'
Need help with a SQL query that combines adjacent rows into a single row
I don't know whether Russ' answer helps you at all. Here is a link to an article that explains how to add row numbers to the results of a query. (Search for "row_number" to find the most likely example.)
Once you have a query numbering the rows properly, you should be able to throw that into a CTE, then select from it twice -- once for odd numbers, then again for even numbers. Have each result return the even numbered value for joining (odd - 1 = even). At that point, you can join the results of the queries and get two products on one row.
Combining adjacent rows based on comparision of values from different colums
This is a form of gaps-and-islands. You can solve it by finding where the islands start -- say by using lag()
and then using a cumulative sum (to identify the groups) and aggregation:
select category, grp, min(date_start), max(date_end),
sum(count)
from (select t.*,
sum(case when prev_date_end >= date_start then 0 else 1 end) over (partition by category order by date_start) as grp
from (select t.*,
lag(date_end) over (partition by category order by date_start) as prev_date_end
from t
) t
) t
group by category, grp;
How to merge consecutive rows of date and time using MySQL
With LAG()
window function:
select Data, Date, Time
from (
select *,
concat(Date, ' ', Time) - interval 30 minute <=
lag(concat(Date, ' ', Time)) over (partition by Data order by Date, Time) flag
from tablename
) t
where coalesce(flag, 0) = 0
order by Data, Date, Time
See the demo.
Results:
> Data | Date | Time
> :---- | :--------- | :-------
> Data1 | 2020-08-19 | 13:00:00
> Data1 | 2020-08-21 | 07:00:00
> Data2 | 2020-08-20 | 08:00:00
Related Topics
Always Show Decimal Places in SQL
Pass Extra Parameter to Postgresql Aggregate Final Function
How to Use the Results of a Stored Procedure from Within Another
"Invalid Column Name" Error on SQL Statement from Openquery Results
Why Is Variable Declared Inside If Statement Created Even When Condition Evaluates to False
There Is Already an Object Named '#Result' in the Database
Copy Multiple Records with a Master-Details Relationship
How to Make This SQL More Efficient
How SQL's Convert Function Work When Converting Datetime to Float
Splitting Variable Length Delimited String Across Multiple Rows (Sql)
Using Patindex to Find Varying Length Patterns in T-Sql