Group by Values That Are in Sequence

How can I group identical values that appear consecutively (in sequence order)?

This is a gaps and islands problem. Here is one way to solve it using lag() and a cumulative sum():

select
    min(num) num,
    count(*) count_num
from (
    select
        t.*,
        sum(case when num = lag_num then 0 else 1 end) over(order by id) grp
    from (
        select
            t.*,
            lag(num) over(order by id) lag_num
        from #temp t
    ) t
) t
group by grp

Demo on DB Fiddle:


num | count_num
--: | --------:
  1 |         3
  2 |         1
  1 |         1
  2 |         2
  3 |         3

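The sample table behind that demo isn't shown here; hypothetical data along these lines (assuming an identity id column and an int num column) would reproduce the output above:

create table #temp (id int identity(1,1), num int)

-- made-up values: three 1s, a 2, a lone 1, two 2s, then three 3s
insert into #temp(num) values (1), (1), (1), (2), (1), (2), (2), (3), (3), (3)
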
SUM Values by Sequence Number and Group By Flag

This is a classic gaps-and-islands problem.

However, in this case the start of each island is clearly delineated by a P (or a row that is not C). So we don't need LAG for that.

We just need to assign a grouping ID for each island, which we can do using a windowed conditional COUNT. Then we simply group by that ID.

SELECT
    pv.SalesOrder,
    SalesOrderLine = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.SalesOrderLine END),
    MStockCode = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MStockCode END),
    MPrice = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MPrice END),
    MBomFlag = MIN(CASE WHEN pv.MBomFlag <> 'C' THEN pv.MBomFlag END)
FROM (
    SELECT *,
        GroupingId = COUNT(NULLIF(t.MBomFlag, 'C')) OVER
            (PARTITION BY t.SalesOrder ORDER BY t.SalesOrderLine ROWS UNBOUNDED PRECEDING)
    FROM @tbl t
) pv
GROUP BY
    pv.SalesOrder,
    pv.GroupingId;

Note that NULLIF(t.MBomFlag, 'C') returns NULL if the flag is 'C', so COUNT will only count the other rows. You could also write that explicitly as COUNT(CASE WHEN t.MBomFlag = 'P' THEN 1 END).

db<>fiddle
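
To see how the GroupingId is assigned, here is a minimal, made-up @tbl (the column names come from the query above; the data is hypothetical):

declare @tbl table (SalesOrder int, SalesOrderLine int, MStockCode varchar(20), MPrice decimal(10,2), MBomFlag char(1));

insert into @tbl values
    (1, 1, 'PARENT-A', 100.00, 'P'),  -- non-'C' row: running count becomes 1
    (1, 2, 'CHILD-A1',   0.00, 'C'),  -- NULLIF(...) is NULL, count stays 1
    (1, 3, 'CHILD-A2',   0.00, 'C'),  -- count stays 1
    (1, 4, 'PARENT-B',  50.00, 'P');  -- next island: count becomes 2

-- Lines 1-3 get GroupingId = 1 and line 4 gets GroupingId = 2,
-- so each parent row and its 'C' components collapse into a single group.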

Group by numbers that are in sequence

create table #temp
(
IDUnique int Identity(1,1),
ID int,
grp int
)

Insert into #temp(ID) Values(1)
Insert into #temp(ID) Values(36)
Insert into #temp(ID) Values(37)
Insert into #temp(ID) Values(38)
Insert into #temp(ID) Values(50)
Insert into #temp(ID) Values(51)

declare @IDUnique int
declare @PreviousUnique int
declare @ID int
declare @grp int
declare @Previous int
declare @Row int

DECLARE @getAccountID CURSOR
SET @getAccountID = CURSOR FOR
    SELECT Row_Number() Over(Order by IDUnique) Row, IDUnique, ID From #temp
OPEN @getAccountID
FETCH NEXT FROM @getAccountID INTO @Row, @IDUnique, @ID
WHILE @@FETCH_STATUS = 0
BEGIN
    IF(@Row = 1)
    Begin
        update #temp set grp = 1 Where IDUnique = @IDUnique
        set @Previous = @ID
        set @grp = 1
    End
    Else If (@Previous + 1 = @ID)
    Begin
        update #temp set grp = @grp Where IDUnique = @IDUnique
        set @Previous = @ID
    End
    Else
    Begin
        set @Previous = @ID
        set @grp = @grp + 1
        update #temp set grp = @grp Where IDUnique = @IDUnique
    End
    FETCH NEXT FROM @getAccountID INTO @Row, @IDUnique, @ID
END
CLOSE @getAccountID
DEALLOCATE @getAccountID

Select * from #temp
Drop Table #temp
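
For comparison, the same grp values can be computed without a cursor, using the ID - ROW_NUMBER() gaps-and-islands trick that also appears further down this page. A set-based sketch against the same #temp table (run it before the DROP TABLE; assumes SQL Server 2005 or later):

;with d as (
    -- consecutive IDs share the same (ID - row_number) value
    select IDUnique, ID, ID - row_number() over (order by ID) as island
    from #temp
), g as (
    -- number the islands 1, 2, 3, ...
    select IDUnique, dense_rank() over (order by island) as new_grp
    from d
)
update t
set t.grp = g.new_grp
from #temp t
join g on g.IDUnique = t.IDUnique;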

Group by sequence of True

Identify the groups with cumsum, then filter the rows having True values and use factorize to assign an ordinal number to each unique group:

m = df['val_a']
df.loc[m, 'tx'] = (~m).cumsum()[m].factorize()[0]

Alternatively, you can use groupby + ngroup:

m = df['val_a']
df['tx'] = m[m].groupby((~m).cumsum()).ngroup()


    val_a   tx
0    True  0.0
1    True  0.0
2   False  NaN
3   False  NaN
4   False  NaN
5    True  1.0
6   False  NaN
7   False  NaN
8    True  2.0
9    True  2.0
10   True  2.0
11   True  2.0
12  False  NaN
13   True  3.0
14   True  3.0
GROUP BY and aggregate sequential numeric values

There's much value to @a_horse_with_no_name's answer, both as a correct solution and, like I already said in a comment, as good material for learning how to use different kinds of window functions in PostgreSQL.

And yet I cannot help feeling that the approach taken in that answer is a bit more effort than this problem calls for. Basically, what you need is an additional criterion for grouping before you go on to aggregate years into arrays. You've already got company and profession; now you only need something to distinguish years that belong to different sequences.

That is just what the above-mentioned answer provides, and that is precisely what I think can be done in a simpler way. Here's how:

WITH MarkedForGrouping AS (
    SELECT
        company,
        profession,
        year,
        year - ROW_NUMBER() OVER (
            PARTITION BY company, profession
            ORDER BY year
        ) AS seqID
    FROM atable
)
SELECT
    company,
    profession,
    array_agg(year) AS years
FROM MarkedForGrouping
GROUP BY
    company,
    profession,
    seqID;

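To see why this works, here is a hypothetical company/profession with two runs of years: within a run, both year and ROW_NUMBER() increase by 1, so their difference (seqID) stays constant and only changes when a gap appears.

SELECT company, profession, year,
       year - ROW_NUMBER() OVER (PARTITION BY company, profession ORDER BY year) AS seqID
FROM (VALUES
    ('Acme', 'welder', 2001),
    ('Acme', 'welder', 2002),
    ('Acme', 'welder', 2003),
    ('Acme', 'welder', 2007),
    ('Acme', 'welder', 2008)
) AS atable (company, profession, year);

-- seqID is 2000 for the first three rows and 2003 for the last two,
-- so the outer GROUP BY produces the arrays {2001,2002,2003} and {2007,2008}.
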
Group by sequential data in R

Here is one dplyr option -

library(dplyr)

df %>%
  group_by(gene_name) %>%
  mutate(grp = gene_number - lag(gene_number, default = 0) > 2) %>%
  group_by(grp = cumsum(grp)) %>%
  filter(n() > 1) %>%
  ungroup

#  gene_name          gene_number   grp
#  <chr>                    <int> <int>
#1 ENSMUSG00000000001        4732     1
#2 ENSMUSG00000000001        4733     1
#3 ENSMUSG00000000058        7603     2
#4 ENSMUSG00000000058        7604     2
#5 ENSMUSG00000000058        8246     3
#6 ENSMUSG00000000058        8248     3

For each gene_name, subtract the previous gene_number value from the current one and increment the group count when the difference is greater than 2. Rows whose group contains only a single row are then dropped.

data

df <- structure(list(gene_name = c("ENSMUSG00000000001", "ENSMUSG00000000001", 
"ENSMUSG00000000058", "ENSMUSG00000000058", "ENSMUSG00000000058",
"ENSMUSG00000000058", "ENSMUSG00000000058"), gene_number = c(4732L,
4733L, 7603L, 7604L, 8246L, 8248L, 9001L)),
class = "data.frame", row.names = c(NA, -7L))

SELECT / GROUP BY - sequence of a value

A bit of a complex version due to the lack of analytic functions in MySQL; I'm assuming it's the SUM of val1 you need:

SELECT SUM(val1) val1, MAX(val2) val2
FROM (
    SELECT s1.val1, COUNT(s3.val1) grouping, s1.val2 val2
    FROM sequence s1
    LEFT JOIN sequence s2 ON s1.val1 >= s2.val1
    LEFT JOIN sequence s3 ON s3.val1 = s2.val1 - 1 AND s2.val2 <> s3.val2
    GROUP BY s1.val1
) a
GROUP BY grouping
ORDER BY grouping

An SQLfiddle to test with.

If you add grouping to the selection list to make the order obvious, you get:

val1    val2    grouping
  15       1           0
  40       2           1
  65       1           2

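If you are on MySQL 8.0 or later, window functions are available, and the self-joins can be replaced by the LAG-plus-running-SUM island technique shown at the top of this page. A sketch against the same sequence table (untested; it assumes val1 defines the row order, as in the query above):

SELECT SUM(val1) AS val1, MAX(val2) AS val2
FROM (
    SELECT val1, val2,
           -- start a new group whenever val2 differs from the previous row's val2
           SUM(CASE WHEN val2 = lag_val2 THEN 0 ELSE 1 END) OVER (ORDER BY val1) AS grp
    FROM (
        SELECT val1, val2, LAG(val2) OVER (ORDER BY val1) AS lag_val2
        FROM sequence
    ) s
) g
GROUP BY grp
ORDER BY grp;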
