Oracle: How to "Group By" Over a Range

Oracle: how to group by over a range?

SELECT CASE
         WHEN age <= 10 THEN '1-10'
         WHEN age <= 20 THEN '11-20'
         ELSE '21+'
       END AS age,
       COUNT(*) AS n
FROM age
GROUP BY CASE
           WHEN age <= 10 THEN '1-10'
           WHEN age <= 20 THEN '11-20'
           ELSE '21+'
         END

Oracle - grouping by category, range of dates

You are trying to find adjacent values. One method uses a difference of row numbers:

select min(start_date) as start_date, max(start_date) as end_date, color
from (select t.*,
             row_number() over (order by start_date) as seqnum,
             row_number() over (partition by color order by start_date) as seqnum_c
      from t
     ) t
group by (seqnum - seqnum_c), color;

It is a bit challenging to explain why the difference of row numbers works. I encourage you to run the subquery and to stare at the numbers. You should be able to see why the difference is constant for adjacent color values.
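
As a rough illustration of why the difference stays constant, here is a minimal Python sketch that mimics the two row_number() calls. The sample rows and the function name islands are made up for this example; they are not from the question.

```python
from collections import defaultdict

# Hypothetical rows, already sorted by start_date: (start_date, color).
rows = [
    (1, "red"), (2, "red"), (3, "blue"),
    (4, "blue"), (5, "red"), (6, "red"),
]

def islands(rows):
    """Group adjacent rows of the same color via the row-number difference."""
    per_color = defaultdict(int)   # running row number within each color
    groups = {}                    # (difference, color) -> [min_date, max_date]
    for seqnum, (start_date, color) in enumerate(rows, start=1):
        per_color[color] += 1
        seqnum_c = per_color[color]
        key = (seqnum - seqnum_c, color)   # constant while the color repeats
        if key not in groups:
            groups[key] = [start_date, start_date]
        else:
            groups[key][1] = start_date
    return sorted((lo, hi, color) for (_, color), (lo, hi) in groups.items())

result = islands(rows)
print(result)
```

Both counters advance by one on each consecutive row of the same color, so their difference does not move. Once another color intervenes, only the overall counter advances, and the next red rows get a new difference.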

How to “group by” over a DATETIME range?

Thank you all for your answers. After looking through them, I was able to write the query I was looking for:

SELECT CASE
         WHEN EXTRACT(HOUR FROM TX.DATETIME) >= 5 THEN TO_CHAR(TX.DATETIME, 'DD-MM-YYYY')
         WHEN EXTRACT(HOUR FROM TX.DATETIME) BETWEEN 0 AND 2 THEN TO_CHAR(TX.DATETIME - 1, 'DD-MM-YYYY')
         WHEN EXTRACT(HOUR FROM TX.DATETIME) BETWEEN 2 AND 5 THEN TO_CHAR(TX.DATETIME - 1, 'DD-MM-YYYY')
       END AS age,
       NVL(SUM(TX.AMOUNT), 0) AS sales
FROM TRANSACTION TX
WHERE TX.DATETIME > TO_DATE('20100801 08:59:59', 'yyyymmdd hh24:mi:ss')
  AND TX.DATETIME < TO_DATE('20100901 09:00:00', 'yyyymmdd hh24:mi:ss')
GROUP BY CASE
           WHEN EXTRACT(HOUR FROM TX.DATETIME) >= 5 THEN TO_CHAR(TX.DATETIME, 'DD-MM-YYYY')
           WHEN EXTRACT(HOUR FROM TX.DATETIME) BETWEEN 0 AND 2 THEN TO_CHAR(TX.DATETIME - 1, 'DD-MM-YYYY')
           WHEN EXTRACT(HOUR FROM TX.DATETIME) BETWEEN 2 AND 5 THEN TO_CHAR(TX.DATETIME - 1, 'DD-MM-YYYY')
         END
ORDER BY 1

Oracle SQL Group Over Integer Range

Huh? Just do a join and group by:

SELECT gl.GRADE, COUNT(*) AS STUDENT_COUNT
FROM STUDENT_MARKS sm JOIN
     GRADE_LOOKUP gl
     ON sm.student_mark BETWEEN gl.LOWER_MARK AND gl.UPPER_MARK
GROUP BY gl.GRADE
ORDER BY gl.GRADE;

is there a way to group by using values in a range in sql

If you just want the counts, then you don't even need to group, you can just use conditional aggregation:

with  -- For testing only; remove and use actual table name in SELECT statement
  your_table (ref_num, tran_amt) as (
    select 1612, 2500      from dual union all
    select 1613, 51800000  from dual union all
    select 1614, 2170000   from dual union all
    select 1615, 100       from dual union all
    select 1616, 2442876.5 from dual union all
    select 1617, 25000     from dual union all
    select 1618, 250       from dual union all
    select 1619, 7000      from dual union all
    select 1610, 51500     from dual union all
    select 1621, 15000     from dual union all
    select 1622, 20        from dual
  )
select count (case when tran_amt <= 5000 then 1 end) as amt_to_5000,
       count (case when tran_amt > 5000
                    and tran_amt <= 50000 then 1 end) as amt_from_5000_to_50000,
       count (case when tran_amt > 50000 then 1 end) as amt_over_50000
from your_table
;

AMT_TO_5000 AMT_FROM_5000_TO_50000 AMT_OVER_50000
----------- ---------------------- --------------
          4                      3              4

Note how I use non-strict and strict inequalities, not BETWEEN. With BETWEEN and whole-number bounds (for example, one range ending at 5000 and the next starting at 5001), you would miss amounts like 5000.83: they would not be counted anywhere.
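
To make the gap concrete, here is a small Python sketch of both bucketing styles. The whole-number bounds in bucket_between (5001, 50001) are an assumption about how one might be tempted to write the BETWEEN ranges, and both function names are invented for this illustration.

```python
def bucket_between(amount):
    """Closed ranges with whole-number bounds, as BETWEEN would express them (assumed bounds)."""
    if 0 <= amount <= 5000:
        return "amt_to_5000"
    if 5001 <= amount <= 50000:
        return "amt_from_5000_to_50000"
    if amount >= 50001:
        return "amt_over_50000"
    return None  # the amount falls between two ranges and is counted nowhere

def bucket_half_open(amount):
    """Strict and non-strict inequalities, matching the query above."""
    if amount <= 5000:
        return "amt_to_5000"
    if amount <= 50000:
        return "amt_from_5000_to_50000"
    return "amt_over_50000"

print(bucket_between(5000.83))    # None
print(bucket_half_open(5000.83))  # amt_from_5000_to_50000
```

Because each upper bound in the second version is also the exclusive lower bound of the next range, no fractional amount can slip through.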

How to group a series of numbers in oracle sql, with a specific number provided

If you need the range representation as shown in your example:

-- test data
with data(empid) as
 ( -- complete list 1..42
  select level
    from dual
  connect by level <= 42
  minus (
         -- minus gaps
         select level
           from dual
          where level between 11 and 13
         connect by level <= 42
         union
         select level
           from dual
          where level between 19 and 30
         connect by level <= 42
         union
         select level
           from dual
          where level between 37 and 40
         connect by level <= 42))

-- select:
select listagg(case
                 when minempid = maxempid then
                  minempid || ' '
                 else
                  (minempid || '-' || maxempid)
               end,
               ', ') within group(order by minempid),
       sum(cnt)
  from (select grp,
               seq,
               min(empid) as minempid,
               max(empid) as maxempid,
               count(*) cnt
          from (select empid, rn, empid - rn as seq, ceil(rn / 7) as grp
                  from (select empid, row_number() over(order by empid) rn
                          from data))
         group by grp, seq)
 group by grp;

Grouping based on value by date range

Tabibitosan handles this very easily:

WITH your_table AS (SELECT to_date('01/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual UNION ALL
SELECT to_date('02/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual UNION ALL
SELECT to_date('03/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual UNION ALL
SELECT to_date('04/01/2017', 'dd/mm/yyyy') daytime, 35000 VALUE FROM dual UNION ALL
SELECT to_date('05/01/2017', 'dd/mm/yyyy') daytime, 35000 VALUE FROM dual UNION ALL
SELECT to_date('06/01/2017', 'dd/mm/yyyy') daytime, 40000 VALUE FROM dual UNION ALL
SELECT to_date('07/01/2017', 'dd/mm/yyyy') daytime, 40000 VALUE FROM dual UNION ALL
SELECT to_date('08/01/2017', 'dd/mm/yyyy') daytime, 50000 VALUE FROM dual UNION ALL
SELECT to_date('09/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual)
-- end of mimicking your table with data in it. See SQL below:
SELECT MIN(daytime) fromdate,
       MAX(daytime) todate,
       VALUE
FROM   (SELECT daytime,
               VALUE,
               row_number() OVER (ORDER BY daytime) - row_number() OVER (PARTITION BY VALUE ORDER BY daytime) grp
        FROM   your_table)
GROUP BY grp,
         VALUE
ORDER BY MIN(daytime);

FROMDATE   TODATE          VALUE
---------- ---------- ----------
01/01/2017 03/01/2017      20000
04/01/2017 05/01/2017      35000
06/01/2017 07/01/2017      40000
08/01/2017 08/01/2017      50000
09/01/2017 09/01/2017      20000

What this does is compare two row numbers for each row: its position among all the rows ordered by date, and its position among the rows with the same value ordered by date. If the rows for a value are consecutive in the main set of data, then the difference between the two row numbers remains the same, so you can then group by that. If rows with other values come in between, then the difference increases.

In your example above, the first three rows for value = 20000 happen to be the first three rows of the whole set, so the difference will be 0. However the fourth value = 20000 row is the 9th row in the whole set, so the difference is now 5. You can easily see that the value of 20000 falls into two groups, and as such, you can find the min/max daytime for each group separately by including that difference calculation in the group by clause.

N.B. This does assume that the dates in your data are consecutive or that if there are missing dates that you assume the value stays the same for the missing dates. If you do have missing days and you want the values across a gap to show in different groups, you'd need to outer join to a subquery that contains the missing dates. In that case, I think GurV's answer (with the additional clause in the case statement that I mentioned in the comments) would be the best one to use, as that would avoid the need to outer join to a list of consecutive dates.
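
For the case where a missing day should split a group, one possible variation (not the query above, and not the other answer mentioned) is to subtract the per-value row number from the date itself, so that a skipped day changes the result. Below is a minimal Python sketch of that idea; the sample readings and the name runs_with_gaps are made up for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical data: 3 January is missing on purpose.
readings = [
    (date(2017, 1, 1), 20000),
    (date(2017, 1, 2), 20000),
    (date(2017, 1, 4), 20000),
    (date(2017, 1, 5), 35000),
]

def runs_with_gaps(readings):
    """Group consecutive days with the same value; a missing day starts a new group."""
    per_value = defaultdict(int)   # row number within each value, ordered by day
    groups = {}                    # (value, anchor) -> (first_day, last_day)
    for day, value in sorted(readings):
        per_value[value] += 1
        # day minus row number is constant only while the days are consecutive
        anchor = day - timedelta(days=per_value[value])
        first, _ = groups.get((value, anchor), (day, day))
        groups[(value, anchor)] = (first, day)
    return sorted((first, last, value) for (value, _), (first, last) in groups.items())

result = runs_with_gaps(readings)
for first, last, value in result:
    print(first, last, value)
```

In SQL terms this would correspond to grouping by the value and by the date minus a row number partitioned by value, which is an assumption about how one might adapt the query rather than something taken from the answers.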

group by on a range

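You might use a CASE expression to get counts for mutually exclusive ranges (each row is counted only in the first branch it matches, so the '0-12' row here holds only the values above 6):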

select case when [vendor experience] <= 6 then '0-6'
            when [vendor experience] <= 12 then '0-12'
            when [vendor experience] <= 18 then '0-18'
            else 'more'
       end [vendor_experience(months)],
       count (*) [count]
  from experiences
 group by
       case when [vendor experience] <= 6 then '0-6'
            when [vendor experience] <= 12 then '0-12'
            when [vendor experience] <= 18 then '0-18'
            else 'more'
       end

This produces the same result as yours (inclusive ranges):

; with ranges as
(
  select 6 as val, 0 as count_all
  union all
  select 12, 0
  union all
  select 18, 0
  union all
  select 0, 1
)
select case when ranges.count_all = 1
            then 'more'
            else '0-' + convert (varchar(10), ranges.val)
       end [vendor_experience(months)],
       sum (case when ranges.count_all = 1
                   or experiences.[vendor experience] <= ranges.val
                 then 1 end) [count]
  from experiences
 cross join ranges
 group by ranges.val, ranges.count_all

count_all is set to 1 to mark the open-ended range.


UPDATE: an attempt at explanation.

The first part, starting with WITH and ending with the closing parenthesis, is called a common table expression (CTE). It is sometimes compared to an inline view, except that it can be referenced multiple times in the same query, and under some circumstances it is updatable. Here it prepares the data for the ranges and is named ranges accordingly; the main query refers to it by that name. val is the maximum value of a range, and count_all is 1 if the range has no upper end (18+, more, or whatever you wish to call it). The data rows are combined with UNION ALL. You might copy and paste only the section between the parentheses and run it to see the results.

The main body joins the experiences table with ranges using a cross join. This creates every combination of rows from experiences and ranges. For the row with empname d and vendor experience 11, there will be 4 rows:

empname  vendor experience  val  count_all
d        11                 6    0
d        11                 12   0
d        11                 18   0
d        11                 0    1

The first CASE expression in the select list produces the caption by checking count_all: if it is 1, it outputs more; otherwise it builds the caption from the upper range value. The second CASE expression does the counting inside sum(): it yields 1 for a match and, having no ELSE, evaluates to NULL otherwise. Because aggregate functions ignore NULLs, it is sufficient to check whether count_all is 1 (meaning that every row from experiences is counted in this range) or whether vendor experience is less than or equal to the upper value of the current range. In the example above, 11 will not be counted for the first range but will be counted for all the rest.

The results are then grouped by val and count_all. To see better how it works, you might remove the GROUP BY and sum() and look at the numbers before aggregation. Ordering by empname, val will help show how the values of [count] change with val for each employee.
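
The same cross-join logic can be traced in a few lines of Python. This is only a sketch: apart from d with 11 months, the employees below are made-up sample data, and inclusive_counts is a name chosen for this illustration.

```python
# (empname, vendor experience in months); only ("d", 11) comes from the example above.
experiences = [("a", 3), ("b", 5), ("c", 8), ("d", 11), ("e", 20)]

# (val, count_all), the same four rows the ranges CTE produces.
ranges = [(6, 0), (12, 0), (18, 0), (0, 1)]

def inclusive_counts(experiences, ranges):
    """Pair every employee with every range and count the matches per range."""
    counts = {}
    for val, count_all in ranges:
        label = "more" if count_all == 1 else f"0-{val}"
        counts[label] = sum(
            1
            for _, months in experiences
            if count_all == 1 or months <= val   # same test as the SQL CASE
        )
    return counts

counts = inclusive_counts(experiences, ranges)
print(counts)
```

With this sample data, d (11 months) contributes to 0-12, 0-18 and more but not to 0-6, which matches the four cross-joined rows listed earlier.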

Note: I did my best with my current level of English. Please don't hesitate to ask for clarification if you need one (or two, or as many as you need).
