SQL Server: group dates by ranges
You need to do something like this:
select t.range as [score range], count(*) as [number of occurrences]
from (
select case
when score between 0 and 9 then ' 0-9 '
when score between 10 and 19 then '10-19'
when score between 20 and 29 then '20-29'
...
else '90-99' end as range
from scores) t
group by t.range
See also: In SQL, how can you "group by" in ranges?
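When every range has the same width, the bucket can also be computed arithmetically instead of with one CASE branch per range. The following is a minimal sketch of that variant, not the answer's own method; it runs the SQL through Python's built-in sqlite3 module, and the sample rows and the range_start, range_end and occurrences aliases are assumptions made for illustration. Integer division of int columns truncates the same way in SQL Server.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [(3,), (7,), (12,), (25,), (29,), (95,)],
)

# score / 10 is integer division, so it yields the bucket index directly.
rows = conn.execute(
    """
    SELECT (score / 10) * 10     AS range_start,
           (score / 10) * 10 + 9 AS range_end,
           COUNT(*)              AS occurrences
    FROM scores
    GROUP BY score / 10
    ORDER BY range_start
    """
).fetchall()

for range_start, range_end, occurrences in rows:
    print(f"{range_start}-{range_end}: {occurrences}")
```

Ranges with no rows do not appear in the result, which is also true of the CASE version.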
How to group data based on continuous date range?
The date range for the price 23.9 is not right because the price is not the same for every day in that range.
The same price occurs in two different date ranges, so a plain aggregate function collapses them into a single row.
This is a gaps-and-islands problem. We can use the ROW_NUMBER window function to compute a group number for each continuous run of dates and then group by that.
SELECT Product_Code,
min(Pricing_Date) AS Min_Date ,
max(Pricing_Date) AS Max_Date,
price
FROM (
SELECT *,
ROW_NUMBER() OVER(ORDER BY PRICING_DATE) - ROW_NUMBER() OVER(PARTITION BY PRODUCT_CODE,PRICE ORDER BY PRICING_DATE) grp
FROM PRICE_DATA
) t1
GROUP BY grp,Product_Code,price
ORDER BY min(Pricing_Date)
Explanation
Gaps-and-islands data has a useful property: within one continuous run of rows,
(a continuous sequence over all rows) - (a sequence numbered within each group of equal values, in the same order)
yields the same value for every row of the run.
So we can use
ROW_NUMBER() OVER(ORDER BY PRICING_DATE)
to produce the continuous sequence, and
ROW_NUMBER() OVER(PARTITION BY PRODUCT_CODE,PRICE ORDER BY PRICING_DATE)
to produce the sequence numbered within each product and price.
The difference between the two gives a grouping column that identifies each continuous run.
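The effect of the row-number difference is easier to see on a few rows. The sketch below runs the same query through Python's sqlite3 module (window functions need SQLite 3.25 or later); the five sample rows and the column types are assumptions made for illustration, with a single product whose price 23.9 appears in two separate runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRICE_DATA (Product_Code TEXT, Pricing_Date TEXT, Price REAL)"
)
# Hypothetical rows: price 23.9 appears in two runs separated by one day at 25.0.
conn.executemany(
    "INSERT INTO PRICE_DATA VALUES (?, ?, ?)",
    [
        ("A", "2018-01-01", 23.9),
        ("A", "2018-01-02", 23.9),
        ("A", "2018-01-03", 25.0),
        ("A", "2018-01-04", 23.9),
        ("A", "2018-01-05", 23.9),
    ],
)

INNER = """
    SELECT *,
           ROW_NUMBER() OVER (ORDER BY Pricing_Date)
         - ROW_NUMBER() OVER (PARTITION BY Product_Code, Price
                              ORDER BY Pricing_Date) AS grp
    FROM PRICE_DATA
"""

# The grouping value for every row, in date order.
groups = conn.execute(
    f"SELECT Pricing_Date, grp FROM ({INNER}) t1 ORDER BY Pricing_Date"
).fetchall()

# One row per continuous run of the same price.
islands = conn.execute(
    f"""
    SELECT Product_Code, MIN(Pricing_Date), MAX(Pricing_Date), Price
    FROM ({INNER}) t1
    GROUP BY grp, Product_Code, Price
    ORDER BY MIN(Pricing_Date)
    """
).fetchall()

print(groups)
print(islands)
```

The two 23.9 runs get different grp values (0 and 1), so they are no longer merged into one row.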
SQL select data and grouping data by date range
Without a CTE, you can use the following query:
SELECT w1.price, w1.date, w2.date, w1.type FROM
(
SELECT * FROM mytable t1
WHERE NOT EXISTS (
SELECT 1 FROM mytable t2
WHERE
t1.price = t2.price AND
t1.type = t2.type AND
DATEDIFF(t2.date, t1.date) = -1
)
) w1
INNER JOIN
(
SELECT * FROM mytable t1
WHERE NOT EXISTS (
SELECT 1 FROM mytable t2
WHERE
t1.price = t2.price AND
t1.type = t2.type AND
DATEDIFF(t2.date, t1.date) = +1
)
) w2
ON
w1.price = w2.price AND
w1.type = w2.type AND
w1.date <= w2.date AND
NOT EXISTS (
SELECT * FROM mytable t1
WHERE NOT EXISTS (
SELECT 1 FROM mytable t2
WHERE
t1.price = t2.price AND
t1.type = t2.type AND
DATEDIFF(t2.date, t1.date) = +1
)
AND
w1.price = t1.price AND
w1.type = t1.type AND
w1.date <= t1.date AND t1.date < w2.date
)
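The query works in three steps:
- w1 finds the first date of each period (no row with the same price and type on the previous day), and w2 finds the last date of each period (no such row on the next day).
- The two sets are joined on price and type, with w1.date <= w2.date.
- The final NOT EXISTS keeps only the nearest end date by rejecting pairs that have another period end between the two dates.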
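The same three steps can be sketched in plain Python to make the logic easier to follow. This is an illustration under stated assumptions, not a replacement for the query: the sample rows are hypothetical, and the sketch assumes at most one row per price, type and date.

```python
from datetime import date, timedelta

# Hypothetical rows of mytable as (price, type, date) tuples.
rows = [
    (10, "A", date(2020, 1, 1)),
    (10, "A", date(2020, 1, 2)),
    (10, "A", date(2020, 1, 3)),
    (12, "A", date(2020, 1, 4)),
    (10, "A", date(2020, 1, 5)),
    (10, "A", date(2020, 1, 6)),
]
present = set(rows)
one_day = timedelta(days=1)

# Step 1: period starts have no matching row on the previous day,
# period ends have no matching row on the next day.
starts = [r for r in rows if (r[0], r[1], r[2] - one_day) not in present]
ends = [r for r in rows if (r[0], r[1], r[2] + one_day) not in present]

# Steps 2 and 3: pair each start with the nearest end of the same
# price and type that is not earlier than the start.
periods = []
for price, kind, start in sorted(starts, key=lambda r: r[2]):
    end = min(d for (p, k, d) in ends if p == price and k == kind and d >= start)
    periods.append((price, start, end, kind))

for period in periods:
    print(period)
```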
Sql group query results by user id and date ranges dynamically
With all weeks starting on Monday, this would do it (efficiently):
SELECT id AS user_id, u."onboardedAt", u."closedAt"
, week_start, COALESCE(t.tx_count, 0) AS tx_count, a.last_user_action
FROM "Users" u
CROSS JOIN generate_series(date_trunc('week', u."onboardedAt"), u."closedAt", interval '1 week') AS week_start
LEFT JOIN (
SELECT "userId" AS id, date_trunc('week', t."createdAt") AS week_start, count(*) AS tx_count
FROM "Transactions" t
GROUP BY 1, 2
) t USING (id, week_start)
LEFT JOIN (
SELECT DISTINCT ON (1, 2)
"userId" AS id, date_trunc('week', a."createdAt") AS week_start, action AS last_user_action
FROM "UserActions" a
ORDER BY 1, 2, "createdAt" DESC
) a USING (id, week_start)
ORDER BY id, week_start;
Working with standard weeks makes everything much simpler. We can aggregate in the "many" tables before joining, which is simpler and cheaper. Otherwise, multiple joins can go wrong quickly. See:
- Two SQL LEFT JOINS produce incorrect result
Standard weeks make it easier to compare data, too. (Note that first and last week per user can be truncated (span fewer days). But that applies to the last week per user in any case.)
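To make the week bucketing concrete, here is a minimal Python sketch of what date_trunc('week', ...) and the weekly generate_series do for a single user. The helper names week_start and weekly_series and all sample dates are assumptions made for illustration; only the Monday-based weeks and the zero fill for empty weeks mirror the query above.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d):
    """Monday of the week containing d, like date_trunc('week', d) in Postgres."""
    return d - timedelta(days=d.weekday())


def weekly_series(first, last):
    """Week starts from the week of `first` up to `last`, one step per week."""
    current = week_start(first)
    while current <= last:
        yield current
        current += timedelta(weeks=1)


# Hypothetical user: onboarded on a Wednesday, closed on a Saturday.
onboarded_at = date(2021, 3, 3)
closed_at = date(2021, 3, 20)
transactions = [date(2021, 3, 4), date(2021, 3, 5), date(2021, 3, 16)]

# Aggregate first, then attach the counts to every week (0 when absent).
tx_per_week = Counter(week_start(d) for d in transactions)
report = [(w, tx_per_week.get(w, 0)) for w in weekly_series(onboarded_at, closed_at)]

for week, tx_count in report:
    print(week, tx_count)
```

The first week starts on 2021-03-01, before the onboarding date, which is the truncated first week mentioned above.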
The LATERAL keyword is assumed automatically in a join to a set-returning function:
CROSS JOIN generate_series(...)
See:
- What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
Using DISTINCT ON to get the last_user_action per user and week. See:
- Select first row in each GROUP BY group?
I advise using legal, lower-case identifiers, so double-quoting is not required. It makes your life with Postgres easier.
Use last non-null action
Added in a comment:
if action is null in a current week, I want to grab most recent from previous weeks
SELECT user_id, "onboardedAt", "closedAt", week_start, tx_count
, last_user_action AS last_user_action_with_null
, COALESCE(last_user_action, max(last_user_action) OVER (PARTITION BY user_id, null_grp)) AS last_user_action
FROM (
SELECT id AS user_id, u."onboardedAt", u."closedAt"
, week_start, COALESCE(t.tx_count, 0) AS tx_count, a.last_user_action
, count(a.last_user_action) OVER (PARTITION BY id ORDER BY week_start) AS null_grp
FROM "Users" u
CROSS JOIN generate_series(date_trunc('week', u."onboardedAt"), u."closedAt", interval '1 week') AS week_start
LEFT JOIN (
SELECT "userId" AS id, date_trunc('week', t."createdAt") AS week_start, count(*) AS tx_count
FROM "Transactions" t
GROUP BY 1, 2
) t USING (id, week_start)
LEFT JOIN (
SELECT DISTINCT ON (1, 2)
"userId" AS id, date_trunc('week', a."createdAt") AS week_start, action AS last_user_action
FROM "UserActions" a
ORDER BY 1, 2, "createdAt" DESC
) a USING (id, week_start)
) sub
ORDER BY user_id, week_start;
Explanation:
- Retrieve last known value for each column of a row
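The carry-forward trick relies on count() ignoring NULLs: the running count only increases on weeks that have an action, so each action and the NULL weeks after it share one null_grp value. The sketch below isolates that step using Python's sqlite3 module (SQLite 3.25 or later). The weekly table and its rows are hypothetical and stand in for the joined subquery above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined subquery: one row per user and week.
conn.execute(
    "CREATE TABLE weekly (user_id INTEGER, week_start TEXT, last_user_action TEXT)"
)
conn.executemany(
    "INSERT INTO weekly VALUES (?, ?, ?)",
    [
        (1, "2021-03-01", "login"),
        (1, "2021-03-08", None),
        (1, "2021-03-15", None),
        (1, "2021-03-22", "purchase"),
        (1, "2021-03-29", None),
    ],
)

filled = conn.execute(
    """
    SELECT week_start, null_grp,
           COALESCE(last_user_action,
                    max(last_user_action) OVER (PARTITION BY user_id, null_grp)
           ) AS last_user_action
    FROM (
        SELECT *,
               -- running count of non-null actions per user
               count(last_user_action) OVER (PARTITION BY user_id
                                             ORDER BY week_start) AS null_grp
        FROM weekly
    ) sub
    ORDER BY user_id, week_start
    """
).fetchall()

for row in filled:
    print(row)
```

Each null_grp contains at most one non-null action, so max() simply picks it up for the NULL weeks that follow.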
Group by predefined date range
You are close. To define your tables' relationships in your FROM clause you don't want a WHERE clause, you want an ON clause:
SELECT t1.*, SUM(t2.boolean) as count
FROM table1 t1
LEFT JOIN table2 t2
ON t2.Date BETWEEN t1.period AND DATEADD(month, 1, t1.period)
Furthermore, because you are aggregating with a SUM() in your SELECT you will need to provide a GROUP BY to tell the database which columns to group on (every column that isn't being aggregated with a function like SUM()):
SELECT t1.*, SUM(t2.boolean) as count
FROM table1 t1
LEFT JOIN table2 t2
ON t2.Date BETWEEN t1.period AND DATEADD(month, 1, t1.period)
GROUP BY t1.period, t1.value1, t1.value2
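One detail worth checking against your data: BETWEEN includes both endpoints, so a row dated exactly on the first day of the next period also matches the previous period. The sketch below compares the inclusive condition with a half-open one using Python's sqlite3 module. The sample rows, the flag column (standing in for the boolean column) and SQLite's date(..., '+1 month') in place of DATEADD are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (period TEXT)")
conn.execute("CREATE TABLE table2 (Date TEXT, flag INTEGER)")
# Hypothetical data: monthly periods and one row dated exactly on a period start.
conn.executemany("INSERT INTO table1 VALUES (?)", [("2020-01-01",), ("2020-02-01",)])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("2020-01-15", 1), ("2020-02-01", 1), ("2020-02-10", 0)],
)

QUERY = """
    SELECT t1.period, COALESCE(SUM(t2.flag), 0) AS flag_count
    FROM table1 t1
    LEFT JOIN table2 t2 ON {condition}
    GROUP BY t1.period
    ORDER BY t1.period
"""

# Inclusive on both ends: the 2020-02-01 row is counted in January and February.
inclusive = conn.execute(
    QUERY.format(condition="t2.Date BETWEEN t1.period AND date(t1.period, '+1 month')")
).fetchall()

# Half-open range: each row falls into exactly one period.
half_open = conn.execute(
    QUERY.format(
        condition="t2.Date >= t1.period AND t2.Date < date(t1.period, '+1 month')"
    )
).fetchall()

print(inclusive)
print(half_open)
```

If rows can fall exactly on a period boundary, the half-open form avoids counting them twice.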