How to fill missing dates by groups in a table in sql
You can do it like this without loops
SELECT p.date, COALESCE(a.value, 0) value, p.grp_no
FROM
(
SELECT grp_no, date
FROM
(
SELECT grp_no, MIN(date) min_date, MAX(date) max_date
FROM tableA
GROUP BY grp_no
) q CROSS JOIN tableb b
WHERE b.date BETWEEN q.min_date AND q.max_date
) p LEFT JOIN TableA a
ON p.grp_no = a.grp_no
AND p.date = a.date
The innermost subquery grabs min and max dates per group. Then cross join with TableB
produces all possible dates within the min-max range per group. And finally outer select uses outer join with TableA
and fills value
column with 0
for dates that are missing in TableA
.
Output:
| DATE | VALUE | GRP_NO |
|------------|-------|--------|
| 2012-08-06 | 1 | 1 |
| 2012-08-07 | 0 | 1 |
| 2012-08-08 | 1 | 1 |
| 2012-08-09 | 0 | 1 |
| 2012-08-07 | 2 | 2 |
| 2012-08-08 | 1 | 2 |
| 2012-08-09 | 0 | 2 |
| 2012-08-10 | 0 | 2 |
| 2012-08-11 | 0 | 2 |
| 2012-08-12 | 3 | 2 |
Here is SQLFiddle demo
How to fill in missing dates
Here is a query that would work. Start by cross joining all combinations of dates and users (add filters as needed), then left join the users table and calculate quota using the last_value() function (note that if you are using Snowflake, you must specify "rows between unbounded preceding and current row" as documented here):
with all_dates_users as (
--all combinations of dates and users
select date, user
from dates
cross join (select distinct user_email as user from users)
),
joined as (
--left join users table to the previous
select DU.date, DU.user, U.sent_at, U.user_email, U.score, U.quota
from all_dates_users DU
left join users U on U.sent_at = DU.date and U.user_email = DU.user
)
--calculate quota as previous quota using last_value() function
select date, user, nvl(score, 0) as score, last_value(quota) ignore nulls over (partition by user order by date desc rows between unbounded preceding and current row) as quota
from joined
order by date desc;
SQL: Fill missing dates for each group
Use a CROSS JOIN
to generate the rows, then bring in the values. In this case OUTER APPLY
might be the simplest solution:
select c.[Date], m.Materialnumber, d.Amount
from KSH_calendar c cross join
(select distinct d.Materialnumber
from database d
) m outer apply
(select top (1) d.*
from database d
where d.Materialnumber = m.Materialnumber and
c.date <= d.MKPF_CPUDT
order by d.date desc
) d;
If you have a lot of dates, then an index on database(materialnumber, MKPF_CPUDT)
will help. But there are alternative methods that are a little more complicated. I would recommend that you ask another question if performance is an issue.
Query for how to add the missing dates in sql
Form a Date Calender with a start and end date range and perform a left join with your table to get the needed result.
e.g.
DECLARE @t TABLE(Dt Datetime, Value VARCHAR(20) NULL)
INSERT INTO @t VALUES
('05/28/2012',NULL),
('05/29/2012',NULL),
('05/30/2012',NULL),('05/30/2012','Break In'),('05/30/2012','Break Out'),
('05/31/2012',NULL),
('06/03/2012',NULL),('06/03/2012','Break In'),('06/03/2012','Break Out'),('06/03/2012','In Duty'),('06/03/2012','Out Duty'),
('06/04/2012',NULL),('06/04/2012','In Duty'),('06/04/2012','Out Duty'),
('06/05/2012',NULL),('06/05/2012','Break In'),('06/05/2012','Break Out'),
('06/06/2012',NULL),('06/06/2012','Break In'),('06/06/2012','Break Out'),('06/06/2012','In Duty'),('06/06/2012','Out Duty'),
('06/07/2012',NULL),('06/07/2012','In Duty'),('06/07/2012','Out Duty'),
('06/10/2012',NULL),('06/10/2012','Break Out'),('06/10/2012','In Duty'),('06/10/2012','Out Duty'),
('06/11/2012',NULL),('06/11/2012','In Duty'),('06/11/2012','Out Duty'),
('06/12/2012',NULL),
('06/13/2012',NULL),
('06/14/2012',NULL)
DECLARE @startDate DATETIME, @endDate DATETIME
SELECT @startDate = '2012-05-28', @endDate = '2012-06-14' --yyyy-mm-dd
;WITH Calender AS (
SELECT @startDate AS CalanderDate
UNION ALL
SELECT CalanderDate + 1 FROM Calender
WHERE CalanderDate + 1 <= @endDate
)
SELECT
[Date] = Convert(VARCHAR(10),CalanderDate,101)
,Value
FROM Calender c
LEFT JOIN @t t
ON t.Dt = c.CalanderDate
Result
Date Value
05/28/2012 NULL
05/29/2012 NULL
05/30/2012 NULL
05/30/2012 Break In
05/30/2012 Break Out
05/31/2012 NULL
06/01/2012 NULL
06/02/2012 NULL
06/03/2012 NULL
06/03/2012 Break In
06/03/2012 Break Out
06/03/2012 In Duty
06/03/2012 Out Duty
06/04/2012 NULL
06/04/2012 In Duty
06/04/2012 Out Duty
06/05/2012 NULL
06/05/2012 Break In
06/05/2012 Break Out
06/06/2012 NULL
06/06/2012 Break In
06/06/2012 Break Out
06/06/2012 In Duty
06/06/2012 Out Duty
06/07/2012 NULL
06/07/2012 In Duty
06/07/2012 Out Duty
06/08/2012 NULL
06/09/2012 NULL
06/10/2012 NULL
06/10/2012 Break Out
06/10/2012 In Duty
06/10/2012 Out Duty
06/11/2012 NULL
06/11/2012 In Duty
06/11/2012 Out Duty
06/12/2012 NULL
06/13/2012 NULL
06/14/2012 NULL
Hope this helps
How to add in missing dates as rows in table
Consider below approach
select date(Ingestion_Time) Ingestion_Time, Rows_Written
from your_current_query union all
select day, 0 from (
select *, lead(Ingestion_Time) over(order by Ingestion_Time) next_time
from your_current_query
), unnest(generate_date_array(date(Ingestion_Time) + 1, date(next_time) - 1)) day
if to apply to sample data in your question - output is
Filling missing dates in each group while querying data from PostgreSQL
You may cross join with a table which contains all types, and then use the same left join approach you were already considering:
SELECT
date_trunc('day', cal)::date AS date,
t1.type,
t2.value
FROM generate_series
( '2020-01-01'::timestamp
, '2020-12-31'::timestamp
, '1 day'::interval) cal
CROSS JOIN (SELECT DISTINCT type FROM yourTable) t1
LEFT JOIN yourTable t2
ON t2.date = cal.date AND t2.type = t1.type
ORDER BY
t1.type,
cal.date;
How to fill missing values for missing dates with value from date before in sql bigquery?
Consider below:
WITH days_by_id AS (
SELECT id, GENERATE_DATE_ARRAY(MIN(date), MAX(date)) days
FROM sample
GROUP BY id
)
SELECT date, id,
IFNULL(price, LAST_VALUE(price IGNORE NULLS) OVER (PARTITION BY id ORDER BY date)) AS price
FROM days_by_id, UNNEST(days) date LEFT JOIN sample USING (id, date);
output :
How do I fill in missing dates by group in Oracle with changing count value
with start_params as (
select
to_date('01/01/2020', 'MM/DD/YYYY') as start_date,
60 numdays
from dual
),
colors as (
select to_date('1/28/2020 09:29', 'MM/DD/YYYY HH24:MI') as color_date, 'red' as color, 1 color_count from dual union
select to_date('2/3/2020 07:04', 'MM/DD/YYYY HH24:MI') as color_date, 'red' as color, 5 color_count from dual union
select to_date('2/6/2020 12:11', 'MM/DD/YYYY HH24:MI') as color_date, 'red' as color, 11 color_count from dual union
select to_date('2/11/2020 17:15', 'MM/DD/YYYY HH24:MI') as color_date, 'red' as color, 4 color_count from dual union
select to_date('2/15/2020 03:46', 'MM/DD/YYYY HH24:MI') as color_date, 'red' as color, 6 color_count from dual union
select to_date('1/16/2020 14:52', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 7 color_count from dual union
select to_date('1/19/2020 22:30', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 32 color_count from dual union
select to_date('1/23/2020 05:17', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 16 color_count from dual union
select to_date('1/28/2020 18:35', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 24 color_count from dual union
select to_date('1/31/2020 15:38', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 41 color_count from dual union
select to_date('2/2/2020 16:01', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 11 color_count from dual
),
upd_colors as (
select
(select start_date from start_params) color_date,
color,
min(color_count) keep(dense_rank first order by color_date) color_count
from colors
group by color
union
select trunc(color_date), color, color_count from colors
),
dates as (
select dat, color
from (
select start_date + numtodsinterval(level-1, 'DAY') dat
from start_params connect by level <= numdays
), (select distinct color from colors)
)
select d.dat, d.color,
nvl(c.color_count, lag(c.color_count ignore nulls) over (partition by d.color order by d.dat)) color_count
from dates d, upd_colors c
where c.color_date(+) = d.dat
and c.color(+) = d.color
order by color, dat
fiddle
Fill in missing dates across multiple partitions (Snowflake)
WITH fake_data AS (
SELECT * FROM VALUES
('A','USD','2020-01-01'::date,3)
,('A','USD','2020-01-03'::date,4)
,('A','USD','2020-01-04'::date,2)
,('A','CAD','2021-01-04'::date,5)
,('A','CAD','2021-01-06'::date,6)
,('A','CAD','2020-01-07'::date,1)
,('B','USD','2019-01-01'::date,3)
,('B','USD','2019-01-03'::date,4)
,('B','USD','2019-01-04'::date,5)
,('B','CAD','2017-01-04'::date,3)
,('B','CAD','2017-01-06'::date,2)
,('B','CAD','2017-01-07'::date,2)
d(Name,Currency,Date,Amount)
), partition_ranges AS (
SELECT name,
currency,
min(date) as min_date,
max(date) as max_date,
datediff('days', min_date, max_date) as span
FROM fake_data
GROUP BY 1,2
), huge_range as (
SELECT ROW_NUMBER() OVER(order by true)-1 as rn
FROM table(generator(ROWCOUNT => 10000000))
), in_fill as (
SELECT pr.name,
pr.currency,
dateadd('day', hr.rn, pr.min_date) as date
FROM partition_ranges as pr
JOIN huge_range as hr ON pr.span >= hr.rn
)
SELECT
i.name,
i.currency,
i.date,
nvl(d.amount, 0) as amount
from in_fill as i
left join fake_data as d on d.name = i.name and d.currency = i.currency and d.date = i.date
order by 1,2,3;
NAME | CURRENCY | DATE | AMOUNT |
---|---|---|---|
A | CAD | 2020-01-07 | 1 |
A | CAD | 2020-01-08 | 0 |
A | CAD | 2020-01-09 | 0 |
A | CAD | 2020-01-10 | 0 |
A | CAD |
How to add missing dates when calculating count on a table
One option uses a recursive query to generate the dates. you can then cross join
that with the list of distinct items available in the table, and bring the table with a left join
. The last step is aggregation:
with cte as (
select min(convert(date, saletime)) as dt, max(convert(date, saletime)) as max_dt from mytable
union all
select dateadd(day, 1, dt), max_dt from cte where dt < max_dt
)
select c.dt, i.itemid, count(t.id) as sale_count
from cte c
cross join (select distinct itemid from mytable) i
left join mytable t
on t.itemid = i.itemid
and t.date >= c.dt
and t.date < dateadd(day, 1, c.dt)
group by c.dt, i.itemid
In a real life situation, you would probably have a separate referential table to store the items, that you would use instead of the select distinct
subquery.
Related Topics
How to Migrate an Existing Postgres Table to Partitioned Table as Transparently as Possible
How to Select Bottom Most Rows
SQL Server Bitwise Processing Like C# Enum Flags
Trigger to Prevent Insertion for Duplicate Data of Two Columns
How to Check If Identity_Insert Is Set to on or Off in SQL Server
How to Tell If I Have Uncommitted Work in an Oracle Transaction
How to Add Offset in a "Select" Query in Oracle 11G
Looking for a SQL Transaction Log File Viewer
Postgresql Query to Count/Group by Day and Display Days with No Data
Differencebetween Rowsbetween and Rangebetween
SQL Statement with Multiple Sets and Wheres
Retrieve Id of Record Just Inserted into a Java Db (Derby) Database
Use String Contains Function in Oracle SQL Query
SQL Selecting Rows Where One Column's Value Is Common Across Another Criteria Column