How to Fill Missing Dates by Groups in a Table in SQL

How to fill missing dates by groups in a table in sql

You can do it like this, without loops:

SELECT p.date, COALESCE(a.value, 0) value, p.grp_no
  FROM
(
  SELECT grp_no, date
    FROM
  (
    SELECT grp_no, MIN(date) min_date, MAX(date) max_date
      FROM tableA
     GROUP BY grp_no
  ) q CROSS JOIN tableb b 
   WHERE b.date BETWEEN q.min_date AND q.max_date
) p LEFT JOIN TableA a
    ON p.grp_no = a.grp_no 
   AND p.date = a.date

The innermost subquery finds the minimum and maximum dates per group. The cross join with TableB, filtered to that min-max range, then produces every date within the range for each group. Finally, the outer select left joins TableA and fills the value column with 0 for dates that are missing from TableA.

Output:


|       DATE | VALUE | GRP_NO |
|------------|-------|--------|
| 2012-08-06 |     1 |      1 |
| 2012-08-07 |     0 |      1 |
| 2012-08-08 |     1 |      1 |
| 2012-08-09 |     0 |      1 |
| 2012-08-07 |     2 |      2 |
| 2012-08-08 |     1 |      2 |
| 2012-08-09 |     0 |      2 |
| 2012-08-10 |     0 |      2 |
| 2012-08-11 |     0 |      2 |
| 2012-08-12 |     3 |      2 |

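
For readers who want to run the min/max, cross join, left join pattern locally, here is a minimal sketch using Python's built-in sqlite3 module. SQLite, the sample rows, and the contents of tableb (treated here as a one-row-per-day calendar) are assumptions made for illustration; they are not the data behind the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (date TEXT, value INTEGER, grp_no INTEGER);
    INSERT INTO tableA VALUES
        ('2012-08-06', 1, 1),
        ('2012-08-08', 1, 1),
        ('2012-08-07', 2, 2),
        ('2012-08-10', 3, 2);
    -- Assumption: tableb acts as a calendar with one row per day.
    CREATE TABLE tableb (date TEXT);
    INSERT INTO tableb VALUES
        ('2012-08-05'), ('2012-08-06'), ('2012-08-07'), ('2012-08-08'),
        ('2012-08-09'), ('2012-08-10'), ('2012-08-11');
""")

query = """
SELECT p.date, COALESCE(a.value, 0) AS value, p.grp_no
  FROM (
        -- every calendar date inside each group's own min-max range
        SELECT q.grp_no, b.date
          FROM (
                SELECT grp_no, MIN(date) AS min_date, MAX(date) AS max_date
                  FROM tableA
                 GROUP BY grp_no
               ) q
         CROSS JOIN tableb b
         WHERE b.date BETWEEN q.min_date AND q.max_date
       ) p
  LEFT JOIN tableA a
    ON p.grp_no = a.grp_no
   AND p.date = a.date
 ORDER BY p.grp_no, p.date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

ISO-formatted date strings sort the same way as the dates they represent, which is why plain text comparison with BETWEEN is enough in this sketch.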

How to fill in missing dates

Here is a query that would work. Start by cross joining all combinations of dates and users (add filters as needed), then left join the users table and calculate the quota with the last_value() function. Note that if you are using Snowflake, you must specify the window frame explicitly as "rows between unbounded preceding and current row":

with all_dates_users as (
--all combinations of dates and users
select date, user
from dates
cross join (select distinct user_email as user from users)
),
joined as (
--left join users table to the previous
select DU.date, DU.user, U.sent_at, U.user_email, U.score, U.quota
from all_dates_users DU
left join users U on U.sent_at = DU.date and U.user_email = DU.user
)
--calculate quota as previous quota using last_value() function
select date,
       user,
       nvl(score, 0) as score,
       last_value(quota) ignore nulls over (
           partition by user
           order by date desc
           rows between unbounded preceding and current row
       ) as quota
from joined
order by date desc;
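
On engines without `ignore nulls` support (SQLite, used here, is one), a similar fill can be approximated with a running count. The sketch below is a swapped-in technique, not the query above: it assumes the value should be carried forward in ascending date order, uses made-up rows, and needs SQLite 3.25 or later for window functions. The `island` column name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (date TEXT);
    INSERT INTO dates VALUES
        ('2021-01-01'), ('2021-01-02'), ('2021-01-03'), ('2021-01-04');
    CREATE TABLE users (user_email TEXT, sent_at TEXT, score INTEGER, quota INTEGER);
    INSERT INTO users VALUES
        ('a@example.com', '2021-01-01', 5, 100),
        ('a@example.com', '2021-01-03', 7, 150),
        ('b@example.com', '2021-01-02', 2, 80);
""")

query = """
WITH all_dates_users AS (
    -- all combinations of dates and users
    SELECT d.date, u.user_email
      FROM dates d
     CROSS JOIN (SELECT DISTINCT user_email FROM users) u
),
joined AS (
    SELECT du.date, du.user_email, u.score, u.quota,
           -- COUNT skips NULLs, so the running count only increases on rows
           -- that have a quota; each gap row stays in the preceding island
           COUNT(u.quota) OVER (
               PARTITION BY du.user_email ORDER BY du.date
           ) AS island
      FROM all_dates_users du
      LEFT JOIN users u
        ON u.sent_at = du.date
       AND u.user_email = du.user_email
)
SELECT date, user_email,
       COALESCE(score, 0) AS score,
       -- each island holds at most one non-NULL quota, so MAX picks it
       MAX(quota) OVER (PARTITION BY user_email, island) AS quota
  FROM joined
 ORDER BY user_email, date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Dates that precede a user's first quota stay NULL, because there is no earlier value to carry forward.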

SQL: Fill missing dates for each group

Use a CROSS JOIN to generate the rows, then bring in the values. In this case OUTER APPLY might be the simplest solution:

select c.[Date], m.Materialnumber, d.Amount
from KSH_calendar c cross join
     (select distinct d.Materialnumber
      from database d
     ) m outer apply
     (select top (1) d.*
      from database d
      where d.Materialnumber = m.Materialnumber and
            c.date <= d.MKPF_CPUDT
      order by d.date desc
     ) d;

If you have a lot of dates, then an index on database(materialnumber, MKPF_CPUDT) will help. But there are alternative methods that are a little more complicated. I would recommend that you ask another question if performance is an issue.

Query for how to add the missing dates in sql

Build a date calendar covering the start-to-end date range, then left join your table to it to get the needed result.

For example:

DECLARE @t TABLE(Dt Datetime, Value VARCHAR(20) NULL)
INSERT INTO @t VALUES
('05/28/2012',NULL),
('05/29/2012',NULL),
('05/30/2012',NULL),('05/30/2012','Break In'),('05/30/2012','Break Out'),
('05/31/2012',NULL),
('06/03/2012',NULL),('06/03/2012','Break In'),('06/03/2012','Break Out'),('06/03/2012','In Duty'),('06/03/2012','Out Duty'),
('06/04/2012',NULL),('06/04/2012','In Duty'),('06/04/2012','Out Duty'),
('06/05/2012',NULL),('06/05/2012','Break In'),('06/05/2012','Break Out'),
('06/06/2012',NULL),('06/06/2012','Break In'),('06/06/2012','Break Out'),('06/06/2012','In Duty'),('06/06/2012','Out Duty'),
('06/07/2012',NULL),('06/07/2012','In Duty'),('06/07/2012','Out Duty'),
('06/10/2012',NULL),('06/10/2012','Break Out'),('06/10/2012','In Duty'),('06/10/2012','Out Duty'),
('06/11/2012',NULL),('06/11/2012','In Duty'),('06/11/2012','Out Duty'),
('06/12/2012',NULL),
('06/13/2012',NULL),
('06/14/2012',NULL)

DECLARE @startDate DATETIME, @endDate DATETIME
SELECT @startDate = '2012-05-28', @endDate = '2012-06-14' --yyyy-mm-dd
;WITH Calender AS (
    SELECT @startDate AS CalanderDate
    UNION ALL
    SELECT CalanderDate + 1 FROM Calender
    WHERE CalanderDate + 1 <= @endDate
)
SELECT 
    [Date] = Convert(VARCHAR(10),CalanderDate,101)
    ,Value
FROM Calender c
LEFT JOIN @t t 
ON t.Dt = c.CalanderDate

Result

Date    Value
05/28/2012  NULL
05/29/2012  NULL
05/30/2012  NULL
05/30/2012  Break In
05/30/2012  Break Out
05/31/2012  NULL
06/01/2012  NULL
06/02/2012  NULL
06/03/2012  NULL
06/03/2012  Break In
06/03/2012  Break Out
06/03/2012  In Duty
06/03/2012  Out Duty
06/04/2012  NULL
06/04/2012  In Duty
06/04/2012  Out Duty
06/05/2012  NULL
06/05/2012  Break In
06/05/2012  Break Out
06/06/2012  NULL
06/06/2012  Break In
06/06/2012  Break Out
06/06/2012  In Duty
06/06/2012  Out Duty
06/07/2012  NULL
06/07/2012  In Duty
06/07/2012  Out Duty
06/08/2012  NULL
06/09/2012  NULL
06/10/2012  NULL
06/10/2012  Break Out
06/10/2012  In Duty
06/10/2012  Out Duty
06/11/2012  NULL
06/11/2012  In Duty
06/11/2012  Out Duty
06/12/2012  NULL
06/13/2012  NULL
06/14/2012  NULL

Hope this helps
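
The same calendar-plus-left-join idea can be tried outside SQL Server. Here is a minimal sketch with Python's sqlite3 module; the table name `t`, its rows, and the `calendar` / `calendar_date` names are assumptions for illustration, and SQLite's `date(x, '+1 day')` stands in for the T-SQL `CalanderDate + 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (dt TEXT, value TEXT);
    INSERT INTO t VALUES
        ('2012-05-30', 'Break In'),
        ('2012-05-30', 'Break Out'),
        ('2012-06-02', 'In Duty');
""")

query = """
WITH RECURSIVE calendar(calendar_date) AS (
    SELECT :start_date                      -- anchor row: first day of the range
    UNION ALL
    SELECT date(calendar_date, '+1 day')    -- recursive step: add one day
      FROM calendar
     WHERE calendar_date < :end_date        -- stop once the end date is emitted
)
SELECT c.calendar_date, t.value
  FROM calendar c
  LEFT JOIN t ON t.dt = c.calendar_date
 ORDER BY c.calendar_date, t.value
"""
params = {"start_date": "2012-05-30", "end_date": "2012-06-02"}
rows = conn.execute(query, params).fetchall()
for row in rows:
    print(row)
```

One thing to keep in mind for the SQL Server version: recursive CTEs there stop with an error after 100 recursion levels by default, so a range longer than about 100 days needs a query hint such as OPTION (MAXRECURSION 0).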

How to add in missing dates as rows in table

Consider the approach below:

select date(Ingestion_Time) Ingestion_Time, Rows_Written 
from your_current_query union all
select day, 0 from (
  select *, lead(Ingestion_Time) over(order by Ingestion_Time) next_time 
  from your_current_query
), unnest(generate_date_array(date(Ingestion_Time) + 1, date(next_time) - 1)) day

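
The query keeps the existing rows and unions in a zero row for every day strictly between one row and the next. A rough equivalent can be sketched in SQLite through Python's sqlite3 module, where a recursive CTE stands in for BigQuery's generate_date_array. The table `daily`, its columns, and its rows are hypothetical stand-ins for `your_current_query`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily (ingestion_date TEXT, rows_written INTEGER);
    INSERT INTO daily VALUES
        ('2022-03-01', 10),
        ('2022-03-04', 20),
        ('2022-03-05', 5);
""")

query = """
WITH RECURSIVE with_next AS (
    -- pair each row with the date of the row that follows it
    SELECT ingestion_date,
           LEAD(ingestion_date) OVER (ORDER BY ingestion_date) AS next_date
      FROM daily
),
gaps(day, next_date) AS (
    -- first missing day of each gap (if the gap is not empty)
    SELECT date(ingestion_date, '+1 day'), next_date
      FROM with_next
     WHERE date(ingestion_date, '+1 day') < next_date
    UNION ALL
    -- keep stepping one day at a time until the next existing row
    SELECT date(day, '+1 day'), next_date
      FROM gaps
     WHERE date(day, '+1 day') < next_date
)
SELECT ingestion_date, rows_written FROM daily
UNION ALL
SELECT day, 0 FROM gaps
ORDER BY 1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last row has no following date, so its comparison against NULL filters it out and no rows are generated past the end of the data.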

Filling missing dates in each group while querying data from PostgreSQL

You may cross join with a table which contains all types, and then use the same left join approach you were already considering:

SELECT
    date_trunc('day', cal)::date AS date,
    t1.type,
    t2.value
FROM generate_series
    ( '2020-01-01'::timestamp 
    , '2020-12-31'::timestamp
    , '1 day'::interval) cal
CROSS JOIN (SELECT DISTINCT type FROM yourTable) t1
LEFT JOIN yourTable t2
    ON t2.date = cal.date AND t2.type = t1.type
ORDER BY
    t1.type,
    cal.date;

How to fill missing values for missing dates with value from date before in sql bigquery?

Consider the query below:

WITH days_by_id AS (
  SELECT id, GENERATE_DATE_ARRAY(MIN(date), MAX(date)) days
    FROM sample
   GROUP BY id
)
SELECT date, id, 
       IFNULL(price, LAST_VALUE(price IGNORE NULLS) OVER (PARTITION BY id ORDER BY date)) AS price
  FROM days_by_id, UNNEST(days) date LEFT JOIN sample USING (id, date);


How do I fill in missing dates by group in Oracle with changing count value

with start_params as (
    select
        to_date('01/01/2020', 'MM/DD/YYYY') as start_date,
        60 numdays
    from dual
),
colors as (
    select to_date('1/28/2020 09:29', 'MM/DD/YYYY HH24:MI') as color_date, 'red' as color,  1 color_count from dual union
    select to_date('2/3/2020 07:04', 'MM/DD/YYYY HH24:MI') as color_date,  'red' as color,  5 color_count from dual union
    select to_date('2/6/2020 12:11', 'MM/DD/YYYY HH24:MI') as color_date,  'red' as color,  11 color_count from dual union
    select to_date('2/11/2020 17:15', 'MM/DD/YYYY HH24:MI') as color_date,  'red' as color,  4 color_count from dual union
    select to_date('2/15/2020 03:46', 'MM/DD/YYYY HH24:MI') as color_date,  'red' as color,  6 color_count from dual union
    select to_date('1/16/2020 14:52', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 7 color_count from dual union
    select to_date('1/19/2020 22:30', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 32 color_count from dual union
    select to_date('1/23/2020 05:17', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 16 color_count from dual union
    select to_date('1/28/2020 18:35', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 24 color_count from dual union
    select to_date('1/31/2020 15:38', 'MM/DD/YYYY HH24:MI') as color_date, 'blue' as color, 41 color_count from dual union
    select to_date('2/2/2020 16:01', 'MM/DD/YYYY HH24:MI') as color_date,  'blue' as color, 11 color_count from dual
),
upd_colors as (
    select
        (select start_date from start_params) color_date,
        color,
        min(color_count) keep(dense_rank first order by color_date) color_count
    from colors
    group by color
    union
    select trunc(color_date), color, color_count from colors
),
dates as (
    select dat, color
    from (
        select start_date + numtodsinterval(level-1, 'DAY') dat
        from start_params connect by level <= numdays
    ), (select distinct color from colors)
)
select d.dat, d.color,
    nvl(c.color_count, lag(c.color_count ignore nulls) over (partition by d.color order by d.dat)) color_count
from dates d, upd_colors c
where c.color_date(+) = d.dat
  and c.color(+) = d.color
order by color, dat


Fill in missing dates across multiple partitions (Snowflake)

WITH fake_data AS (
    SELECT * FROM VALUES
        ('A','USD','2020-01-01'::date,3)
        ,('A','USD','2020-01-03'::date,4)
        ,('A','USD','2020-01-04'::date,2)
        ,('A','CAD','2021-01-04'::date,5)
        ,('A','CAD','2021-01-06'::date,6)
        ,('A','CAD','2020-01-07'::date,1)
        ,('B','USD','2019-01-01'::date,3)
        ,('B','USD','2019-01-03'::date,4)
        ,('B','USD','2019-01-04'::date,5)
        ,('B','CAD','2017-01-04'::date,3)
        ,('B','CAD','2017-01-06'::date,2)
        ,('B','CAD','2017-01-07'::date,2)
    d(Name,Currency,Date,Amount)
), partition_ranges AS (
    SELECT name,
        currency, 
        min(date) as min_date, 
        max(date) as max_date,
        datediff('days', min_date, max_date) as span
    FROM fake_data
    GROUP BY 1,2
), huge_range as (
    SELECT ROW_NUMBER() OVER(order by true)-1 as rn
    FROM table(generator(ROWCOUNT => 10000000))
), in_fill as (
    SELECT pr.name,
        pr.currency,
        dateadd('day', hr.rn, pr.min_date) as date
    FROM partition_ranges as pr
    JOIN huge_range as hr ON pr.span >= hr.rn
)
SELECT 
    i.name, 
    i.currency, 
    i.date,
    nvl(d.amount, 0) as amount
from in_fill as i
left join fake_data as d on d.name = i.name and d.currency = i.currency and d.date = i.date
order by 1,2,3;

NAME	CURRENCY	DATE	AMOUNT
A	CAD	2020-01-07	1
A	CAD	2020-01-08	0
A	CAD	2020-01-09	0
A	CAD	2020-01-10	0
A	CAD

How to add missing dates when calculating count on a table

One option uses a recursive query to generate the dates. You can then cross join that with the list of distinct items in the table, and bring in the table with a left join. The last step is aggregation:

with cte as (
    select min(convert(date, saletime)) as dt, max(convert(date, saletime)) as max_dt from mytable
    union all
    select dateadd(day, 1, dt), max_dt from cte where dt < max_dt
)
select c.dt, i.itemid, count(t.id) as sale_count
from cte c
cross join (select distinct itemid from mytable) i
left join mytable t 
    on  t.itemid = i.itemid
    and t.saletime >= c.dt
    and t.saletime <  dateadd(day, 1, c.dt)
group by c.dt, i.itemid

In a real life situation, you would probably have a separate referential table to store the items, that you would use instead of the select distinct subquery.
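
To see the recursive dates, cross join, left join, and aggregation steps working together, here is a minimal sketch in SQLite via Python's sqlite3 module. The rows are made up, the join is assumed to compare against the `saletime` column, and `date(x, '+1 day')` replaces the T-SQL `dateadd(day, 1, x)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, itemid TEXT, saletime TEXT);
    INSERT INTO mytable (itemid, saletime) VALUES
        ('apple',  '2021-05-01 09:15:00'),
        ('apple',  '2021-05-01 17:40:00'),
        ('banana', '2021-05-03 11:05:00');
""")

query = """
WITH RECURSIVE cte(dt, max_dt) AS (
    -- anchor: first and last sale dates in the table
    SELECT MIN(date(saletime)), MAX(date(saletime)) FROM mytable
    UNION ALL
    -- one row per day until the last sale date is reached
    SELECT date(dt, '+1 day'), max_dt FROM cte WHERE dt < max_dt
)
SELECT c.dt, i.itemid, COUNT(t.id) AS sale_count
  FROM cte c
 CROSS JOIN (SELECT DISTINCT itemid FROM mytable) i
  LEFT JOIN mytable t
    ON t.itemid = i.itemid
   AND t.saletime >= c.dt                     -- half-open range:
   AND t.saletime <  date(c.dt, '+1 day')     -- [day start, next day start)
 GROUP BY c.dt, i.itemid
 ORDER BY c.dt, i.itemid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Counting `t.id` rather than `*` is what makes the missing combinations come out as 0: the left join yields a NULL id for them, and COUNT ignores NULLs.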
