Looping a SQL Insert Statement with Dates

While loop in SQL Server 2008 iterating through a date-range and then INSERT

SQL is a set based language and loops should be a last resort. So the set based approach would be to first generate all the dates you require and insert them in one go, rather than looping and inserting one at a time. Aaron Bertrand has written a great series on generating a set or sequence without loops:

  • Generate a set or sequence without loops – part 1
  • Generate a set or sequence without loops – part 2
  • Generate a set or sequence without loops – part 3

Part 3 is specifically relevant as it deals with dates.

Assuming you don't have a Calendar table you can use the stacked CTE method to generate a list of dates between your start and end dates.

DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();

WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2)
SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY N) - 1, @StartDate)
FROM N3;

I have skipped some detail on how this works as it is covered in the linked article, in essence it starts with a hard coded table of 10 rows, then joins this table with itself to get 100 rows (10 x 10) then joins this table of 100 rows to itself to get 10,000 rows (I stopped at this point but if you require further rows you can add further joins).

At each step the output is a single column called N with a value of 1 (to keep things simple). At the same time as defining how to generate 10,000 rows, I actually tell SQL Server to only generate the number needed by using TOP and the difference between your start and end date - TOP(DATEDIFF(DAY, @StartDate, @EndDate) + 1). This avoids unnecessary work. I had to add 1 to the difference to ensure both dates were included.

Using the ranking function ROW_NUMBER() I add an incremental number to each of the rows generated, then I add this incremental number to your start date to get the list of dates. Since ROW_NUMBER() begins at 1, I need to deduct 1 from this to ensure the start date is included.

Then it would just be a case of excluding dates that already exist using NOT EXISTS. I have enclosed the results of the above query in their own CTE called dates:

DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();

WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
Dates AS
( SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY N) - 1, @StartDate)
FROM N3
)
INSERT INTO MyTable ([TimeStamp])
SELECT Date
FROM Dates AS d
WHERE NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE d.Date = t.[TimeStamp])

Example on SQL Fiddle


If you were to create a calendar table (as described in the linked articles) then it may not be necessary to insert these extra rows, you could just generate your result set on the fly, something like:

SELECT  [Timestamp] = c.Date,
t.[FruitType],
t.[NumOffered],
t.[NumTaken],
t.[NumAbandoned],
t.[NumSpoiled]
FROM dbo.Calendar AS c
LEFT JOIN dbo.MyTable AS t
ON t.[Timestamp] = c.[Date]
WHERE c.Date >= @StartDate
AND c.Date < @EndDate;

ADDENDUM

To answer your actual question your loop would be written as follows:

DECLARE @StartDate AS DATETIME
DECLARE @EndDate AS DATETIME
DECLARE @CurrentDate AS DATETIME

SET @StartDate = '2015-01-01'
SET @EndDate = GETDATE()
SET @CurrentDate = @StartDate

WHILE (@CurrentDate < @EndDate)
BEGIN
IF NOT EXISTS (SELECT 1 FROM myTable WHERE myTable.Timestamp = @CurrentDate)
BEGIN
INSERT INTO MyTable ([Timestamp])
VALUES (@CurrentDate);
END

SET @CurrentDate = DATEADD(DAY, 1, @CurrentDate); /*increment current date*/
END

Example on SQL Fiddle

I do not advocate this approach, just because something is only being done once does not mean that I should not demonstrate the correct way of doing it.


FURTHER EXPLANATION

Since the stacked CTE method may have over complicated the set based approach I will simplify it by using the undocumented system table master..spt_values. If you run:

SELECT Number
FROM master..spt_values
WHERE Type = 'P';

You will see that you get all the numbers from 0 -2047.

Now if you run:

DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();

SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P';

You get all the dates from your start date to 2047 days in the future. If you add a further where clause you can limit this to dates before your end date:

DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();

SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY, number, @StartDate) <= @EndDate;

Now you have all the dates you need in a single set based query you can eliminate the rows that already exist in your table using NOT EXISTS

DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();

SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY, number, @StartDate) <= @EndDate
AND NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE t.[Timestamp] = DATEADD(DAY, number, @StartDate));

Finally you can insert these dates into your table using INSERT

DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();

INSERT YourTable ([Timestamp])
SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY, number, @StartDate) <= @EndDate
AND NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE t.[Timestamp] = DATEADD(DAY, number, @StartDate));

Hopefully this goes some way to showing that the set based approach is not only much more efficient it is simpler too.

Loop through date range and insert when no data found

As a more concrete example of the MERGE approach already suggested:

PROCEDURE Insert_Data(Start_Date DATE, End_Date DATE)
IS
BEGIN
MERGE INTO calendar ca
USING (
SELECT Insert_Data.Start_Date + level - 1 as cal_date
FROM dual
CONNECT BY level <= Insert_Data.End_Date - Insert_Data.Start_Date + 1
) t
ON (t.cal_date = ca.cal_date)
WHEN NOT MATCHED THEN INSERT VALUES (t.cal_date);
END Insert_Data;

It doesn't need to be a procedure at all, but that seems to be a requirement on its own. You can just run the merge as plain SQL, using your date range directly instead of through variables. (Or as bind variables, depending on how/where you're running this).

The USING clause is a generated table that creates all of the dates in the supplied range, using a common CONNECT BY method. The LEVEL pseudocolumn is similar to the loop you're trying to do; overall the inner query generates all dates in your range as an inline view, which you can then use to check against the actual table. The rest of the statement only inserts new records from that range if they don't already exit.

You could also do the same thing manually, and less efficiently, with a NOT EXISTS check:

PROCEDURE Insert_Data(Start_Date DATE, End_Date DATE)
IS
BEGIN
INSERT INTO calendar
WITH t AS (
SELECT Insert_Data.Start_Date + level - 1 as cal_date
FROM dual
CONNECT BY level <= Insert_Data.End_Date - Insert_Data.Start_Date + 1
)
SELECT cal_date
FROM t
WHERE NOT EXISTS (
SELECT 1
FROM Calendar
WHERE Calendar.cal_date = t.cal_date
);
END Insert_Data;

SQL Fiddle.


You have a few other issues in your procedure.

This is redundant because of the form of cursor-for loop you're using:

  cal_v        calendar%rowtype;

You have an unnecessary nested block here; it doesn't hurt I suppose but it isn't adding anything either. The first BEGIN, DECLARE and the first END can be removed (and the alignment is a bit off):

  BEGIN  -- remove
DECLARE -- remove
CURSOR cal_c IS
SELECT *
FROM Calendar;
BEGIN
...
END; -- remove
END Insert_Data;

The outer loop, and the entire cursor, isn't needed; it actually means you're repeating the inner loop which actually does the work (or tries to, the first time anyway) as many times as there are existing records in the calendar table, which is pointless and slow:

      FOR cal_v IN calc_c LOOP
FOR date_v IN Insert_Data.Start_Date..Insert_Data.End_Date LOOP
...
END LOOP;
END LOOP;

The inner loop won't compile as you can't use dates for a range loop, only integers (giving PLS-00382):

       FOR date_v IN Insert_Data.Start_Date..Insert_Data.End_Date LOOP

The innermost select doesn't have an INTO; this won't compile either:

          SELECT * FROM calendar WHERE calc_v.calc_date = date_v;

The insert needs the value to be enclose in parentheses:

            INSERT INTO calendar VALUES date_v;

So if you really did want to do it this way you'd do something like:

PROCEDURE Insert_Data(Start_Date DATE, End_Date DATE)
IS
tmp_date DATE;
BEGIN
FOR i IN 0..(Insert_Data.End_Date - Insert_Data.Start_Date) LOOP
BEGIN
dbms_output.put_line(i);
SELECT cal_date INTO tmp_date FROM calendar
WHERE cal_date = Insert_Data.Start_Date + i;
EXCEPTION
WHEN NO_DATA_FOUND THEN
INSERT INTO calendar VALUES (Insert_Data.Start_Date + i);
END;
END LOOP;
END Insert_Data;

... but really, use merge.

How can I loop-insert incremental dates into MySQL?

Something like this will do it:-

INSERT INTO my_table (the_date)
SELECT ADDDATE('2013-04-13', INTERVAL SomeNumber DAY)
FROM (SELECT a.i+b.i*10+c.i*100+d.i*1000 AS SomeNumber FROM integers a, integers b, integers c, integers d) Sub1
WHERE SomeNumber BETWEEN 0 AND 1000

Relies on a table called integers with a single column called i with 10 rows, values from 0 to 9.

The Between clause is just there so you can limit the range of numbers to add to the date

How can INSERT INTO a table 300 times within a loop in SQL?

You may try it like this:

DECLARE @i int = 0
WHILE @i < 300
BEGIN
SET @i = @i + 1
/* your code*/
END

SQL fill next date (month) with loop

Paul, I'm assuming that you forgot to specify the month in your mock data.

I hope the code below may help you understand how non-trivial is what you are trying to accomplish :-) Kudos for your will to get rid of loops.

To make it better, I propose a denormalization (CAUTION!):

  • create another column price_valid_until
  • the latest prices records will have price_valid_until = '21000101' (aka, far away in the future)
  • when registering a new price, update the previous with new price_valid_from - 1 day

Here's the solution, with a pretty complex, but efficient query (http://sqlfiddle.com/#!18/4ab23/4)

create table price_history(
SKU varchar(255),
price_valid_from date,
price decimal(16, 2)
)

insert into price_history
values
('a', '20210101', 10),
('a', '20210107', 12),
('b', '20210102', 4),
('b', '20210110', 2),
('b', '20210214', 5);

-- This fiddler won't let me initialize and reference:
--
-- declare
-- @from_date date,
-- @to_date date;
--
-- select
-- @from_date = min(date_from),
-- @to_date = max(date_from)
-- from price_history

with
date_range as(
select
min(price_valid_from) as from_date,
--
eomonth(
max(price_valid_from)
) as to_date
from price_history
),
--
all_dates as(
select from_date as date_in_range
from date_range
-- ----------
union all
-- ----------
select dateadd(day, 1, date_in_range)
from all_dates
where
date_in_range < (
select to_date
from date_range
)
),
--
price_history_boundaries as(
select
ph.SKU,
ph.price,
--
ph.price_valid_from,
-- The latest price, so far, is valid until 01/01/2100
coalesce(
dateadd(
day,
-1,
min(ph_next.price_valid_from)
),
'21000101'
) as price_valid_until
from
price_history ph
left outer join price_history ph_next
on(
ph_next.SKU = ph.SKU
and ph_next.price_valid_from > ph.price_valid_from
)
group by ph.SKU, ph.price_valid_from, ph.price
)
select
phb.SKU,
ad.date_in_range,
phb.price
from
all_dates ad
inner join price_history_boundaries phb
on(
phb.price_valid_from <= ad.date_in_range
and phb.price_valid_until >= ad.date_in_range
)
order by phb.SKU, ad.date_in_range

MYSQL Procedure for while loop to add date

The first problem I noticed is dateStart needs to have a data type in the DECLARE statement. In this case it should be DATETIME. Also the dateStart is larger than the value in while statement. So the while loop never executes.

Try this instead:

DELIMITER //

CREATE PROCEDURE loop_date()
BEGIN
DECLARE dateStart DATETIME DEFAULT '2019-01-01 03:00:00';

WHILE (dateStart < '2020-01-01 00:00:00') DO
INSERT INTO my_table(time_key) VALUES (dateStart);
SET dateStart = dateStart + INTERVAL 3 HOUR;
END WHILE;
END //

DELIMITER ;

CALL loop_date();

For me doing 1 year difference took 9.57 seconds.

UPDATE: This was in response to your original question which has been changed.

Loop Insert Date a daily commit day to day timestamp ORACLE

As @boneist pointed out, your manipulation using numbers isn't going to work. You should keep the data type as it is and compare with values of the same data type.

Assuming you have a legitimate need to do this in a loop you could do something like this:

BEGIN
FOR r IN (
select date '2017-01-01' + level -1 as this_date
from dual
connect by level <= date '2018-01-01' - date '2017-01-01'
)
LOOP
INSERT INTO file_backup
SELECT *
FROM file_core
WHERE datecreated >= r.this_date
AND datecreated < r.this_date + 1;
COMMIT;
END LOOP;
END;
/

Or if the data type is actually a timestamp rather than a date as suggested in a comment, something like:

BEGIN
FOR r IN (
select timestamp '2017-01-01 00:00:00'
+ (level -1) * interval '1' day as this_timestamp
from dual
connect by level <= extract(day from timestamp '2018-01-01 00:00:00'
- timestamp '2017-01-01 00:00:00')
)
LOOP
INSERT INTO file_backup
SELECT *
FROM file_core
WHERE datecreated >= r.this_timestamp
AND datecreated < r.this_timestamp + interval '1' day;
COMMIT;
END LOOP;
END;
/

... though you might want to work on the condition for the connect-by query, e.g.

  FOR r IN (
select timestamp '2017-01-01 00:00:00'
+ numtodsinterval(level -1, 'DAY') as this_timestamp
from dual
connect by timestamp '2017-01-01 00:00:00'
+ numtodsinterval(level -1, 'DAY') < timestamp '2018-01-01 00:00:00'
)
LOOP
...

or as @boneist suggested in a comment, with a simpler loop:

BEGIN
FOR num_days in 0..(date '2018-01-01' - date '2017-01-01' - 1)
LOOP
INSERT INTO file_backup
SELECT *
FROM file_core
WHERE datecreated >= timestamp '2017-01-01 00:00:00'
+ numtodsinterval(num_days, 'DAY')
AND datecreated < timestamp '2017-01-01 00:00:00'
+ numtodsinterval(num_days + 1, 'DAY');
COMMIT;
END LOOP;
END;
/

The main problem with this approach is restartability. If there is an error part way through the loop you can't just re-run it, as you'd be inserting duplicates.

Multiple inserts and commits are also less efficient that a single insert, or even multiple inserts and a single commit. If you don't have enough undo space to allow a single transaction to do all the work you need you should be fixing the database configuration to allow that, rather than working around it and potentially compromising data integrity.

i need backup this table . and insert only 2 month in new table

That sounds like you need to partition the table by month and use partition swaps to shift old months from the live to the backup table, perhaps. Partitioning costs more but if you have those data volumes it may be justified.

Failing that you could consider renaming your current table to backup, recreating your original table, and just copying the two months' worth of data you want to keep back to that. But that's a one-off thing, you still have the ongoing problem of ageing records out of the main table and into backup. And it has its own issues with dependencies, constraints, etc.



Related Topics



Leave a reply



Submit