How to Efficiently Delete Rows from a Table with 500,000+ Rows Without Using TRUNCATE TABLE

How to delete a large amount of data from a table in SQL without logging?


  1. If you are deleting all the rows in the table, the simplest option is TRUNCATE TABLE, something like:

     TRUNCATE TABLE LargeTable
     GO

TRUNCATE TABLE simply empties the table: it is minimally logged (only the page deallocations are recorded), you cannot use a WHERE clause to limit the rows being deleted, and no triggers will be fired.


  2. On the other hand, if you are deleting 80-90 percent of the data (say you have 11 million rows in total and want to delete 10 million), another way is to insert the 1 million rows you want to keep into a staging table, truncate this large table, and then insert those 1 million rows back, as sketched below.
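A minimal sketch, assuming a hypothetical LargeTable whose readTime column decides which rows to keep (note that SELECT ... INTO copies the data but not indexes or constraints):

     -- copy the rows to KEEP into a staging table
     SELECT *
     INTO StagingTable
     FROM LargeTable
     WHERE readTime >= DATEADD(MONTH, -7, GETDATE())

     TRUNCATE TABLE LargeTable

     -- put the kept rows back, then drop the staging copy
     INSERT INTO LargeTable
     SELECT * FROM StagingTable

     DROP TABLE StagingTable
     GO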

  3. Or, if no permissions, views, or other objects that have this large table as their underlying table would be affected by dropping it, you can get the relatively small number of rows to keep into another table, drop this table, create another table with the same schema, and import the rows back into this ex-large table, as sketched below.
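A sketch of that variant under the same assumptions; KeepRows is a hypothetical holding table, and the recreate step must reproduce the original schema, indexes, and constraints:

     SELECT *
     INTO KeepRows
     FROM LargeTable
     WHERE readTime >= DATEADD(MONTH, -7, GETDATE())

     DROP TABLE LargeTable

     -- recreate LargeTable with the original schema here (DDL not shown)

     INSERT INTO LargeTable
     SELECT * FROM KeepRows

     DROP TABLE KeepRows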

  4. One last option I can think of is to change your database's recovery model to SIMPLE and then delete rows in smaller batches using a while loop, something like this:

     DECLARE @Deleted_Rows INT;
     SET @Deleted_Rows = 1;

     WHILE (@Deleted_Rows > 0)
     BEGIN
         -- delete a small batch of rows at a time
         DELETE TOP (10000)
         FROM LargeTable
         WHERE readTime < DATEADD(MONTH, -7, GETDATE());

         -- the loop exits once a batch deletes nothing
         SET @Deleted_Rows = @@ROWCOUNT;
     END

and don't forget to change the recovery model back to FULL afterwards; note that you then have to take a full (or differential) backup to restart the log backup chain and make the change fully effective.
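A hedged sketch of those surrounding steps (MyDb and the backup path are placeholders):

ALTER DATABASE MyDb SET RECOVERY SIMPLE;

-- ... run the batched delete loop above ...

ALTER DATABASE MyDb SET RECOVERY FULL;

-- a full backup restarts the log backup chain after the switch back to FULL
BACKUP DATABASE MyDb TO DISK = N'D:\Backups\MyDb_full.bak';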

Deleting millions of rows in MySQL


DELETE FROM `table`
WHERE (whatever criteria)
ORDER BY `id`
LIMIT 1000

Wash, rinse, repeat until zero rows are affected, ideally in a script that sleeps for a second or three between iterations.
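A minimal sketch of that loop as a MySQL stored procedure; `my_table`, the `created_at` predicate, the batch size, and the sleep interval are all placeholder assumptions:

DELIMITER $$

CREATE PROCEDURE batch_delete()
BEGIN
    DECLARE affected INT DEFAULT 1;

    WHILE affected > 0 DO
        -- delete one small batch in index order
        DELETE FROM `my_table`
        WHERE created_at < NOW() - INTERVAL 7 MONTH
        ORDER BY `id`
        LIMIT 1000;

        -- ROW_COUNT() reports the rows changed by the last statement
        SET affected = ROW_COUNT();

        -- breathe between batches to reduce contention and replica lag
        DO SLEEP(2);
    END WHILE;
END$$

DELIMITER ;

CALL batch_delete();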

Deleting 1 million rows in SQL Server

Here is a structure for a batched delete as suggested above. Do not try 1M at once...

The size of the batch and the WAITFOR delay are obviously quite variable, and depend on your server's capabilities as well as your need to mitigate contention. You may need to manually delete some rows, measuring how long they take, and adjust your batch size to something your server can handle. As mentioned above, anything over roughly 5,000 rows in a single statement risks lock escalation to a table lock (which I was not aware of).

This would be best done after hours... but 1M rows is really not a lot for SQL Server to handle. If you watch the Messages tab in SSMS, the output may take a while to show up, but it will appear after several batches; just be aware it won't update in real time.

Edit: Added a stop time via @MAXRUNTIME and @BSTOPATMAXTIME. If you set @BSTOPATMAXTIME to 1, the script will stop on its own at the desired time, say 8:00AM. This way you can schedule it nightly to start at, say, midnight, and it will stop before production hours at 8AM.

Edit: This answer is pretty popular, so I have added RAISERROR in lieu of PRINT, per the comments.

DECLARE @BATCHSIZE INT, @WAITFORVAL VARCHAR(8), @ITERATION INT, @TOTALROWS INT,
        @MAXRUNTIME VARCHAR(8), @BSTOPATMAXTIME BIT, @MSG VARCHAR(500)
SET DEADLOCK_PRIORITY LOW;
SET @BATCHSIZE = 4000
SET @WAITFORVAL = '00:00:10'
SET @MAXRUNTIME = '08:00:00' -- 8AM
SET @BSTOPATMAXTIME = 1 -- ENFORCE 8AM STOP TIME
SET @ITERATION = 0 -- LEAVE THIS
SET @TOTALROWS = 0 -- LEAVE THIS

WHILE @BATCHSIZE > 0
BEGIN
    -- IF @BSTOPATMAXTIME = 1, STOP THE WHOLE JOB AT THE SET TIME
    IF CONVERT(VARCHAR(8), GETDATE(), 108) >= @MAXRUNTIME AND @BSTOPATMAXTIME = 1
    BEGIN
        RETURN
    END

    DELETE TOP(@BATCHSIZE)
    FROM SOMETABLE
    WHERE 1=2 -- placeholder predicate: replace with your actual delete condition

    SET @BATCHSIZE = @@ROWCOUNT
    SET @ITERATION = @ITERATION + 1
    SET @TOTALROWS = @TOTALROWS + @BATCHSIZE
    SET @MSG = 'Iteration: ' + CAST(@ITERATION AS VARCHAR) + ' Total deletes:' + CAST(@TOTALROWS AS VARCHAR)

    -- RAISERROR ... WITH NOWAIT flushes the message immediately, unlike PRINT
    RAISERROR (@MSG, 0, 1) WITH NOWAIT

    WAITFOR DELAY @WAITFORVAL
END

Removing rows without transaction logging?

In simple terms, you can't get around the transaction logging. That's just how the database ensures consistency: if a transaction fails halfway through (or the server's power fails, for example), the database engine needs to know how to get back into a consistent state. Also, appending the changes to the transaction log is much faster than actually performing the change on the data files of the DB, especially in cases like yours.

There are a few special cases where it's safe to get around those things: TRUNCATE TABLE will remove all the rows at once, and only works if the table isn't referenced by any foreign keys, which makes it rather trivial. You can't limit it in any way, though.
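A hedged illustration with hypothetical table and constraint names: TRUNCATE is rejected while an enabled foreign key references the table, so the constraint has to be dropped and recreated around it:

-- TRUNCATE TABLE dbo.Parent fails while FK_Child_Parent references it
ALTER TABLE dbo.Child DROP CONSTRAINT FK_Child_Parent;

TRUNCATE TABLE dbo.Parent;

ALTER TABLE dbo.Child ADD CONSTRAINT FK_Child_Parent
    FOREIGN KEY (ParentId) REFERENCES dbo.Parent (Id);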

The newly freed space will be reclaimed as part of the database maintenance cycle: checkpoints write the changed data into the data files, and log backups allow the transaction log space inside the DB to be reused (I'm oversimplifying, since there are a lot of possible configurations; in any case, this is something your DBA should care about).

If this is posing a problem for you, the solution isn't to get around the transaction logging anyway. You should probably ask why (and how often) you need to delete millions of rows at a time.

Oracle database: truncate table (or something similar, but not DELETE) partially, based on a condition

It depends on how your table is organised.

1) If your (large) table is partitioned on a matching condition (e.g. you want to delete the previous month's data and your table is partitioned by month), you could truncate only that partition instead of the entire table, as sketched below.
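A minimal sketch, assuming my_table is range-partitioned by month and a hypothetical partition p_prev_month holds the data to remove:

alter table my_table truncate partition p_prev_month;

-- or, to keep global indexes usable without a rebuild:
alter table my_table truncate partition p_prev_month update global indexes;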

2) The other option, provided you have some downtime, would be to insert the data that you want to keep into a staging table, truncate the original table, and then load the data back:

-- copy the rows you want to keep into a staging table
insert into staging_table
select * from my_table
where <condition>;

commit;

truncate table my_table;

-- load the kept rows back
insert into my_table
select * from staging_table;

commit;

-- since the amount of data might change considerably,
-- you might want to collect statistics again

exec dbms_stats.gather_table_stats
    (ownname => 'SCHEMA_NAME',
     tabname => 'MY_TABLE');

Delete all rows in table

TRUNCATE TABLE is faster than DELETE FROM. DELETE is slow because it removes and logs one row at a time, while TRUNCATE deallocates whole data pages. There are a few situations where TRUNCATE doesn't work (foreign key references, for example), which you can read about on MSDN.
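For completeness, a side-by-side of the two (the table name is a placeholder):

-- minimally logged: deallocates whole pages and resets any IDENTITY seed
TRUNCATE TABLE dbo.SomeTable;

-- fully logged: removes and logs every row individually
DELETE FROM dbo.SomeTable;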


