Using batching to do large operations without filling the transaction log

Published On: 2008-07-14 | By: Denny

Deleting large amounts of data from a table is a common task, but if you don't have a maintenance window to work within you can end up causing locking and blocking against the other processes which are accessing the database. Not to mention you will write a very large amount of data to the transaction log no matter what recovery model your database is set to.

Say you have a table with a date column and you need to delete a million plus records. Doing this in a single transaction will put all million deletes into your transaction log, plus cause any other processes which are trying to access the table to be blocked.

However, if we batch the operation into many smaller transactions, the transaction log will not fill up: we can back up the log using our normal log backup methods throughout the process, or, if the database uses the SIMPLE recovery model, committed transactions will be cleared from the log automatically.

In SQL 2000 and below you have to set the ROWCOUNT session setting to a number above 0, which causes SQL Server to stop each statement after it affects that many rows. In SQL 2005 we can use the TOP clause as part of our DELETE statement to the same effect, but without having to reset the session setting afterward.

In SQL 2000 or below you can use a syntax like this one.

DECLARE @c BIT, @d DATETIME
SET @d = DATEADD(yy, -1, GETDATE())
SET @c = 0
SET ROWCOUNT 1000
WHILE @c = 0
BEGIN
    DELETE FROM Table
    WHERE CreateDate < @d

    IF @@ROWCOUNT = 0
        SET @c = 1
END
SET ROWCOUNT 0

If you are using SQL 2005 you can use this very similar syntax.

DECLARE @c BIT, @d DATETIME
SET @d = DATEADD(yy, -1, GETDATE())
SET @c = 0
WHILE @c = 0
BEGIN
    DELETE TOP (1000) FROM Table
    WHERE CreateDate < @d

    IF @@ROWCOUNT = 0
        SET @c = 1
END

Both pieces of code are very similar: declare a variable which tells the loop when to exit, then start deleting the data in batches. If no records are deleted, set the variable to 1, causing us to exit the loop. This will usually take a little longer to complete than a single delete statement, but the system will continue to be responsive during the process.

The number of records which you are deleting per batch should be adjusted based on the width of your records, the load on the database at the time of deletion, and the speed of the hard drives which hold the data files and transaction logs. I usually start at 1000 records and see how the system responds. For narrow tables with just a few numeric columns I'll put it up as high as 50k or 100k records. For very wide tables I'll drop it down to 100 or 500 records if the system can't handle 1000.
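To make that tuning easier, the batch size can be pulled out into a variable (this is a sketch only; the table name dbo.AuditLog and its CreateDate column are hypothetical, and TOP with a variable requires SQL 2005 or later):

```sql
-- Sketch: batched delete with a tunable batch size (SQL 2005+).
-- dbo.AuditLog and CreateDate are hypothetical names.
DECLARE @BatchSize INT, @d DATETIME
SET @BatchSize = 1000      -- tune based on record width, load, and disk speed
SET @d = DATEADD(yy, -1, GETDATE())

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) FROM dbo.AuditLog
    WHERE CreateDate < @d

    IF @@ROWCOUNT = 0
        BREAK      -- nothing left older than a year; exit the loop
END
```

Each pass through the loop is its own transaction, so log backups taken while the loop runs can truncate the log as it goes.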

Denny

Update:
Sorry this post didn't look very good at first. The blog site seems to have eaten it, and I didn't notice until just now.


2 responses to “Using batching to do large operations without filling the transaction log”

  1. YSLGuru says:

    Is it possible to apply this same concept of minimal impact to the log file when performing maintenance tasks like index rebuilds? We have a 100GB DB with 2 very volatile tables which require index rebuilds weekly on a total of 6 indexes. The log file has 60GB of space to work with and under any other DB activity the log file seldom gets to a fourth of that size. But when we rebuild these indexes, all 60GB are consumed.

    We have the RECOVERY MODEL set to Bulk-Logged, something we can't change due to requirements by Microsoft's DPM software which we use for backups. Currently I'm changing the recovery model to SIMPLE, running the rebuilds, and switching back to Bulk-Logged once the job is done. This occurs outside normal user hours so there's nothing else going on with the DB when the rebuilds occur. Even though this approach works, I'd love to find a way to do this without changing the recovery model and without always having such a large amount of drive space available for the log file.

    Thanks

  2. Mrdenny says:

    No it’s not. There’s no way to batch index rebuild operations.

    By changing the recovery model from bulk-logged to simple you are breaking your transaction log chain, which will stop you from being able to restore your subsequent transaction log backups.

    Instead you should trigger a transaction log backup as soon as your reindex job is complete.

    About all you can do would be to do log backups between each index rebuild command. You’ll need to keep that amount of drive space reserved for your transaction log.
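    As a sketch of that approach (the index, table, database, and backup path names here are all hypothetical), the weekly job could alternate rebuilds with log backups:

    ```sql
    -- Rebuild the volatile indexes one at a time, backing up the log
    -- between each rebuild so log space is reused rather than accumulating.
    ALTER INDEX IX_Orders_CreateDate ON dbo.Orders REBUILD;
    BACKUP LOG MyDatabase TO DISK = 'X:\Backups\MyDatabase_log_1.trn';

    ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;
    BACKUP LOG MyDatabase TO DISK = 'X:\Backups\MyDatabase_log_2.trn';
    ```

    Each rebuild is still fully logged under bulk-logged recovery, but the log only has to hold one index's worth of activity at a time.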
