Multi-threading Backups

With the amount of data companies hold today, backing up databases within a reasonable amount of time has become harder and harder. Several of my clients have so many databases, with so much data in them, that they simply can’t complete full backups within a 24-hour window. This creates major problems for them, as backups end up running all through the business day. When there are lots of databases and each one holds a large amount of data, the best bet is to multi-thread the database backups so that multiple backups are taken at once.

But doing that gets tricky, fast. If you create a job for each database, you have to track those jobs and manually add a new backup job whenever a new database is added. You also need to track how long each database takes to back up so that you aren’t running too many backups at once, while also making sure the system isn’t sitting mostly idle during the maintenance window.

To get around this for these clients I put together a process that uses 8 database backup jobs per server, allowing for 8 databases to be backed up at a time. As a part of the process, the databases are sorted based on database size so that the duration of the backups across the backup jobs is spread out as evenly as possible. Depending on the skew in your database sizes the jobs may or may not run for the same duration.
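To make the size-based spreading concrete, here is a minimal Python sketch of one way to do it. It is not the logic from my stored procedures; it assumes a simple greedy approach (take the databases largest first and hand each one to whichever job currently has the least data assigned). The function name `assign_to_jobs` and the database names and sizes are made up for illustration.

```python
import heapq


def assign_to_jobs(db_sizes_mb, job_count=8):
    """Spread databases across backup jobs so total size per job is roughly even.

    db_sizes_mb: dict mapping database name to size in MB (hypothetical input).
    Returns a list of job_count lists of database names.
    """
    # Min-heap of (total MB assigned so far, job index), so the least-loaded
    # job is always popped first.
    heap = [(0.0, job) for job in range(job_count)]
    heapq.heapify(heap)
    jobs = [[] for _ in range(job_count)]

    # Largest databases first; ties are broken by name to keep the result stable.
    ordered = sorted(db_sizes_mb.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        total, job = heapq.heappop(heap)
        jobs[job].append(name)
        heapq.heappush(heap, (total + size, job))
    return jobs


if __name__ == "__main__":
    # Made-up sizes, only to show the shape of the output.
    sizes = {"Sales": 900_000, "Archive": 750_000, "CRM": 120_000,
             "HR": 40_000, "Reporting": 300_000, "Staging": 15_000}
    for job_number, databases in enumerate(assign_to_jobs(sizes, job_count=3), 1):
        print(f"Backup job {job_number}: {databases}")
```

Size is only a stand-in for backup duration, so with a heavily skewed set of databases (one huge database and many small ones) the jobs will still finish at different times.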

My deployment script creates a new database called “BackupData”. You can change the name if you want, but you’ll need to update all the stored procedures to use whatever database name you specify. This database contains a single table, which is a cached copy of the output of sp_helpdb. The cache reduces the number of times sp_helpdb is called, as it’s an expensive stored procedure to run and the exact values it returns aren’t needed every time. There’s a separate job which runs sp_helpdb nightly to refresh the cached values.

When you run the SQL script below, it’ll create a bunch of SQL Agent jobs as well as the needed database, table and stored procedures. If you don’t want differential backups, you can simply delete the differential jobs and change the full backups to run daily instead of weekly.

You’ll want to change the network paths listed on lines 160-164 to the network share or folder that you want to back up to. In this case I’m using a network share. If you are going to use a network share, you’ll want 10 Gig networking in place as well as SMB3 on both the SQL Server and the target server (Windows Server 2012 or newer), plus storage that is fast enough to keep up with writes at that speed. If you are writing to a virtual tape library like a Data Domain, you’ll want to ensure that it supports SMB3, as you’ll be writing a lot of data to it very quickly.

This stored procedure code is a little rough and the documentation isn’t all that great, but it gets the job done.

You can download the SQL Script to make all this happen here.


