• Skip to main content
  • Skip to primary sidebar

DallasDBAs.com

SQL Server Database Consulting

  • Services
    • SQL Server Health Check
    • Fractional DBA
  • Pocket DBA®
  • Blog
    • Speaking
  • Testimonials
  • Contact
  • About

HADR

Why Your SQL Server Backups Aren’t Enough

April 9, 2025 by Kevin3NF Leave a Comment

SQL Server backups are database insurance – you don’t really appreciate them until disaster strikes. But here’s the kicker: having backups doesn’t mean your data is safe. If you’re not regularly doing test restores, your backup strategy is a ticking time bomb.

 

The False Sense of Security

Many IT leaders and system admins think, “We have full backups every night. We’re covered.” But when the time comes to restore, they discover:

· The backup file is corrupt.
· The storage location is inaccessible.
· The restore process takes way longer than expected.
· The recovery model wasn’t configured properly.
· The point-in-time restore doesn’t actually bring back the data they need.

At that point, it’s not a “backup strategy.” It’s a data loss incident.

Why Restores Must Be Tested

SQL Server backups are only as good as your ability to restore them. If you’re not proactively testing restores, you’re just hoping they work when disaster strikes. Hope is not a strategy.

• Test different restore scenarios: Full, differential, and transaction log restores.
• Validate integrity with DBCC CHECKDB after restores.
• Ensure RTO and RPO align with business needs. (Translation: Can you restore fast enough to avoid a business meltdown?)
• Automate restore tests so you’re not guessing when the pressure is on.

Automating Restore Tests with dbatools

If you’re not testing restores because it’s “too time-consuming,” dbatools.io has your back. The Test-DbaLastBackup command automates the process by:

1. Restoring the last backup of a database to a test environment.
2. Running DBCC CHECKDB to ensure data integrity.
3. Reporting on success or failure, so you know where issues exist before you actually need the backup.

Example Command:
Test-DbaLastBackup -SqlInstance YourSQLServer -Databases YourDatabase

This takes a lot of manual effort out of the equation—no more excuses for not testing!

Action Plan for SQL Server Resilience

1. Schedule regular test restores (weekly or monthly) and document the process.
2. Use DBCC CHECKDB to verify database integrity post-restore.
3. Leverage Test-DbaLastBackup to automate restore verification.
4. Confirm backup files are accessible and stored securely offsite.
5. Train your team on recovery processes—because backups are useless if no one knows how to restore them.
6. Monitor backup success, but more importantly, track restore success.

The Bottom Line:

A backup plan without restore testing is a disaster waiting to happen. Don’t wait until you’re knee-deep in downtime to realize your backups weren’t as solid as you thought. Test your restores. Validate your data. Sleep better at night.

Thanks for reading!

–Kevin

New Pocket DBA® clients get the first month FREE!

Get a Free SQL Server Health Check!

 

 

Filed Under: backup, HADR, Restore, SQL Tagged With: syndicated

SQL Server Backups are Not a High-Availability Solution

October 17, 2023 by Kevin3NF Leave a Comment

Please continue doing your backups!

Backups are Disaster Recovery, yes…but not HA.

Some will argue with this (in the comments most likely), but I broadly define “High Availability” as a system that can recover in seconds or minutes at most. Sometimes that is automatic, sometimes manual.

Backups might be quick to restore IF they are small enough and the right people are in place (not at lunch or otherwise out). But automated restores to prod just aren’t a thing.

SQL Server has this cute little marketing term called “Always On” which is nonsense. Always? Really? 9 Nines?

Always On covers both Failover Cluster Instances and Availability Groups. There are significant differences between the two. Both depend on the O/S Cluster…and they diverge a LOT from there. They are both HA.

Log Shipping (ancient tech) is great for DR and hitting your RPO number, but failover to a log shipped copy is manual.

Replication is not and never will be an HA or DR solution. Some things cannot be replicated, so they are lost if the publication database goes poof.

There are of course things outside of SQL Server that can help you hit your RPO/RTO goals. Feel free to share them.

What are you using for your HA solution?

Thanks for reading!

Kevin3NF

Follow @Dallas_DBAs

Filed Under: backup, HADR, SQL Tagged With: syndicated

SQL Server HADR overview

March 25, 2020 by Kevin3NF 2 Comments

My Pluralsight course for new SQL Server DBAs

Image credit to Jeff (t)

Back in June of 2019, I published this YouTube video covering the highlights of the various SQL Server High Availability and Disaster Recovery options. But I didn’t do any of it in writing like I usually do in the companion piece…so here we go!

First, some definitions:

  • High Availability (HA)—typically means that the database will be back online in seconds or minutes, not hours or days
  • Disaster Recovery (DR)—the ability to bring the database server and databases online after something really bad happens
  • HADR—An umbrella term covering any feature that encompasses HA, DR or both.
  • RPO—Recovery Point Objective—Essentially governs what point we can recover to (or…how much data loss can we handle)
  • RTO—Recovery Time Objective—How long to get everything running again.

SQL Server comes with a number of features that cover different scenarios, in different ways.  Some are only in the Enterprise edition.

I am going to cover these at a very high level so you can quickly get the idea in your head, and refer back to this post as needed later. We will start with the most basic.

Backup and Restore

It sounds simple, but backing up the databases is the simplest, most cost effective choice in a DR solution.  BACKUP DATABASE is available in all editions of SQL Server. Express edition makes it hard to automate, because there is no SQL Agent functionality.  There are answers for that, such as Windows Task Scheduler and the Ola Hallengren Scripts.

Backups by themselves are great, but in the event of an actual disaster, the ability to RESTORE them is the critical part. You should be backing up and test restoring your databases regularly.

This is database level only.  Backup all user and system databases.

Backup and Restore is DR, not HA.

Log Shipping (LS)

Transaction Log Shipping at its core is just a Backup-Copy-Restore process with some bells and whistles added to it.  GUI for setup, alert jobs to let you know when it gets behind, etc. But the basic concept is that every 15 minutes a T-log backup is taken, then copied to another server, then restored there.  In LS, you can set up the Secondary server to be read-only between restores. You can have multiple Secondaries.

This is database level only, so Logins, jobs, etc. are not copied.

You need to be familiar with File shares, UNC paths, permissions settings, etc. to set LS up.

Log Shipping is DR only, not HA unless you write a bunch of scripts to detect an issue, catch it all up and repoint your applications to the new Primary.

Database Mirroring (DBM):

Database mirroring is a deprecated feature, but it still exists in some recent versions of SQL Server. Check the Microsoft documentation for your version. DBM is available in Standard and Enterprise editions. As the name implies, the Principal copy of the database is mirrored to the Mirror copy on a secondary server.

DBM works at the transaction level, unlike Log Shipping which uses T-log backups. DBM can be set to be either Synchronous or Asynchronous modes, and either Automatic or Manual failover. Not all combinations of these two exist.

This is Database level, so jobs, logins, etc. do not participate.

Since Automatic failover is available, I classify DBM as HA, and DR at the Database level.

Always On—a marketing term from MS that includes 2 features:

SQL Server Failover Cluster Instance (FCI)

This is a traditional Windows Cluster (WSFC) that has been around for ages, with one or more SQL Server Instances installed into it.

Failover is handled by the Windows Cluster service and is usually very quick. A few seconds, but exceptions exist, such as 10000 databases all starting up at the same time.

An over-simplified explanation—2 or more servers (nodes) are connected to a SAN. The databases exist on the SAN, so in a failover situation, they don’t move.

If the SQL Instance on Server A becomes unreachable, the cluster stops the SQL Service there and starts it on Server B.

The SAN is a single point of failure, unless it is being replicated via some non-SQL Server technology. For this reason, a SQL FCI (in my opinion) is not a full HADR solution…but definitely IS HA.

SQL Server Availability Groups (AGs):

AGs are an Enterprise only feature as of this writing.

AGs are essentially a much-improved version of Database Mirroring. Transaction level data movement from Primary to Secondary for one or more databases in a Group. Multiple groups are allowed. Synchronous or Asynchronous. Manual or Automatic failover. Readable secondaries to offload reporting queries.

Standard Edition has a “Basic Availability Group” which has lots of limitations, chief among these being one database per group.

When set up correctly, AGs are both HA and DR for the user databases, with no single point of failure. There are script options to keep the jobs, logins, etc. in sync between the Primary and all Secondary replicas.

A Windows Cluster is required, but a SAN is not. AGs work off local storage, not shared since there is a copy of each database on each server participating.

The licensing of AGs has changed a lot, so I won’t get into it here…but you probably already know that Enterprise Edition licenses are VERY expensive. Plan accordingly with your license vendor.

Replication—a very Special Snowflake:

SQL SERVER REPLICATION IS NEITHER HA NOR DR.

Not everything in a SQL Server user database CAN be replicated, such as users, or tables with no Primary Key. New objects are not automatically sent from Publisher to Subscriber. System databases are not replicated.

Replication IS a great option to send a subset of your data to another server for reporting, or for filtering by region or salesperson.

Replication is for Distributed Data Processing. Backup and Restore beats Replication when you have to rebuild an environment, every time.

Summary:

I hope this helps you have a better understanding of the high level concepts of the various HA/DR options available to you. There are “gotchas” and details that are impossible to cover in this post. But, all of these are extensively well documented by Microsoft and the SQL Server Community at large.

Thanks for reading!

Kevin3NF

Follow @Dallas_DBAs

Filed Under: HADR Tagged With: syndicated

T-SQL Tuesday #118: Fantasy SQL Feature

September 10, 2019 by Kevin3NF Leave a Comment

T-SQL Tuesday is a monthly blog party for the SQL Server community. It is the brainchild of Adam Machanic (b|t) and this month’s edition is hosted by Kevin Chant (b|t), SQL dude and fellow cyclist!  Kevin wants to know what our “Fantasy SQL Feature” is.

I asked my LinkedIn connections a very similar question in my “Question of the Week” there.  Hit the link to see a lot of different responses.

One of the recurring themes and possibly the most common was:

Load Balanced Writes.  Unless I’ve completely missed something, there is nothing in SQL Server natively that will allow you to write to different copies of the database (leaving out Merge and Bi-Directional Transactional Replication, because they suck and don’t scale).

Don’t get me wrong…I love read-only replicas in an Availability Group, replication for reporting, or even delayed Log Shipping.  They are wonderful for taking the read traffic off an OLTP box, but if you are dying under heavy load in a very optimized setup…oh well.

Bonus Fantasy: Give me a button in the Log Shipping Setup GUI (Database Properties>>Log Shipping) that just says “Re-initialize”.  Sometimes LS just falls apart and its easier to tear it down and start over.  Even easier is to just back up the db, restore over the Secondary db and go.  But, for the small business using LS for poor-man’s DR and no DBA on staff…give me a button.  Ideally that button calls sp_ReInit_Log_Shipping @DBName = ‘MyDB’.  But someone will have to write that first.  I am not that someone.  This may already be in the DBATools.io set of toys, but I haven’t looked, and again…small customers need a button in a GUI.

Thanks for reading!

Kevin3NF

Follow @Dallas_DBAs

Filed Under: HADR, Performance, TSQL2sday

Video: SQL Server HA/DR Overview

June 24, 2019 by Kevin3NF Leave a Comment

This is meant to be a high-level overview, not an exhaustive setup guide.  I skip over some relevant bits so the main points don’t get buried.

In my DBA Fundamentals pre-conference, we spend a good hour going over the various High Availability and Disaster Recovery options available in SQL Server.  From simple Backup/Restore up to Distributed Availability Groups.

This video is a more succinct version of that conversation.   Enjoy!

If you have questions or comments, please add them in the YouTube comments, so I’ll see them more quickly.

Thanks for reading!

Kevin3NF

Follow @Dallas_DBAs

Filed Under: HADR, video

Your DR Plan isn’t a Plan

August 8, 2018 by Kevin3NF 9 Comments

 

My apologies, but someone had to tell you.

You don’t have a Disaster Recovery Plan™. You have a Disaster Recovery Hope. (paraphrased from a source I’ve forgotten…)

If I’m wrong (and I hope I am), its because you are in the 10% (optimistically) of companies that actually test their DR plans and document the results.

In the 18 years I’ve been asking the following question, I’ve gotten one excellent and correct response, and one that was good, but I still poked holes in it.

“If your most important database server melted right now, how long until you are back online?”

I get a lot of responses on this question:

  • 2 days
  • 30 minutes
  • I don’t know
  • Blank Stares (its a response…just a scary one)

If you are not testing your DR plan, you may be betting your company’s existence on a flawed premise: “Our people know what to do. We have backups.”

If your DR plan is a digital “living document” that lives on your Sharepoint server, you are already in trouble.  Everyone involved needs a current, non-digital copy on their desk and at their home.

Disaster recovery is about far more than restoring databases.  But, for my mostly-SQL Server audience, let me ask you this:

Can you do an emergency test restore of your most mission critical database onto identical hardware, to a point 27.5 minutes ago and have everything work as expected?

If you don’t do test restores, you are HOPING your backups are valid. If you don’t have the ability to replace critical hardware or spin up an identical VM, you are HOPING that your server or SAN never fails.

If you are running through your DR Plan regularly, with everyone ready to go and knowing their place in the process…you are HOPING the next disaster happens when everyone is available, online and up to speed.   What if that disaster happens when your Sr. DBA is on vacation?  Who is her backup? A Jr. DBA?

If you don’t know the answer to the “How long…” question above, you really should go find out who in your organization does.  If nobody does, contact me.

If you think you have it 100% nailed…I challenge you to let me into your server room unattended.

Thanks for reading!

Kevin3NF

Follow @Dallas_DBAs

Filed Under: backup, HADR

  • Go to page 1
  • Go to page 2
  • Go to Next Page »

Primary Sidebar

Search

Sign up for blogs, DBA availability and more!

Home Blog About Privacy Policy
  • Home-draft
  • Blog
  • About Us

Copyright © 2025 · WordPress · Log in

 

Loading Comments...