“SQL Server Is Slow” – Part 2 of 4

The 10-Minute Outside-In Triage

Don’t Blame SQL First

It’s 9:05 AM and your helpdesk lights up: “The SQL Server is down. Nothing works.”

By 9:07, everyone is staring at you.

The trap: you open SSMS and start digging for blocking queries. But what if the database isn’t the problem at all?

I’ve seen teams lose hours chasing SQL when the real culprit was a Windows power plan, an antivirus update, or a snapshot on the storage array.

That’s why step two in our framework is a 10-minute outside-in triage. You prove or eliminate the usual suspects before you go digging inside SQL Server.

The Outside-In Checklist

When an incident is happening right now, you don’t need deep analysis. You need to rule out the obvious and collect proof.

Step 1: Blast Radius (1 min)

Are other apps or servers slow too?
Are all users affected, or just one?
Cloud? Check provider status pages and throttling alerts.
If everything is slow, SQL may not be the root cause.

Step 2: Host & OS Health (3 – 4 min)

CPU: Is the whole server pegged, or just sqlservr.exe?
- I have a story here…coined the term “three finger salute” from it (CTRL-ALT-DELETE). Ask me if we ever meet in person!
Hypervisor: Look for VM “ready time” or “steal time.”
Power plan: Make sure Windows is on High Performance.
Memory: Check for paging, and compare committed memory against installed RAM.
Antivirus: Any active scans?
Windows Update: Patches or Defender scans running?
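
The power plan item is easy to script. Here is a minimal Python sketch, assuming you capture the text printed by the Windows `powercfg /getactivescheme` command and compare its GUID with the built-in High performance scheme. The function names are illustrative only, and a custom or vendor-supplied plan would carry a different GUID, so treat a mismatch as a prompt to look rather than as proof.

```python
import re
import subprocess

# GUID of the built-in Windows "High performance" power scheme.
HIGH_PERFORMANCE_GUID = "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"

# Matches a GUID, optionally followed by the scheme name in parentheses.
_SCHEME_RE = re.compile(
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\s*(?:\((.*?)\))?"
)


def parse_active_scheme(powercfg_output):
    """Return (guid, name) parsed from `powercfg /getactivescheme` output."""
    match = _SCHEME_RE.search(powercfg_output)
    if not match:
        raise ValueError("no power scheme GUID found in the output")
    return match.group(1).lower(), (match.group(2) or "").strip()


def is_high_performance(powercfg_output):
    """True when the active scheme is the built-in High performance plan."""
    guid, _name = parse_active_scheme(powercfg_output)
    return guid == HIGH_PERFORMANCE_GUID


def read_active_scheme():
    """Run powercfg locally (Windows only) and return its raw output."""
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the SQL Server host itself, `is_high_performance(read_active_scheme())` gives a yes/no answer you can paste into the incident notes.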

Step 3: Storage Sanity (3 – 4 min)

Latency: Are read/write times spiking?
Queue depth: Backlog on MDF, LDF, or tempdb volumes?
Snapshots/backup agents: Running now?
Cloud disks: Bursting credits exhausted?
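
For the latency question, one common approach is to take two snapshots of the cumulative counters in `sys.dm_io_virtual_file_stats` a short time apart and divide the change in stall time by the change in I/O count. The Python sketch below only does that arithmetic. The field names mirror the DMV columns, while the class and function names are made up for illustration, and fetching the snapshots from SQL Server is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileIoSnapshot:
    """One row of cumulative per-file counters at a point in time."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def avg_latency_ms(before, after):
    """Return (read_ms, write_ms) average latency between two snapshots.

    The counters are cumulative, so only the difference between the two
    snapshots describes what is happening right now.
    """
    reads = after.num_of_reads - before.num_of_reads
    writes = after.num_of_writes - before.num_of_writes
    read_stall = after.io_stall_read_ms - before.io_stall_read_ms
    write_stall = after.io_stall_write_ms - before.io_stall_write_ms
    # Avoid dividing by zero when a file saw no I/O in the interval.
    read_ms = read_stall / reads if reads > 0 else 0.0
    write_ms = write_stall / writes if writes > 0 else 0.0
    return read_ms, write_ms
```

Running this per file for the data, log, and tempdb files shows which volume is actually spiking, rather than relying on averages since the last restart.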

Step 4: Network Checks (1 – 2 min)

Is RDP sluggish? File copies slow?
Any new firewall/VPN/SSL changes?
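
If "sluggish" needs a number, timing a plain TCP connection from an app server to the SQL Server host is a rough but quick signal. This Python sketch assumes the instance listens on the default port 1433, which may not be true in your environment, and the function name is just illustrative. It measures connection setup only, not query time.

```python
import socket
import time


def tcp_connect_ms(host, port=1433, timeout=3.0):
    """Return the time in milliseconds to open a TCP connection.

    Raises OSError (including timeouts) if the connection cannot be made,
    which is itself useful evidence during triage.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # Connected; close the socket straight away.
    return (time.perf_counter() - start) * 1000.0
```

Run it a few times from the affected client and from a machine on the same subnet as the server. A large gap between the two points at the path in between, not at SQL Server.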

Step 5: Quick SQL Health Pulse (1 – 2 min)

(You’re not tuning queries here; you’re checking basic health.)

Blocking chains: Is one session holding everyone else hostage? sp_WhoIsActive is great for this!
Current waits: PAGEIOLATCH_* (storage), WRITELOG (log), ASYNC_NETWORK_IO (client/network). Use the SQLskills Waits Now script.
Active jobs: backups, CHECKDB, index maintenance colliding with business hours.
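
The first two checks above can be sketched in a few lines of Python. This assumes you have already pulled session_id and blocking_session_id pairs (for example from `sys.dm_exec_requests`) and a list of current wait types. The mapping covers only the three wait families named above, and the function names are illustrative, not part of any tool.

```python
# Wait-type prefixes from the checklist and where each one points.
WAIT_HINTS = {
    "PAGEIOLATCH": "storage",
    "WRITELOG": "log",
    "ASYNC_NETWORK_IO": "client/network",
}


def classify_wait(wait_type):
    """Map a wait type such as PAGEIOLATCH_SH to a rough triage area."""
    name = wait_type.upper()
    for prefix, hint in WAIT_HINTS.items():
        if name.startswith(prefix):
            return hint
    return "other"


def head_blockers(blocked_by):
    """Return the sessions at the head of blocking chains, sorted.

    blocked_by maps session_id -> blocking_session_id, where 0 or None
    means the session is not blocked. A blocker missing from the map
    (for example an idle session holding an open transaction) is
    treated as a head blocker.
    """
    blockers = {b for b in blocked_by.values() if b is not None and b > 0}
    return sorted(s for s in blockers if not blocked_by.get(s))
```

For example, `head_blockers({51: 0, 52: 51, 53: 52})` returns `[51]`: session 53 waits on 52, which waits on 51, so 51 is the one to look at first.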

The Bottom Line

The point of a 10-minute triage isn’t to fix everything. It’s to prove or eliminate SQL Server as the culprit.

When you walk into a war room and say, “Here’s proof it’s the storage snapshot, not SQL,” you stop the blame game cold.

At Dallas DBAs, we use this exact checklist when clients call in a panic. It’s fast, repeatable, and it keeps you from chasing the wrong rabbit.
