
DallasDBAs.com

SQL Server Database Consulting



SQL Server Upgrades: In-Place or New Server?

December 31, 2025 by Kevin3NF

This isn’t a religious debate.

 

I have a client right now debating how to handle SQL Server upgrades across all of their dev and test environments.

And it’s a good conversation, because the same logic people use for dev and test is often what sneaks into production later.

There are two paths. Both can work. Both can fail, but in very different ways.

Option 1: In-Place Upgrade

An in-place upgrade modifies the existing SQL Server instance where it sits.

Why some people choose it

  • It’s “fast”.
  • No application changes.
  • No database backup / restore.
  • No DNS, aliases, or firewall conversations.
  • Lower short-term cost.

For dev and test, that simplicity is tempting.

The risk profile

Once SQL Server upgrades its system databases, you don’t roll back with a checkbox.

If something goes wrong late in the process, your options narrow quickly:

  • restore system databases
  • restore user databases
  • or rebuild and recover

Downtime tends to be “whatever it takes,” not “whatever you planned.”

The safety net you must have

An in-place upgrade should never rely on SQL backups alone.

You need two rollback paths:

Platform-level rollback

  • VM snapshot or full image backup taken immediately before the upgrade
  • Snapshot verified
  • Snapshot removed after validation (don’t leave them lingering)

SQL-level rollback

  • Full backups of all user databases
  • Backups of master, msdb, and model
  • Certificates and encryption keys if applicable

Add a captured baseline and a written stop/go decision, or you’re flying blind.
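As a sketch, the SQL-level rollback pieces look something like this (paths and database names here are illustrative; adjust for your environment, and test your restores, not just your backups):

```sql
-- Illustrative paths and names; adjust for your environment and test first
BACKUP DATABASE master TO DISK = N'X:\PreUpgrade\master.bak' WITH INIT, CHECKSUM;
BACKUP DATABASE msdb   TO DISK = N'X:\PreUpgrade\msdb.bak'   WITH INIT, CHECKSUM;
BACKUP DATABASE model  TO DISK = N'X:\PreUpgrade\model.bak'  WITH INIT, CHECKSUM;

-- One user database shown; repeat (or script a loop) for all of them
BACKUP DATABASE YourUserDB TO DISK = N'X:\PreUpgrade\YourUserDB.bak' WITH INIT, CHECKSUM;

-- Cheap sanity check that the backup file is readable
RESTORE VERIFYONLY FROM DISK = N'X:\PreUpgrade\YourUserDB.bak';
```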

When in-place makes sense

  • Dev and test systems
  • Simple, single-instance servers
  • Environments where rebuild is acceptable
  • Tight timelines with known risk tolerance

Option 2: New Server + Migration

This means building a new server and moving data and workloads over.

Why teams choose this

  • Rollback is simple: point applications back.
  • You can test without pressure.
  • You can fix long-standing issues instead of inheriting them.
  • OS, SQL, and configuration changes are cleaner.

This approach favors control over speed.

The tradeoffs

You take on more coordination:

  • logins and permissions
  • SQL Agent jobs
  • linked servers and credentials
  • application connection routing
  • easily missed settings and rarely used features

And data movement takes time – there’s no way around physics, but there are ways to mitigate this for short cutover windows.

Failures tend to be quieter and easier to reverse.

How to decide

If speed and simplicity matter most, in-place is attractive.
If rollback certainty and testing matter more, migration wins.

That’s why dev and test often lean towards in-place. Production should pause before copying that decision.

The first 24 hours matter

No matter which path you take:

  • watch agent jobs
  • watch error logs
  • confirm backups are running
  • validate real business workflows
  • compare performance to your baseline

Most upgrade failures don’t happen during setup.
They happen after users return.
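If you want a quick way to spot failed Agent jobs during that first-24-hours window, something like this works (test before you run; `agent_datetime` is an undocumented msdb helper function):

```sql
-- Agent job steps that failed in the last 24 hours
SELECT j.name AS JobName, h.step_name, h.run_date, h.run_time, h.message
FROM msdb.dbo.sysjobhistory h
JOIN msdb.dbo.sysjobs j ON j.job_id = h.job_id
WHERE h.run_status = 0  -- 0 = Failed
  AND msdb.dbo.agent_datetime(h.run_date, h.run_time) > DATEADD(DAY, -1, GETDATE())
ORDER BY h.run_date DESC, h.run_time DESC;
```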

The Bottom Line

If you don’t have a tested rollback plan, you don’t have an upgrade strategy.

Dev and test decisions shape production thinking, whether you intend them to or not.

Choose based on risk, not convenience.

 

Don’t let SQL Server make you miss your kid’s piano recital!
First month free for new clients

Pocket DBA®

Filed Under: SQL, Upgrade Tagged With: syndicated

SQL Server Reliability: Clearing the Fuzzy Thinking

November 19, 2025 by Kevin3NF

The terminology around reliability is a mess


If you’ve ever said, “We’re covered, it’s replicated,” you’re in good company.

 

SQL Server is a massive, 35+ year-old product that has changed, renamed, and re-imagined its features so many times that even seasoned IT pros get tripped up. Between log shipping, mirroring, replication, snapshots, Availability Groups, clustering, and VM backups, it’s no wonder managers end up with fuzzy expectations about what keeps their data safe.

This isn’t a failure of leadership; it’s the side effect of a product that’s both deep and deceptively familiar. You’ve got “copies,” “replicas,” and “backups” everywhere, but they don’t all mean the same thing.

Let’s clear the fog.

Replication Isn’t Disaster Recovery

Replication was born for scale, not survival. It’s great for reporting, distributing data to remote sites, and keeping subsets of tables up to date. But if the publisher goes down? There’s no built-in recovery process. You’ll spend more time re-initializing and troubleshooting than restoring from a backup.

In addition, the subscribers are read-write. This means someone downstream can delete a row, which stops updates from the publisher from being applied and can turn into a huge issue.

Replication moves data, not databases. It’s not a safety net; it’s a delivery truck.


Log Shipping Is DR, But Not HA

Log shipping is a solid, dependable form of disaster recovery. It keeps a warm copy ready for manual failover, which is perfect for a secondary data center or a standby VM in the cloud.

But it’s not high availability. There’s no automatic failover, no continuous read/write capability, and your secondary is always behind by at least one transaction log.

Think of log shipping as a lifeboat, not a bridge. It’ll get you safely across the storm, but only after you climb in.


Snapshots Aren’t Backups

A SQL Server database snapshot freezes a view of a database at a moment in time — handy for quick rollbacks or comparing changes. A VM snapshot, on the other hand, captures the state of a virtual machine. Neither one replaces a backup that can be restored independently to a new server.

If you are snapshotting the entire server, how does that help you restore to a point in time for one database without bringing all of them backwards? That’s a really awkward conversation to have with the CEO.

Snapshots are shortcuts, not insurance policies.
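For reference, a SQL Server database snapshot looks like this (names and paths here are made up). Note that reverting replaces the entire database:

```sql
-- Create a snapshot of the Sales database (illustrative names/paths)
CREATE DATABASE Sales_Snap_20251101
ON ( NAME = Sales_Data,  -- logical data file name of the source database
     FILENAME = N'X:\Snapshots\Sales_Snap_20251101.ss' )
AS SNAPSHOT OF Sales;

-- Reverting throws away EVERYTHING since the snapshot was taken:
-- RESTORE DATABASE Sales FROM DATABASE_SNAPSHOT = 'Sales_Snap_20251101';
```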


Availability Groups Don’t Replace Backups

Even the shiniest Always On Availability Group can’t save you from data corruption, accidental deletes, or bad deployments that replicate instantly across every node.

Availability is not durability. You still need backups — tested, verified, and stored safely elsewhere.

An AG keeps you online, not immune.


Why It Gets Fuzzy

SQL Server grew up in an enterprise world, full of legacy features that overlap and evolve. Microsoft rarely deprecates things outright, so we’re left with a toolbox that has five ways to move data and ten ways to misunderstand them.

Don’t feel bad. SQL Server’s feature naming team must have had a bet going to see how many words they could reuse. These don’t help:

  • Replication sounds like redundancy.
  • Availability sounds like recovery.
  • Snapshots sound like backups.

That’s why it takes a DBA’s mindset. Understanding and translating these terms takes intentionality and years of exposure to them.


What can you do?

  1. Inventory your protection layers. List what you’re using (replication, AGs, log shipping, backups) and define what each one really does.
  2. Map to business needs. High availability ≠ disaster recovery ≠ data protection. You probably need all three, in different forms.
  3. Test the ugly scenarios. Shut down primaries, take databases offline, pull power cords (in dev). You’ll learn fast which “solution” is just a hope.
  4. Document the real RTO/RPO. Your CFO doesn’t care if it’s log shipping or AGs — they care how long you’ll be down and how much data you’ll lose.

The Bottom Line

Fuzzy thinking is normal, but it’s dangerous when disaster strikes.
Replication moves data. Log shipping recovers data. Availability Groups keep you online.
Only backups and testing keep you safe.

When you know what’s really protecting you, you can make confident, budget-friendly choices, not guesses wrapped in acronyms.


Reach out for a free basic Health Check

We can focus on your server’s Reliability and Availability.

Contact Us


SQL TidBit:

Want to see which databases are actually protected? This quick check shows the database name, recovery model, last full backup, and last log backup times:

-- This is code from the internet...test before you run
SELECT
    d.name AS [Database],
    d.recovery_model_desc AS [RecoveryModel],
    MAX(CASE WHEN bs.type = 'D' THEN bs.backup_finish_date END) AS [LastFullBackup],
    MAX(CASE WHEN bs.type = 'L' THEN bs.backup_finish_date END) AS [LastLogBackup]
FROM sys.databases d
LEFT JOIN msdb.dbo.backupset bs
    ON bs.database_name = d.name
WHERE d.name <> 'tempdb'
GROUP BY d.name, d.recovery_model_desc
ORDER BY d.name;

Note: If a database is in FULL recovery model but has no recent log backup, your log file is probably growing and could eventually cause a disk-full outage.
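To see how close that growing log is to filling its drive, this DMV (SQL Server 2012+) reports usage for the current database:

```sql
-- Log size and usage for the database you run it in
SELECT DB_NAME() AS [Database],
       total_log_size_in_bytes / 1048576 AS LogSizeMB,
       used_log_space_in_bytes / 1048576 AS UsedLogMB,
       used_log_space_in_percent AS UsedPct
FROM sys.dm_db_log_space_usage;
```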


Link Party

  • Erin Stellato: “The Accidental DBA Day 23 of 30: SQL Server HADR Features”
  • Redgate / Simple-Talk: “SQL Server Backup and Restore for the Accidental DBA”
  • SQLShack: “Understanding Database Snapshots vs Database Backups in SQL Server”
  • Brent Ozar: “Transactional Replication doesn’t solve your HA or DR pains”
  • Microsoft Docs: “Log Shipping and Replication (SQL Server)”

Please send to your manager to help clarify the differences in SQL Server HA, DR, and distributed computing. They deserve a clear cheat sheet.

 

Filed Under: HADR Tagged With: syndicated

SQL Server Versions: Out With the Old, In With the Supported

October 29, 2025 by Kevin3NF

If your production SQL Servers are still running 2016 (or older), you’re basically banking on inertia. Sure, it’s been stable. But that doesn’t guarantee it’ll stay safe or compliant.

Microsoft shut off mainstream support for 2016 back in July 2021, and extended support ends in July 2026. Beyond that? You’re on your own for bug fixes, security updates, or emergency patches.

 

What You’re Missing

It’s easy to view upgrades as optional enhancements; in truth, staying current is about maintaining resilience. What you gain with 2019/2022 isn’t just bells and whistles. It’s reliability, defensive tools, and measurable performance.

 

Smarter Engines Under the Hood

“Better defaults” aren’t marketing fluff. With improvements to memory grants, parallelism, and hash joins, newer SQL Server versions are tuned to run your workload more efficiently out of the box.

 

Adaptive Behavior Without Rewrites

Here’s where SQL Server 2019 and 2022 quietly earn their keep. Microsoft invested heavily in the Intelligent Query Processing (IQP) stack – features that make your existing code run better without touching a line of T-SQL (most of the time).

Older versions execute queries based on a single snapshot of estimated data volume, join paths, and parameter values. If those estimates are off (and they often are), the engine makes bad choices and never looks back. The newer engines don’t do that anymore.

Adaptive joins can switch between nested loop and hash join strategies while the query runs, based on how much data actually flows through. That means fewer “query plans from hell” when parameter values swing wildly between executions.

Interleaved execution gives the optimizer a second chance – especially for multi-statement table-valued functions. Instead of assuming a generic row count of “1,” SQL Server now runs the first statement, learns the real cardinality, and uses that for the rest of the plan.

Table variable deferred compilation fixes one of the longest-standing developer pain points. Instead of guessing that a table variable has exactly one row (which breaks most real-world queries), the engine waits until the table is populated, measures it, and builds an informed plan.

And if your code uses scalar user-defined functions, SQL Server 2019+ can inline them, turning what used to be a loop into a set-based operation. That alone can turn a 5-minute report into a 5-second one.

The beauty here is that you may not need to rewrite or refactor anything. You just get smarter plans, more consistent performance, and less time spent chasing parameter sniffing ghosts. All of the above have limitations. Do your homework and proper testing.
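If you want to see which of your scalar UDFs qualify for inlining, SQL Server 2019+ exposes a flag in sys.sql_modules (the database must be at compatibility level 150 or higher):

```sql
-- Scalar UDFs and whether the engine considers them inlineable (2019+, compat 150+)
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS SchemaName,
       OBJECT_NAME(m.object_id) AS FunctionName,
       m.is_inlineable
FROM sys.sql_modules m
JOIN sys.objects o ON o.object_id = m.object_id
WHERE o.type = 'FN';  -- scalar functions only
```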

 

Faster Recovery & Safer Rollbacks

Ever had a long-running rollback or crash recovery hang your system? Accelerated Database Recovery (ADR) changes the game—making rollbacks and crash recoveries significantly faster, which is a safety net when things go sideways.
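ADR is off by default and enabled per database. A minimal sketch (substitute your database name):

```sql
-- Enable Accelerated Database Recovery (SQL Server 2019+)
ALTER DATABASE YourDB SET ACCELERATED_DATABASE_RECOVERY = ON;

-- Confirm which databases have it on
SELECT name, is_accelerated_database_recovery_on
FROM sys.databases;
```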

 

Query Store on Steroids

In older versions, you’d turn on Query Store, fiddle with settings, maybe capture plan regressions. In newer versions, it’s more mature, more integrated, and more automatic. You get insights, forced plan control, and regression protection with minimal overhead.
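Enabling Query Store is one statement; the settings below are common starting points, not gospel (defaults vary by version, and 2022 enables it for new databases):

```sql
-- Turn on Query Store with reasonable starting settings (adjust to taste)
ALTER DATABASE YourDB SET QUERY_STORE = ON
    ( OPERATION_MODE = READ_WRITE,
      MAX_STORAGE_SIZE_MB = 1024,
      QUERY_CAPTURE_MODE = AUTO );
```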

 

Security That Doesn’t Feel Optional

TDE, Always Encrypted, ledger capabilities (in 2022), granular auditing – these aren’t checkboxes anymore; they’re baseline expectations. Newer versions make it less painful to stay compliant and secure.

 

Hybrid & Cloud-Aware by Design

Backup to URL, cross-environment DR, and more. The newer SQL Server versions are built from the ground up to span on-prem, cloud, or hybrid without the constant “lift and re-architect” panic.

 

Predictability & Fewer Surprises

2019 and 2022 have been battle-tested at this point. Most of the early-stage regressions, bugs, or inconsistent behaviors have been discovered and addressed in the Cumulative Updates. That predictability is worth its weight in gold when you’re managing risk.

 

The Bottom Line

Stable isn’t the same as safe. SQL Server 2016 had a great run, but it’s time to let it retire gracefully.
Plan your move to 2019 or 2022 this quarter. Skip 2025 for now. You’ll sleep better.

———————————————————–

Need Migration Help?

Reach out to Dallas DBAs with code “Newsletter”

Contact Us

———————————————————–

Thanks for reading!

Filed Under: Migration, SQL, Upgrade Tagged With: syndicated

SQL Server Migration Overview

October 22, 2025 by Kevin3NF

It’s Not Just Backup / Restore

At some point every company faces it: the SQL Server that’s been quietly running for years is due for retirement. Maybe the hardware is old, the lease is ending, or your CIO wants to move workloads to the cloud.

From the outside, a SQL Server “migration” can sound like a simple lift-and-shift. Just copy the databases over, right? The reality is closer to moving offices. You don’t just grab every box and throw it into a new building. You measure the space, update the wiring, decide what gets upgraded, and make sure everyone can find their desk again on Monday.

The Big Picture

 

Predict & Provision

The new environment needs to handle both today’s workload and tomorrow’s growth. Simply matching your old CPU, RAM, and storage can be a mistake if your business has grown since the last server was purchased. In the cloud, it’s even more important to right-size. Too small and you’ll choke performance, too large and you’ll bleed money. Planning capacity up front avoids both. For cloud VMs, provision low during testing and bump up the size as needed.

Install & Configure

SQL Server isn’t plug-and-play. A fresh installation with updated patches and best-practice settings sets the stage for stability. This is where you decide things like where to place (and separate) data and log files, how many tempdb files to allocate, and which default settings to avoid. A solid foundation here can prevent countless problems later.

Tune the Source Before the Move

One of the biggest mistakes in any migration is bringing old baggage (technical debt) into a new system. Giant log files, bloated indexes, and unnecessary jobs can cause just as much trouble on shiny new hardware as they did before. Tuning the source first is similar to decluttering your house before moving – you start fresh without dragging the junk along. Or cleaning the bathroom before your housekeeper shows up 😉

Move the Data

Databases aren’t the only things that need to come across. Logins, SQL Agent jobs, linked server definitions, and security settings are just as important. If you miss these, users may not be able to connect, backups may not run, or nightly jobs could fail. Successful migrations treat this as a holistic move, not just a database restore. There are multiple approaches to this, depending on your data size and cutover window.
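A quick inventory of those server-level objects is a good first step. A sketch, assuming you just want names to compare against the target:

```sql
-- SQL and Windows logins (excluding internal ## accounts)
SELECT name, type_desc FROM sys.server_principals
WHERE type IN ('S', 'U', 'G') AND name NOT LIKE '##%';

-- Agent jobs
SELECT name, enabled FROM msdb.dbo.sysjobs;

-- Linked servers
SELECT name, product, data_source FROM sys.servers WHERE is_linked = 1;
```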

Test, Test, Test

Once the new server is up, applications need to prove they can connect and perform. Something as small as a changed network name or a forgotten firewall rule can cause chaos. Testing gives you a safe window to discover what doesn’t carry over cleanly. It’s also a chance to capture new performance baselines so you can measure improvement.

Final Cutover

The actual “move day” should be planned, short, and closely monitored. Typically this means scheduling downtime, running one last backup and restore, and redirecting applications or DNS. The next 48 hours are critical: you’re confirming not only that the server is online, but that backups succeed, jobs run, and performance holds steady. With good prep, the cutover feels more like flipping a switch than rolling the dice.

For large databases in the TB+ range, a full backup/restore during the week with only a Differential needed on cutover day can reduce the amount of time dramatically.
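The restore sequence for that approach looks roughly like this (database and file names are illustrative):

```sql
-- Days before cutover: restore the big full backup, leave it restoring
RESTORE DATABASE BigDB
FROM DISK = N'X:\Migration\BigDB_full.bak'
WITH NORECOVERY;

-- Cutover day: take a differential on the source, copy it, then apply it
RESTORE DATABASE BigDB
FROM DISK = N'X:\Migration\BigDB_diff.bak'
WITH RECOVERY;  -- brings the database online on the new server
```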

Why Preparation Matters

Here’s the piece many companies miss: migrations are an opportunity to fix what wasn’t working. If you had jobs that failed silently, indexes that were never used, or security shortcuts, they’ll follow you into the new system unless you address them first. Treating the migration as a reset, or a chance to leave bad habits behind, means the business not only gets a new server, but a more reliable platform for the future.

The Bottom Line

A SQL Server migration is less about moving bits and more about moving confidence. With the right planning, you don’t just get a new server – you get a healthier, more reliable foundation for your business applications.

 

Free Disaster Readiness Quiz

I’ll trade you an email address for an honest assessment

DR Quiz – Are you ready?

 

Thanks for reading!

Filed Under: Migration Tagged With: syndicated

SQL Server Alerts

October 15, 2025 by Kevin3NF

Don’t Let Trouble Sneak Up on You

 

Most SQL Servers run quietly. Until they don’t. By the time someone notices an application outage or a failed backup, you’re already behind. That’s why SQL Server’s built-in alerts exist – they give you an early warning before small problems become major outages.

There are a bunch of great 3rd party tools and community scripts available, but not every firm is going to make that investment or allow open-source code on their servers.

SQL Server Alerts are Microsoft-supported, built into the product, and rely on Database Mail for notifications. Configure them once, and you’ll have a safety net that runs 24/7. But like smoke detectors, too many false alarms and you’ll start ignoring them.

 

Step 1: Create an Operator

An operator is just the person (or distribution list) that gets notified.

In SSMS:

  • SQL Server Agent >> Operators >> (right-click) New Operator
  • Fill in a name and email address (use a group if possible). 100 character limit.

 

T-SQL Example:

USE msdb;

EXEC msdb.dbo.sp_add_operator
    @name = N'DBA On Call',
    @enabled = 1,
    @email_address = N'[email protected]';

 

 

Step 2: Define the Alert

Alerts can fire on:

  • Specific errors (e.g., error 823 = disk I/O issue)
  • Severity levels (e.g., all severity 20+ errors)
  • Performance conditions or WMI Events

In SSMS:
SQL Server Agent >> Alerts  >> (right-click) New Alert  >> choose type and scope.

 

T-SQL Example:

USE msdb;

EXEC msdb.dbo.sp_add_alert
    @name = N'Error 823 Alert',
    @message_id = 823,
    @severity = 0,
    @enabled = 1,
    @delay_between_responses = 300, -- 5 minutes
    @include_event_description_in = 1,
    @notification_message = N'Disk I/O error (823) detected!';

 

Step 3: Tie It Together

Link the alert to the operator so someone actually gets notified.

In SSMS:
Open the alert >> Response >> “Notify Operators”

 

T-SQL Example:

EXEC msdb.dbo.sp_add_notification
    @alert_name = N'Error 823 Alert',
    @operator_name = N'DBA On Call',
    @notification_method = 1; -- Email

 

 

Step 4: Enable the Mail Profile

Emails won’t get sent without this often-overlooked step.

In SSMS:

SQL Server Agent >> (right-click) >> Properties >> Alert System >> Check “Enable Mail Profile” and pick a profile from the drop-down. This requires Database Mail to be configured and working.

 

Step 5: Cut the Noise

Not every warning deserves an email at 3 a.m. Start with the essentials:

  • Possible Corruption (823, 824, 825)
  • Critical Job failures (Agent jobs)
  • Severity 19+ errors (fatal errors, serious resource issues)
    • Severity 20 may give false positives if you are using vulnerability testing software
  • HADR role changes for unexpected AG failovers

 

Then, test and adjust. If the alerts are noisy, you won’t trust them when it matters.
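The severity alerts above lend themselves to a loop. A sketch, assuming the “DBA On Call” operator from Step 1 exists (test before you run):

```sql
USE msdb;

-- Create one alert per severity 19-25 and notify the operator from Step 1
DECLARE @sev int = 19;
WHILE @sev <= 25
BEGIN
    DECLARE @alert sysname = N'Severity ' + CAST(@sev AS nvarchar(2)) + N' Alert';

    EXEC msdb.dbo.sp_add_alert
        @name = @alert,
        @severity = @sev,
        @enabled = 1,
        @delay_between_responses = 300;

    EXEC msdb.dbo.sp_add_notification
        @alert_name = @alert,
        @operator_name = N'DBA On Call',
        @notification_method = 1;  -- Email

    SET @sev += 1;
END;
```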

 

The Bottom Line

Setting up alerts in SQL Server is one of the easiest wins for DBAs. They’re built in, supported by Microsoft, and once tied to Database Mail and operators, they can catch serious issues before your phone rings. Just be selective: too much noise and the real signals get lost, or emails get “ruled” into a folder rather than acted on.

 


Free Disaster Readiness Quiz

I’ll trade you an email address for an honest assessment

DR Quiz – Are you ready?


 

Thanks for reading!

 

— Kevin

Filed Under: SQL, Troubleshooting Tagged With: syndicated

“SQL Server Is Slow” Part 4 of 4

October 7, 2025 by Kevin3NF

Parts 1, 2 and 3 got you to the (SQL) engine room. Now we use community-trusted tools to find what’s going on, fix it safely, and hopefully keep it from coming back.

This post will lean heavily on the First Responder Kit from Brent Ozar, sp_WhoIsActive from Adam Machanic and others. They took what Microsoft provides and made them better. Hundreds or thousands of hours of work to make FREE things for you to use.

This is the most complex blog I’ve ever written. Your experiences may differ, and my code samples might have some bugs. Test first.

The Fork in the Road

After ruling out all the previous items from parts 1, 2, and 3, you’ll probably land in one of two branches:

  • Branch A: Obvious & Fixable Now – A misconfiguration, a runaway query, or an ugly blocking chain.
  • Branch B: Systemic & Chronic – Indexing issues, bad query plan, stats/CE/database compat changes, or a new workload exposing weaknesses.

 

First Pulse (rank evidence before touching anything)

Run these back-to-back to see “right now” and “what changed”:

-- 60s wait/CPU/IO snapshot with expert details
EXEC sp_BlitzFirst @Seconds = 60, @ExpertMode = 1;

-- Active requests + blockers + plan handles (save to a table if you’ll compare)
EXEC sp_WhoIsActive
     @get_plans = 1,
     @get_additional_info = 1,
     @find_block_leaders = 1;

 

If the server feels resource constrained (CPU/memory/tempdb/log), confirm with Erik Darling’s:

EXEC sp_PressureDetector;

 

Check Query Store (if enabled) for regressed queries over the last few hours. This is database level. Database Z that nobody thinks about could be sitting in the corner making life miserable for Database A.
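If Query Store is on, a query like this surfaces the heaviest statements over a recent window (run it per database; CPU times in these views are in microseconds, so verify against your version):

```sql
-- Top CPU consumers from Query Store, last 4 hours (run in each database)
SELECT TOP (20)
       q.query_id,
       qt.query_sql_text,
       SUM(rs.count_executions) AS executions,
       SUM(rs.avg_cpu_time * rs.count_executions) AS total_cpu_us
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan p        ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval i
     ON i.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE i.start_time > DATEADD(HOUR, -4, SYSDATETIMEOFFSET())
GROUP BY q.query_id, qt.query_sql_text
ORDER BY total_cpu_us DESC;
```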

 

Branch A — Obvious & Fixable Now (surgical)

 

A1) Confirm the bad actor (copy/paste or screenshot what you find)

  • sp_WhoIsActive: highest CPU/reads/duration, blocking leaders, tempdb usage.
    • You may get lucky and see an obvious long running query blocking everything, or hogging CPU
  • sp_BlitzFirst (0s mode) for a quick wait profile on demand:
    • EXEC sp_BlitzFirst @Seconds = 0, @ExpertMode = 1;

 

A2) Get permission & document the plan

  • Who’s affected, what’s the risk, what’s the rollback.

 

A3) Take the action (one change at a time)

  • Kill a true runaway query (after sign-off):
    • KILL <session_id>;
  • Fix low-hanging config already validated in earlier parts: MAXDOP, Cost Threshold, auto-close/auto-shrink OFF, compatibility/CE sanity.
  • Remove a bad plan:

-- Remove the specific plan from the cache (sample)

DBCC FREEPROCCACHE (0x060006001ECA270EC0215D05000000000000000000000000);
  • Update some stats
UPDATE STATISTICS dbo.TableName WITH FULLSCAN;  -- or sampled if huge
    • This will purge some plans from the cache, but new plans will have great stats on at least one table.

A4) Re-measure immediately

EXEC sp_BlitzFirst @Seconds = 30, @ExpertMode = 1;

 

Goal: waits normalize, blockers drop, and users confirm latency relief.

Branch B — Systemic & Chronic (rank → fix → re-measure)

 

B1) Rank the biggest fish

EXEC sp_BlitzFirst  @Seconds = 0, @ExpertMode = 1;   -- overall pain profile
EXEC sp_BlitzCache  @SortOrder = 'cpu';   -- try 'reads', 'avg cpu', 'memory grant', 'avg duration', etc.
EXEC sp_PressureDetector;                -- resource pressure confirmation

Use Query Store timelines to spot regressions and plan churn.

 

B2) Indexing reality check

EXEC sp_BlitzIndex @DatabaseName = 'YourDB', @Mode = 0;  -- database-wide health

Cross-check with native DMVs when you need specifics:

Top impact missing indexes (advisory; validate!)

EXEC sp_BlitzIndex @Mode = 3, @GetAllDatabases = 1, @BringThePain = 1;  -- missing index requests, all databases

Unused/rarely used indexes (drop/consolidate after monitoring)

-- Unused Index Script
-- Original Author: Pinal Dave
-- Edit the last Where clause to suit your situation

Create Table #Unused(
                [Database] varchar(1000),
                [Table] varchar (500),
                [IndexName] varchar(500),
                [IndexID] bigint,
                [UserSeek] bigint,
                [UserScans] bigint,
                [UserLookups] bigint,
                [UserUpdates] bigint,
                [TableRows] bigint,
                [Impact] bigint,
                [DropStatement] varchar(1500)
                )

exec sp_MSforeachdb

'use [?]
Insert #Unused

SELECT
                Db_Name(DB_ID()) as [Database]
                ,o.name AS [Table]
                , i.name AS IndexName
                , i.index_id AS IndexID
                , dm_ius.user_seeks AS UserSeek
                , dm_ius.user_scans AS UserScans
                , dm_ius.user_lookups AS UserLookups
                , dm_ius.user_updates AS UserUpdates
                , p.TableRows
                , dm_ius.user_updates * p.TableRows
                , ''DROP INDEX '' + QUOTENAME(i.name)
                + '' on '' + QUOTENAME(Db_Name(DB_ID())) + ''.''
                + QUOTENAME(s.name) + ''.''
                + QUOTENAME(OBJECT_NAME(dm_ius.OBJECT_ID)) AS ''drop statement''
FROM sys.dm_db_index_usage_stats dm_ius
                INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id
                                AND dm_ius.OBJECT_ID = i.OBJECT_ID
                INNER JOIN sys.objects o ON dm_ius.OBJECT_ID = o.OBJECT_ID
                INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
                INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.OBJECT_ID
                                                                FROM sys.partitions p
                                                                GROUP BY p.index_id, p.OBJECT_ID) p
                                ON p.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = p.OBJECT_ID
WHERE
                OBJECTPROPERTY(dm_ius.OBJECT_ID,''IsUserTable'') = 1
                AND dm_ius.database_id = DB_ID()
                AND i.type_desc = ''nonclustered''
                AND i.is_primary_key = 0
                --AND i.is_unique_constraint = 0
                --AND o.name in (''CloverSummary'')
ORDER BY
                (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
--GO
'
Select *
from #unused
Where 1=1
                --and [IndexName] like '%_DDBA'
                --and [IndexName] IN ('')
                --and [database] Not in ('MSDB','tempdb')
                --and [database] in ('StackOverflow')
                --and UserSeek + UserScans + UserLookups < 1
                --and [Table] in ('')
Order By [Database] asc, UserSeek + userscans + UserLookups, impact desc

Drop table #Unused

 

B3) Bad Queries (identify → inspect plan → fix → handoff)

  • When is a query “bad”? Probably if it ranks in your top 10 by total CPU/reads in the target window, has high avg cost and high executions, triggers spills/memory grants, blocks, or just regressed in Query Store.
    • Using the SortOrder parameter, also check for ‘memory grants’ and ‘avg duration’
  • Find them fast:
EXEC sp_BlitzCache @SortOrder = 'cpu';           -- or 'reads', 'avg cpu', 'avg duration', etc.

EXEC sp_WhoIsActive @get_plans = 1
  • Classic face-palm: Key Lookup from “one more column”
    • What happens: a nonclustered index is used for a seek but doesn’t cover a new column added to SELECT, so the engine performs one lookup per row. This shows up in the Query Plan as a ‘Key Lookup’ or ‘RID Lookup’ operator.
    • Why it hurts: on hot paths, that lookup loop can hammer CPU/IO and tank a busy server.
    • Fix choices: INCLUDE the missing column(s) on the used index, reshape keys to cover, or trim the SELECT if it isn’t needed.

 

Quick hunt for plans with lookups:

SELECT TOP 50
       DB_NAME(qp.dbid) AS dbname,
       (qs.total_logical_reads*1.0/NULLIF(qs.execution_count,0)) AS avg_reads,
       qs.execution_count, qt.text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE qp.query_plan.exist('//RelOp[@LogicalOp="Key Lookup" or @LogicalOp="RID Lookup"]') = 1
ORDER BY avg_reads DESC;

 

Typical covering fix (example):

CREATE NONCLUSTERED INDEX IX_Sales_OrderDate_CustomerID
  ON dbo.Sales (OrderDate, CustomerID)
  INCLUDE (TotalDue);   -- add the “new SELECT column” here
  • Other common suspects: implicit conversions on join/filter columns, parameter sniffing, spills from oversized sorts/hashes, scalar UDFs/multi-stmt TVFs, RBAR triggers.
  • Developer handoff package (keep it short and useful):
    • Evidence: normalized query text + sample parameters, actual plan (.sqlplan or XML), metrics window (total/avg CPU/reads/duration/executions), warnings (spills, lookups, implicit conversions).
    • Hypothesis & options: e.g., new column caused Key Lookup on PK_MyTable_ID → options: covering INCLUDE, index reshape/filtered index, query/ORM change, or plan stability tactic.
    • Safety: size/impact estimate, rollback (drop index/unforce plan), and success criteria (Query Store deltas, BlitzFirst/Cache snapshots).

 

B4) Plan stability

  • Verify statistics are being updated “often enough” and at the right sample size
  • Look for plan cache bloat and churn
  • Parameter sniffing fixes: targeted OPTION (RECOMPILE), OPTIMIZE FOR cases, or Query Store forced plan (monitor; keep rollback).
  • Memory grants/spills: better indexes (narrow keys + good includes), stats refresh, and watch row-goal operators.

 

B5) Stats/CE/compat sanity

  • Ensure AUTO_UPDATE_STATISTICS (and consider ASYNC) fit the workload.
  • Recent compat level or CE changes? Compare before/after in Query Store.

 

B6) Parallelism & CPU policy

  • Validate MAXDOP and Cost Threshold against core count + workload mix.
  • Use BlitzCache to spot skewed exchanges or thrashy parallel plans.

 

B7) Tempdb & log health

  • sp_PressureDetector will flag contention/pressure; confirm with file IO stats:
SELECT DB_NAME(vfs.database_id) AS dbname, mf.type_desc, mf.physical_name,
       vfs.num_of_reads, vfs.num_of_writes,
       vfs.io_stall_read_ms, vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL,NULL) vfs
JOIN sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
ORDER BY (vfs.io_stall_read_ms + vfs.io_stall_write_ms) DESC;
  • Right-size VLFs (all active databases), ensure multiple equally sized tempdb data files, and watch version store if RCSI/SI is on.

 

B8) New workload exposure

  • Correlate Query Store/BlitzFirst windows with deploys, ORM queries, reporting jobs, ETL shifts, or seasonal peaks. Fix that pattern first (indexing, parameterization, caching, schedule changes).

 

Change Control & Safety Net

  • One change at a time → measure → document → keep/rollback.
  • Save “before/after” artifacts: sp_WhoIsActive snapshots (to table), BlitzFirst output, BlitzCache exports, Query Store screenshots/DMVs.
  • Always include who/when/why and expected KPI movement.
  • Assume someone will want a post-mortem. I wrote one this morning for a client outage.

 

What “Good” Looks Like

  • Waits shift away from bottleneck classes to benign background.
  • Top 10 statements show reduced CPU/reads; fewer/shorter blocking chains.
  • Tempdb/log growth stabilize; fewer off-hours alerts.
  • Users say “fast enough,” matched to your baseline and SLAs.

 

Keep It from Coming Back

  • Maintain baselines (perf counters, waits, file IO, Query Store top queries).
  • Align index & stats maintenance to your workload.
  • Add deploy gates for schema/index/compat changes with pre/post metrics.
  • Keep lightweight Extended Event sessions for spills, long-running, parameter-sensitive queries.
  • Review Query Store regressions and any forced plan safety periodically.

 

The Bottom Line:

Slow SQL isn’t mysterious; it’s measurable. Rank the pain, fix the biggest offender or the pattern behind it, and prove the result with before/after metrics. Keep notes as you go, and be methodical.

Above all…be calm. Everyone else can panic while you get to be the hero.


First Month of Pocket DBA® Free!

Pocket DBA


— Thanks for reading!

Filed Under: Indexing Strategies, Performance Tuning, SQL, Troubleshooting Tagged With: syndicated

