T-SQL Tuesday: Code You Would Hate To Live Without

 

T-SQL Tuesday is a monthly blog party for the SQL Server community. It is the brainchild of Adam Machanic (b|t) and this month’s edition is hosted by Bert Wagner (b|t) who has asked us to “write about code you’ve written that you would hate to live without.”

Interesting.  I go into my toolbox of code/scripts every day.  All day long.  Oddly, most of the things I’ve written are simple things I don’t want to re-type, or are client specific (some going back to 2006 for a custom log shipping troubleshooting process…).  A great many of them are written by others…most often in my case I use things from Adam, Brent and Pinal.  Some are customized, most not.

Recently, I’ve been getting a lot of performance tuning work, much of which is basically “things are slow…can you fix them?” type of requests.  Most experienced DBAs know that there are a few gazillion factors that can lead to this request, and I won’t re-hash them here.

Let’s assume that we’ve optimized or eliminated the server, disks, network, etc., and we’re now looking at SQL code and everyone’s favorite – indexes.

I have two scripts I use that give me a quick overview of the type of work SQL Server is doing.   These complement each other, and are used AS A STARTING POINT to locate the low-hanging fruit that can be causing excessive work for the disks and memory of the server.

First off – let’s look at any HEAPs and Clustered indexes that are getting full scans (this is not perfect info…just a starting point, remember):

-- base query provided by Jonathan Fite:
-- https://dba.stackexchange.com/questions/210341/number-of-full-table-scans
-- Added sys.Partitions for rowcount, Impact computed column, Where conditions and Sorting
-- Kevin3NF, https://dallasdbas.com

exec sp_MSforeachdb
'use [?]

--for just one database, choose the database context and simply highlight and run the below query
SELECT
	db_name() AS DatabaseName
	, S.name AS SchemaName
    , O.name AS ObjectName
    , I.name AS IndexName
    --, I.type AS IndexType
    , I.type_desc AS IndexType_Desc
    , IUS.last_user_scan
	, IUS.user_scans
	, p.rows
	, IUS.user_scans * p.rows as [Impact]
FROM sys.objects AS O 
    INNER JOIN sys.schemas AS S ON S.schema_id = O.schema_id
    INNER JOIN sys.indexes AS I ON I.object_id = O.object_id
    INNER JOIN sys.dm_db_index_usage_stats AS IUS ON IUS.index_id = I.index_id AND IUS.object_id = I.object_id AND IUS.database_id = DB_ID() --usage stats DMV is instance-wide, so limit it to the current DB
	INNER JOIN sys.partitions p on p.object_id = I.object_id and p.index_id = I.index_id
WHERE 1=1 
	and I.type IN (0,1) --0=HEAP, 1=CLUSTERED (table)
	and IUS.user_scans > 5
	and p.rows > 1000
Order by 
	--IUS.user_scans desc
	--p.Rows Desc
	IUS.user_scans * p.rows DESC
'

When I run this in its entirety, I get a section of results for every database, for any scans returned by sys.dm_db_index_usage_stats. I can quickly scroll up and down that result set and eyeball any massive numbers for the database(s) in question, or all of them for the server as a whole.  On a sample server I found two databases with crazy numbers, so I copied the results for those two into Excel and sorted by the Impact column (which is just user_scans * rows):

Results worth Investigating

As you can see, DatabaseB has a 27-million-row table that gets scanned…a lot.  I know some of this is a daily copy from one DB to another…but not that many, so I need to go find out what else is causing these.  While I’ve been writing this, it has been scanned 2,000 more times.  On DatabaseA, I have 3 tables I want to look into…2 of them due to the sheer size of the tables and one because it is hit so frequently.  In the SQL query results, the last_user_scan column shows the current date and time for all of these…copying to Excel gave me only the date, not the time, but that is just a formatting issue.
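If you’d rather handle that on the SQL side instead of reformatting the Excel cells, one quick option (just a sketch, not part of the script above) is to convert the datetime to text before copying the results out:

-- Sketch: style 120 gives yyyy-mm-dd hh:mi:ss, so the time survives a copy/paste into Excel.
SELECT CONVERT(varchar(19), IUS.last_user_scan, 120) AS last_user_scan_text
	, IUS.user_scans
FROM sys.dm_db_index_usage_stats AS IUS
WHERE IUS.database_id = DB_ID()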

Note that you can adjust the WHERE clause to change the results to suit your needs – for example:
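Here is a drop-in variation of that WHERE clause.  The thresholds are made up for illustration, so pick numbers that make sense for your environment:

--Illustrative drop-in replacement for the WHERE clause in the script above (made-up thresholds):
WHERE 1=1
	and I.type IN (0,1)										--0=HEAP, 1=CLUSTERED (table)
	and IUS.user_scans > 100								--made-up threshold: only the busiest scanners
	and p.rows > 100000										--made-up threshold: only tables big enough to matter
	and IUS.last_user_scan > DATEADD(DAY, -7, GETDATE())	--made-up threshold: only recent activity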

 

My next step in this process is to run a query I based off of Pinal Dave’s (b|t) Missing Index query, which simply gets the Top 5 Missing Indexes for each database, ordered by Impact:

-- Get the top 5 missing indexes for each database on an instance
-- and output to a format that can be read as-is, or pasted to Excel for filtering, etc.

-- Thanks to Pinal Dave at www.sqlauthority.com for providing the base query this is built on.
-- Pinal is a good guy and a great teacher...you should hire him for your performance issues
-- or me, lol

-- Test in your Dev environment before trying this in Prod.  
-- I make no guarantees on performance or SQL version compatibility

-- Filtering out various things is done in the WHERE clause (impact, last_seek, etc.)


-- Create a temp table to hold the results:
CREATE TABLE [dbo].[#MI](
	[dbname] [nvarchar](128) NULL,
	[object_id] [nvarchar](255) NOT NULL,
	[improvement_measure] [bigint] NULL,
	[create_index_statement] [nvarchar](4000) NULL,
	[group_handle] [bigint] NOT NULL,
	[unique_compiles] [bigint] NOT NULL,
	[user_seeks] [bigint] NOT NULL,
	[user_scans] [bigint] NOT NULL,
	[last_user_seek] [datetime] NULL,
	[last_user_scan] [datetime] NULL,
	[avg_total_user_cost] [float] NULL,
	[avg_user_impact] [float] NULL,
	[system_seeks] [bigint] NOT NULL,
	[system_scans] [bigint] NOT NULL,
	[last_system_seek] [datetime] NULL,
	[last_system_scan] [datetime] NULL,
	[avg_total_system_cost] [float] NULL,
	[avg_system_impact] [float] NULL,
	[database_id] [int] NOT NULL	
) 

-- Run through each db on the instance and record results to the temp table
-- yes, sp_MSforeachdb is both undocumented and unsupported, but for this it works just fine

exec master.sys.sp_MSforeachdb
'  use [?];
INSERT #MI
SELECT top 5
	db_name(mid.database_id) as [dbname],
	Object_name(mid.[object_id]),
	Cast(((migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans)))as INT) AS [improvement_measure],
	''CREATE INDEX [missing_index_'' + CONVERT (varchar, mig.index_group_handle) + ''_'' + CONVERT (varchar, mid.index_handle)
	+ ''_'' + LEFT (PARSENAME(mid.statement, 1), 32) + '']''
	+ '' ON '' + mid.statement
	+ '' ('' + ISNULL (mid.equality_columns,'''')
	+ CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN '','' ELSE '''' END
	+ ISNULL (mid.inequality_columns, '''')
	+ '')''
	+ ISNULL ('' INCLUDE ('' + mid.included_columns + '')'', '''') AS create_index_statement,
	migs.*, 
	mid.database_id
FROM sys.dm_db_missing_index_groups mig
	INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
	INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE 1=1
	--and migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 40000 --uncomment to filter by improvement measure
	--and migs.user_seeks > 250									-- change this for activity level of the index, leave commented out to get back everything
	and mid.database_id > 4										--Skip system databases
	and db_name(mid.database_id) = ''?''						--Get top 5 for only the current database
ORDER BY 
	migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC
'

--now that the table is populated, show me the data!
--no data for a DB means no index recommendations.

Select
	[dbname] 
	,[object_id] 
	,[Improvement_Measure]
	,[create_index_statement]
	,[user_seeks]
	,[user_scans]
	,[last_user_seek]
	,[last_user_scan]
	,[avg_total_user_cost]
	,[avg_user_impact]
From #MI
Where 1=1
Order by [dbname],[improvement_measure] desc

--Clean up your mess, you weren't raised in a barn!
Drop Table #MI

Again, use the WHERE clause(s) to filter as you see fit.  In my case I see only one of the tables from the FULL SCAN query here, so I already know I have a good shot at finding a valid place to start looking:
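To give a concrete (and purely illustrative) example, these are the kinds of filters I might drop into that final SELECT against #MI before the cleanup step – the thresholds are made up, so adjust them to your own workload:

--Illustrative filters for the final SELECT against #MI (made-up thresholds):
Select
	[dbname]
	,[object_id]
	,[improvement_measure]
	,[create_index_statement]
	,[user_seeks]
	,[last_user_seek]
From #MI
Where 1=1
	and [improvement_measure] > 40000					--made-up threshold: only the bigger wins
	and [last_user_seek] > DATEADD(DAY, -7, GETDATE())	--made-up threshold: requested by the optimizer recently
Order by [dbname],[improvement_measure] desc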

I use the Missing Index script all the time, in a variety of ways.  I also check unused indexes to see if newly created ones are actually in play and being used by the Query Optimizer; a rough sketch of that kind of check is below.
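A quick sketch of the unused index idea – not a polished or complete script, just the shape of the check against sys.dm_db_index_usage_stats for the current database.  The counters reset when SQL Server restarts, so never drop anything based on this alone:

-- Sketch: nonclustered indexes with no reads since the last restart.
-- user_updates shows the write/maintenance cost you are still paying for them.
SELECT
	OBJECT_NAME(I.object_id) AS TableName
	, I.name AS IndexName
	, ISNULL(IUS.user_seeks, 0) AS user_seeks
	, ISNULL(IUS.user_scans, 0) AS user_scans
	, ISNULL(IUS.user_lookups, 0) AS user_lookups
	, ISNULL(IUS.user_updates, 0) AS user_updates	--writes still paying the maintenance cost
FROM sys.indexes AS I
	LEFT JOIN sys.dm_db_index_usage_stats AS IUS
		ON IUS.object_id = I.object_id
		AND IUS.index_id = I.index_id
		AND IUS.database_id = DB_ID()
WHERE I.type = 2									--nonclustered only
	AND I.is_primary_key = 0
	AND I.is_unique_constraint = 0
	AND OBJECTPROPERTY(I.object_id, 'IsUserTable') = 1
	AND ISNULL(IUS.user_seeks, 0) + ISNULL(IUS.user_scans, 0) + ISNULL(IUS.user_lookups, 0) = 0
ORDER BY ISNULL(IUS.user_updates, 0) DESC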

And that’s all I have to say about that…except: USE THESE AT YOUR OWN RISK AND LEARN WHAT THEY DO FIRST! 🙂

Thanks for reading!

Kevin
