T-SQL Tuesday is a monthly blog party for the SQL Server community. It is the brainchild of Adam Machanic (b|t) and this month’s edition is hosted by Bert Wagner (b|t) who has asked us to “write about code you’ve written that you would hate to live without.”
Interesting. I go into my toolbox of code/scripts every day. All day long. Oddly, most of the things I’ve written are simple things I don’t want to re-type, or are client specific (some going back to 2006 for a custom log shipping troubleshooting process…). A great many of them are written by others…most often in my case I use things from Adam, Brent and Pinal. Some are customized, most not.
Recently, I’ve been getting a lot of performance tuning work, much of which is basically “things are slow…can you fix them?” type of requests. Most experienced DBAs know that there a few gazillion factors that can lead to this request, and I won’t re-hash them here.
Lets assume, that we’ve optimzed or eliminated the server, disks, network, etc. and were now looking at SQL code and everyone’s favorite – indexes.
I have two scripts I use that give me a quick overview of the type of work SQL Server is doing. These complement each other, and are used AS A STARTING POINT to locate the low-hanging fruit that can be causing excessive work for the disks and memory of the server.
First off – lets look at any HEAPS and Clustered indexes are getting full scans (this is not perfect info…just a starting point remember):
-- base query provided by Jonathan Fite: -- https://dba.stackexchange.com/questions/210341/number-of-full-table-scans -- Added sys.Partitions for rowcount, Imapct computed column, Where conditions and Sorting -- Kevin3NF, https://dallasdbas.com exec sp_MSforeachdb 'use [?] --for just one database, choose the database context and simply highlight andrun the below query SELECT db_name() , S.name AS SchemaName , O.name AS ObjectName , I.name AS IndexName --, I.type AS IndexType , I.type_desc AS IndexType_Desc , IUS.last_user_scan , IUS.user_scans , p.rows , IUS.user_scans * p.rows as [Impact] FROM sys.objects AS O INNER JOIN sys.schemas AS S ON S.schema_id = O.schema_id INNER JOIN sys.indexes AS I ON I.object_id = O.object_id INNER JOIN sys.dm_db_index_usage_stats AS IUS ON IUS.index_id = I.index_id AND IUS.object_id = I.object_id JOIN sys.partitions p on p.object_id = I.object_id and p.index_id = I.index_id WHERE 1=1 and I.type IN (0,1) --0=HEAP, 1=CLUSTERED (table) and IUS.user_scans > 5 and p.rows > 1000 Order by --IUS.user_scans desc --p.Rows Desc IUS.user_scans * p.rows DESC '
When I run this in its entirety, I get a section of results for every database, for any scans returned by sys.dm_db_index_usage_stats. I can quickly scroll up and down that result set and eyeball any massive numbers for the database(s) in question, or all of them for the server as a whole. In a sample server I have I found 2 databases with crazy numbers, so I copied the results for those two into Excel and sorted by the Impact column (which is just user_scans * rows):
As you can see on DatabaseB there is a 27 million row table that gets scanned…a lot. I know some of this is a daily copy from one DB to another…but not that many, so I need to go find out what else is causing these. While I’ve been writing this, it has been scanned 2000 more times. On DatabaseA, I have 3 tables I want to look into…2 of them due to sheer size of the tables and one because it is hit so frequently. In the SQL Query results, the last_user_scan column shows current date and time for all of these…copying to Excel didn’t give me the time…just the date, but that is just a formatting issue.
Note that you can change the results in the WHERE clause to suit your needs.
My next step in this process is to run a Query I based off the Pinal Dave’s (b|t) Missing Index query, which simply gets the Top 5 Missing Indxes for each database, ordered by Impact:
-- Get the top 5 missing indexes for each database on an instance -- and output to a format that can be read as-is, or pasted to Excel for filtering, etc. -- Thanks to Pinal Dave at www.sqlauthority.com for providing the base query this is built on. -- Pinal is good guy and great teacher...you should hire him for your performance issues -- or me, lol -- Test in your Dev environment before trying this in Prod. -- I make no guarantees on performance or SQL version compatibility -- Filtering out various things is done in the WHERE clause (impact, last_seek, etc.) -- Create a temp table to hold the results: CREATE TABLE [dbo].[#MI]( [dbname] [nvarchar](128) NULL, [object_id] [nvarchar](255) NOT NULL, [improvement_measure] [bigint] NULL, [create_index_statement] [nvarchar](4000) NULL, [group_handle] [bigint] NOT NULL, [unique_compiles] [bigint] NOT NULL, [user_seeks] [bigint] NOT NULL, [user_scans] [bigint] NOT NULL, [last_user_seek] [datetime] NULL, [last_user_scan] [datetime] NULL, [avg_total_user_cost] [float] NULL, [avg_user_impact] [float] NULL, [system_seeks] [bigint] NOT NULL, [system_scans] [bigint] NOT NULL, [last_system_seek] [datetime] NULL, [last_system_scan] [datetime] NULL, [avg_total_system_cost] [float] NULL, [avg_system_impact] [float] NULL, [database_id] [int] NOT NULL ) -- Run through each db on the instance and record results to the temp table -- yes, sp_MSforeachdb is both undocumented and unsupported, but for this it works just fine exec master.sys.sp_MSforeachdb ' use [?]; INSERT #MI SELECT top 5 db_name(mid.database_id) as [dbname], Object_name(mid.[object_id]), Cast(((migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans)))as INT) AS [improvement_measure], ''CREATE INDEX [missing_index_'' + CONVERT (varchar, mig.index_group_handle) + ''_'' + CONVERT (varchar, mid.index_handle) + ''_'' + LEFT (PARSENAME(mid.statement, 1), 32) + '']'' + '' ON '' + mid.statement + '' ('' + ISNULL (mid.equality_columns,'''') + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN '','' ELSE '''' END + ISNULL (mid.inequality_columns, '''') + '')'' + ISNULL ('' INCLUDE ('' + mid.included_columns + '')'', '''') AS create_index_statement, migs.*, mid.database_id FROM sys.dm_db_missing_index_groups mig INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle WHERE 1=1 --and migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 40000 --uncomment to filter by improvement measure --and migs.user_seeks > 250 -- change this for activity level of the index, leave commented out to get back everything and mid.database_id > 4 --Skip system databases and db_name(mid.database_id) = ''?'' --Get top 5 for only the current database ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC ' --now that the table is populated, show me the data! --no data for a DB means no index recommendations. Select [dbname] ,[object_id] ,[Improvement_Measure] ,[create_index_statement] ,[user_seeks] ,[user_scans] ,[last_user_seek] ,[last_user_scan] ,[avg_total_user_cost] ,[avg_user_impact] From #MI Where 1=1 Order by [dbname],[improvement_measure] desc --Clean up your mess, you weren't raised in a barn! Drop Table #MI
Again, use the WHERE clause(s) to filter as you see fit. In my case I see only one of the tables from the FULL SCAN query here, so I already know I may have a good shot at seeing at least a valid place to start looking:
I use the Missing Index script all the time, in a variety of ways. Also, Unused indexes to see if newly created ones are actually in play and being used by the Query Optimizer.
And that’s all I have to say about that…except: USE THESE AT YOUR OWN RISK AND LEARN WHAT THEY DO FIRST! 🙂
Thanks for reading!
[…] Kevin Hill has a couple scripts which help him find easy performance gains: […]
[…] Kevin Hill shares two scripts he uses for finding low-hanging index optimization fruit: one that finds queries performing heap or clustered index scans, and another that returns the top 5 missing indexes per database. […]