SQL Server Database Index Performance
Auditing Index Use is Not an Easy Task, But Critical to Your Server’s Performance.
When it comes to auditing index use in SQL Server databases, I sometimes get overwhelmed. For example, how do you go about auditing indexes in a database with over 1,500 tables? While auditing a single index is relatively straightforward, auditing thousands of them across multiple databases is not an easy task. Easy or not, however, it is an important task if you want to optimize the performance of your SQL Server databases.
There are two different ways to approach the task of auditing large numbers of indexes. One option is to break the chore down into smaller, more manageable units, first focusing on those indexes that are most likely to affect the overall performance of your SQL Server. For example, you might want to start your audit on the busiest database on your server, and if that database has many tables, start with the tables holding the most data, then work your way down to tables with less data. This way, you focus your initial efforts in the areas where they will most likely have the greatest positive impact on your server’s performance.
Another option, and the one I generally follow (because I am somewhat lazy), is to use more of a “management by exception” approach. What I mean by this is that if I don’t see any performance problems in a database, there is not much use in evaluating every index in it. But if a database is demonstrating performance problems, then there is a good chance that its indexes are less than optimal, and I should pay extra attention to them, especially if the database is mission critical. And if there are a lot of indexes to audit, I start by focusing on the largest ones first, as they are the ones most likely to cause performance problems. For example, in the case of the database with 1,500 tables, I only audited about a dozen of them carefully (all very large), as they were the ones I felt needed the most attention.
However you decide to audit the indexes in the databases you manage, you need to come up with a sound plan and carry it out in a systematic way.
As you may have already noticed, the audit checklist I have provided above is not long. This is intentional. Remember, the goal of this article series on doing a performance audit is to identify the “easy” and “obvious” performance issues, not to find them all. The ones I have listed above will get you a long way toward identifying and correcting the easy index-related performance problems. Once you have gotten these out of the way, you can spend time on the tougher ones. For example, this website has many index-related tips pages on these topics, many of them very advanced.
If you have not done so yet, you will want to review each of these tips pages.
Have You Run the Index Tuning Wizard Recently?
One of the best tools that Microsoft has given us in SQL Server 7.0 and 2000 is the Index Tuning Wizard. It is not a perfect tool, but it does help you to identify existing indexes that aren’t being used, along with recommending new indexes that can be used to help speed up queries. If you are using SQL Server 2000, it can also recommend the use of Indexed Views. It uses the actual queries you are running in your database, so its recommendations are based on how your database is really being used. The queries it needs for analysis come from the SQL Server Profiler traces you create.
One of the first things I do when doing a performance audit on a new SQL Server is to capture a trace of server activity and run the Index Tuning Wizard against it. In many cases, it can help me to quickly identify any indexes that are not being used and can be deleted, and to identify new indexes that should be added in order to boost the database’s performance.
Here are some tips for using the Index Tuning Wizard when auditing a SQL Server database’s indexes:
If the Wizard recommends adding new indexes, you will want to evaluate them, and also compare them to the currently existing indexes on the table to see if they make sense and whether they might potentially cause new problems. For example, a recommended index might help a particular query, but it may also slow down a common INSERT operation that is performed thousands of times each hour. The Wizard can’t know this, and you must decide which is more important: some queries that run a little faster at the cost of INSERTs that run a little slower, or vice versa.
And last of all, even if the Index Tuning Wizard doesn’t recommend any new indexes, this doesn’t mean that no new indexes are needed, only that, based on the trace data it analyzed, it didn’t find any to recommend. You might want to consider running several traces over several days in order to get a wider sample of activity and better identify necessary indexes. Even then, the Index Tuning Wizard can’t find all the needed indexes, but it will find the obviously needed ones.
Once you have performed your analysis and made the recommended changes, I recommend that you do another trace and analysis in order to see what effect your changes had. Also keep in mind that running the Index Tuning Wizard is not a one-time event. The underlying data in a database changes over time, along with the types of queries run. So you should make it a point to take traces and run analyses periodically on your servers to keep them in regular tune.
Does Every Table in Each Database Have a Clustered Index?
As a rule of thumb, every table in every database should have a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases–such as an identity column, or some other column where the value is increasing–and is unique. In many cases, the primary key is the ideal column for a clustered index.
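For example, a minimal sketch of such a table (the table and column names are hypothetical):

```sql
-- The IDENTITY column increases monotonically and is unique, making it a
-- good candidate for the clustered index, which here doubles as the
-- primary key.
CREATE TABLE Orders (
    OrderID INT IDENTITY NOT NULL,
    CustomerID INT NOT NULL,
    OrderDate DATETIME NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
)
```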
If you have any experience with performance tuning SQL Server 6.5, you may have heard that it is not a good idea to add a clustered index to a column that monotonically increases, because it can cause a “hotspot” on the disk that can cause performance problems. That advice is true in SQL Server 6.5.
In SQL Server 7.0 and 2000, “hotspots” aren’t generally a problem. You would have to have over 1,000 transactions a second before a “hotspot” would negatively affect performance. In fact, a “hotspot” can be beneficial under these circumstances because it eliminates page splits.
Here’s why. If you are inserting new rows into a table that has a clustered index as its primary key, and the key monotonically increases, this means that each INSERT will physically occur one after another on the disk. Because of this, page splits won’t occur, which in itself saves overhead. SQL Server has the ability to determine that data being inserted into a table has a monotonically increasing sequence, and won’t perform page splits when this happens.
If you are inserting a lot of rows into a heap (a table without a clustered index), data is not inserted in any particular order onto data pages, whether or not the data is monotonically increasing. This results in SQL Server having to work harder (more reads) to access the data when it is requested from disk. On the other hand, if a clustered index is added to the table, data is inserted sequentially on data pages, and generally less disk I/O is required to retrieve the data when it is requested from disk.
If data is inserted into a clustered index in more or less random order, data is often inserted randomly into data pages, which is similar to the problem of inserting data into a heap, which contributes to page splits.
So again, the overall best recommendation is to add a clustered index to a column that monotonically increases (assuming there is a column that does so), for best overall performance. This is especially true if the table is subject to many INSERTs, UPDATEs, and DELETEs. But if a table is subject to few data modifications but many SELECT statements, then this advice is less useful, and other options for the clustered index should be considered.
As part of your index audit, check to see if every table in your databases has an index or not. If there is no index at all, seriously consider adding a clustered index, based on the advice provided above. There is virtually no downside to adding a clustered index to a table that does not already have one.
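One way to find heaps is to query the system tables. In SQL Server 7.0 and 2000, a table’s clustered index has an indid of 1 in sysindexes, while a heap has no such row, so a sketch along these lines lists every user table without a clustered index:

```sql
-- List all user tables that do not have a clustered index (heaps).
SELECT o.name AS TableName
FROM sysobjects o
WHERE o.type = 'U'
  AND NOT EXISTS (SELECT * FROM sysindexes i
                  WHERE i.id = o.id AND i.indid = 1)
ORDER BY o.name
```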
Are Any of the Columns in Any Table Indexed More than Once?
This may sound like obvious advice, but it is more common than you think, especially if a database has been around for a while and has been administered by multiple DBAs. SQL Server doesn’t care if you do this, as long as the names of the indexes are different. So as you examine your tables’ current indexes, check to see if any columns have unnecessary duplicate indexes. Removing them not only reduces disk space, but speeds up all data modifications to that table.
One common example of duplicate indexes is forgetting that columns that have a primary key, or that are specified as unique, are automatically indexed, and then indexing them again under different index names.
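A quick way to check a single table for duplicates is sp_helpindex, which lists every index on the table along with its key columns (the table name here is hypothetical):

```sql
-- List all indexes on a table; look for two indexes with the same key
-- columns under different names, including the automatic PRIMARY KEY
-- and UNIQUE constraint indexes.
EXEC sp_helpindex 'Orders'
```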
Are There Any Indexes that are Not Being Used in Queries?
This is another obvious piece of advice, but it is also a common problem, especially if the initial indexes created for the database were “guessed at” by DBAs or developers before the database went into production. Just looking at a table’s indexes won’t tell you whether an index is being used or not, so identifying unused indexes is not always easy.
One of the best ways to identify unused indexes is to use the Index Tuning Wizard, which was previously discussed.
Unnecessary indexes, just like duplicate indexes, waste disk space and contribute to less-than-optimal data access and modification performance.
Are the Indexes too Wide?
The wider an index, the bigger the index becomes physically, and the more work SQL Server has to perform when accessing or modifying data. Because of this, you should avoid adding indexes to very wide columns. The narrower the index, the faster it will perform.
In addition, composite indexes, which include two or more columns, present the same potential problem. Generally, composite indexes should be avoided if at all possible. Often, the heavy use of composite indexes in a database means that the database design is flawed.
You can’t always avoid indexing wide columns or using composite indexes, but if you think you have to use one, be sure you have carefully evaluated your choice and are confident you don’t have other options that may offer better performance.
Do Tables That Are JOINed Have the Appropriate Indexes on the JOINed Columns?
In essence, the column (or columns) used in tables being JOINed should be indexed for best performance. This is straightforward and fairly obvious advice, but auditing your indexes for optimal JOIN performance is not easy, as you must be familiar with all the JOINs being performed in your database in order to fully perform the audit.
Many people, when creating primary key/foreign key relationships (which are often used in JOINs), forget that while an index is automatically created on the primary key column(s) when the key is established, an index for a foreign key is not automatically created, and must be created manually if there is to be one.
Because this is often forgotten, as part of your audit, you may want to identify all primary key/foreign key relationships in your tables and then verify that each foreign key column has an appropriate index.
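For example, a sketch of adding the missing foreign key index (the table and column names are hypothetical):

```sql
-- The PRIMARY KEY constraint on Customers.CustomerID created an index
-- automatically, but the foreign key column Orders.CustomerID did not
-- get one, so the index for the JOIN column must be created manually.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON Orders (CustomerID)
```

With this index in place, a JOIN between Customers and Orders on CustomerID can use an index on both sides of the join.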
Besides this, you can also use the Index Tuning Wizard to help identify missing JOIN indexes, but I have found that the Wizard doesn’t always identify missing indexes for JOINed tables. When it comes right down to it, unless you know the types of common JOINs being run against your database, it is tough to identify all the columns that could benefit from an appropriate index.
Are the Indexes Unique Enough to be Useful?
Just because a table has one or more indexes doesn’t mean that the SQL Server query optimizer will use them. Before they are used, the query optimizer has to consider them useful. If a column in a table is not at least 95% unique, then most likely the query optimizer will not use an available non-clustered index based on that column. Because of this, don’t add non-clustered indexes to columns that aren’t at least 95% unique. For example, a column with “yes” or “no” as the data won’t be at least 95% unique, and creating an index on that column would in essence create an index that would never be used, which we have already learned puts a drag on performance.
For more information on selectivity, see this article.
As part of your audit, consider “eye-balling” the data in your tables. In other words, take a look at the data residing in your tables, and take an extra look at the columns that are indexed. Generally, it is very obvious if the data in a column is selective or not. If you notice that the data is all “male or female,” “y” or “n,” and so on, then this data is not selective and any indexes created on them will be a waste of space and a hindrance to performance.
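“Eye-balling” can be backed up with a quick count. The sketch below (hypothetical table and column names) computes how unique a column’s values are:

```sql
-- Selectivity of a column, as a percentage: the closer to 100, the more
-- likely the query optimizer is to use a non-clustered index on it.
-- A "male"/"female" column would score near zero here.
SELECT COUNT(DISTINCT CustomerID) * 100.0 / COUNT(*) AS SelectivityPct
FROM Orders
```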
Are You Taking Advantage of Covering Indexes?
A covering index, which is a form of a composite index, includes all of the columns referenced in the SELECT, JOIN, and WHERE clauses of a query. Because of this, the index contains the data you are looking for and SQL Server doesn’t have to look up the actual data in the table, reducing logical and/or physical I/O, thus boosting performance. While non-covering composite indexes can hinder performance, covering composite indexes can be very useful, and in many cases, really boost the performance of a query.
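For example, assuming a query like SELECT OrderDate FROM Orders WHERE CustomerID = 1000 (the names here are hypothetical), a composite index whose key contains both columns covers it:

```sql
-- Because CustomerID and OrderDate both appear in the index key, SQL
-- Server can answer the query from the index pages alone, without a
-- bookmark lookup into the table's data pages.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
ON Orders (CustomerID, OrderDate)
```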
The hard part is identifying where covering indexes might be best used. While the Index Tuning Wizard will help, it will still miss a lot of opportunities where covering indexes can be useful. Otherwise, the only way to identify whether they will be useful is to carefully examine all the common queries that are run across your database, which of course is near impossible, unless you are really, really bored and have nothing better to do.
At this point in your audit, the goal should not be to identify new covering indexes as such, but to be aware of them so you can take advantage of them when you run across a situation where they will be helpful.
How Often are Indexes Rebuilt?
Over time, indexes get fragmented, which causes SQL Server to work harder in order to access them, hurting performance. The only solution to this is to defragment all indexes in your databases on a regular basis. There are a variety of ways to do this, and the how-to process won’t be discussed here, as this is explained elsewhere on this website and in the SQL Server Books Online.
The goal of your audit is to find out whether or not the indexes in the databases you are auditing are being defragmented on a regular basis. How often you defragment them can range from daily to weekly to even monthly, and depends on how often modifications are made, along with the size of the database. If a database has many modifications made daily, then defragmentation should be performed more often. If a database is very large, the defragmentation will take longer, which may mean it cannot be performed as often because the defragmentation process takes too many resources and negatively affects users. As part of your audit, you may want to evaluate how often defragmentation is currently being done, and find out if this is the optimal frequency.
At the very least, if indexes aren’t currently being rebuilt now, they need to be, and as part of your audit, you need to ensure that some sort of an appropriate index rebuilding scheme is put into place.
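In SQL Server 2000, one way to check fragmentation and rebuild indexes is with DBCC commands (the table name and the fillfactor value of 90 below are hypothetical):

```sql
-- Report fragmentation for every index on a table: a Scan Density well
-- below 100% indicates the indexes should be rebuilt.
DBCC SHOWCONTIG ('Orders') WITH ALL_INDEXES

-- Rebuild every index on the table, specifying a fillfactor of 90.
DBCC DBREINDEX ('Orders', '', 90)
```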
What is Your Index Fillfactor?
Closely related to index rebuilding is the fillfactor. When you create a new index, or rebuild an existing index, you can specify a fillfactor, which is the amount the data pages in the index are filled when they are created. A fillfactor of 100 means that each index page is 100% full, while a fillfactor of 50 means each index page is only 50% full.
If you create a clustered index (on a non-monotonically ascending column) that has a fillfactor of 100, that means that each time a record is inserted (or perhaps updated), page splits will occur because there is no room for the data in the existing pages. Numerous page splits can slow down SQL Server’s performance.
Here’s an example: Assume that you have just created a new index on a table with the default fillfactor. When SQL Server creates the index, it places the index on contiguous physical pages, which allows optimal I/O access because the data can be read sequentially. But as the table grows and changes with INSERTS, UPDATES, and DELETES, page splitting occurs. When pages split, SQL Server must allocate new pages elsewhere on the disk, and these new pages are not contiguous with the original physical pages. Because of this, random I/O, not sequential I/O access must be used, which is much slower, to access the index pages.
So what is the ideal fillfactor? It depends on the ratio of reads to writes that your application makes to your SQL Server tables. As a rule of thumb, the more heavily a table is modified, the lower its fillfactor should be, and the more read-oriented the table, the higher.
You may have to experiment to find the optimum fillfactor for your particular application. Don’t assume that a low fillfactor is always better than a high fillfactor. While page splits will be reduced with a low fillfactor, it also increases the number of pages that have to be read by SQL Server during queries, which reduces performance. And not only is I/O overhead increased with too low a fillfactor, it also affects your buffer cache. As data pages are moved from disk into the buffer, the entire page (including empty space) is moved. So the lower the fillfactor, the more pages that have to be moved into SQL Server’s buffer, which means there is less room for other important data pages to reside at the same time, which can reduce performance.
If you don’t specify a fillfactor, the default fillfactor is 0, which means the same as a 100% fillfactor (the leaf pages of the index are filled 100%, but some room is left on intermediate index pages).
As part of your audit process, you need to determine what fillfactor is being used to create new indexes and rebuild current indexes. In virtually all cases, except for read-only databases, the default value of 0 is not appropriate. Instead, you will want a fillfactor that leaves an appropriate amount of free space, as discussed above.
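For example, a sketch of specifying an explicit fillfactor when creating an index (the names and the value of 90 are hypothetical):

```sql
-- Leave 10% free space on each leaf page to absorb future INSERTs
-- without immediate page splits.
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON Orders (OrderDate)
WITH FILLFACTOR = 90
```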
Your goal should be to perform this part of the performance audit, described on this page, for each of the tables in each of the databases on each of your SQL Servers (that sounds like, and it is, a lot of work), and then use this information to make changes as appropriate.
Once you have completed this part of the performance audit, you are now ready to take a look at how Transact-SQL and your application can affect performance.
*Originally published at SQL-Server-Performance.com
Brad M. McGehee is a full-time DBA with a large manufacturing company, and the publisher of http://www.SQL-Server-Performance.Com, a website specializing in SQL Server performance tuning and clustering.
He is an MVP, MCSE+I, MCSD, and MCT (former).
Brad also runs another website called http://www.WorldClassGear.com. It provides independent gear reviews for backpackers, trekkers, and adventure travelers.