SQL SERVER INDEXES
Definition
Indexes are one of the biggest determinate of database performance. Indexes allow you to speed query performance on commonly used columns and improve the overall processing speed of your database.
Types
Microsoft SQL Server supports two types of indexes:
Clustered indexes define the physical sorting of a database table’s rows in the storage media. For this reason, each database table may have only one clustered index.
Non-clustered indexes are created outside of the database table and contain a sorted list of references to the table itself.
When and Where
Each database table can have only one clustered index. If a PRIMARY KEY constraint is created for a database table and no clustered index currently exists for that table, SQL Server automatically creates a clustered index on the primary key. SQL Server supports a maximum of 249 non-clustered indexes per table. However, it’s important to keep in mind that non-clustered indexes slow down the data modification and insertion process, so indexes should be kept to a minimum.
We can get potential performance benefits only through the judicious use of indexes on database tables. Indexes will allow you to speed query performance on commonly used columns and improve the overall processing speed of your database. But the un-necessary use of index can adversely affect the performance of the application.
One of the hardest tasks is the selection of appropriate columns for clustered or non-clustered indexes. When reviewing what type of index to create, you should identify the data type and the column(s) stores. Also, you must consider what query types will run and the frequency with which they will be executed. We can use following guidelines for this:
- When tables use primary keys, SQL Server automatically (by default) creates a unique cluster index on the column(s) comprising the key. Clearly, the uniqueness of the primary key is enforced through the creation of the unique index on the column(s).
- Columns that contain a relatively high degree of duplicate values and that are accessed in sequence are good candidates for clustered indexes because SQL Server physically reorders the data rows and stores the values in either ascending (the default) or descending order so they can be quickly retrieved.
- Columns that are searched in a range are good candidates for clustered indexes.
Columns that contain many unique values are good candidates for non-clustered indexes. - Consider creating non-clustered indexes on any columns that are frequently referenced in the WHERE clauses of SQL statements.
- Columns referenced by JOIN and GROUP BY operations are good candidates for non-clustered indexes.
- We can also consider creating non-clustered indexes that cover all of the columns used by certain frequently issued queries. These queries are referred to as “covered queries” and experience excellent performance gains.
- When creating foreign key relationships, it’s a good idea to create a non-clustered index on the foreign key column if you’re planning on using it frequently in joins.
Tables that have a clustered index maintain the relationship between the data pages via a linked list (e.g., leaf and non-leaf level). Conversely, if no clustered index exists on a table, SQL Server will store the data pages in a heap.
Data pages
When an index is created, SQL Server creates data pages, which are pointers that assist in finding the data row that contains the information of interest. When the index is established, a fillfactor is set. The purpose of a fillfactor is to designate the percentage of the data pages filled upon index creation. Over time, the free space is consumed as modifications occur, which causes page splits. The result of the page splits is that they degrade the performance of the indexes and thus the queries that utilize them by causing the data storage to become fragmented. The index’s fillfactor is set at the time of the indexes creation and isn’t dynamically maintained.
To update the fillfactor in the data pages, we can drop and recreate the indexes and reset the fillfactor. Keep in mind that this will negatively impact concurrent database activity and should be used judiciously in production systems. DBCC INDEXDEFRAG and DBCC DBREINDEX are statements that defragment both clustered and non-clustered indexes. INDEXDEFRAG is an online operation (i.e., it does not block other table activity, such as a query), whereas DBREINDEX physically rebuilds the index(s). In most cases, rebuilding an index achieves greater defragmentation, but it comes at the cost of blocking concurrent activity on that table. INDEXDEFRAG can take longer to complete when large fragmented indexes are present because it executes in small transactional blocks.
The Fillfactor
When you perform one of these actions, the database engine can more efficiently return indexed data. Fillfactor tweaking is beyond the scope of this article, but should be employed with a careful eye toward the intended usage of the table for which the index is created.
SQL Server dynamically chooses which indexes to use during the execution of a query. To make this choice, SQL Server uses statistics about the distribution of keys in each index to determine which indexes it will use to process the query. It is essential to consider that the statistics SQL Server is using can become out of date during the course of normal database activity (e.g., inserts, deletes, and updates on the table). To ascertain the current status of your statistics, you can execute DBCC SHOWCONTIG. When you determine that your statistics are out of date, you should run the UPDATE STATISTICS statement on the table to allow SQL Server to refresh its information about the indexes.
Establish a plan
SQL Server provides a utility that simplifies and automates the maintenance of a database, including the indexes. This tool is called the Database Maintenance Plan Wizard (DMPW). If you run this wizard, you’ll see that you can schedule the statistics on the indexes in the database to be updated as a regularly scheduled job, thus alleviating the burden of manually rebuilding your indexes. Another mutually exclusive choice in the DMPW is to re-organise data and data pages, which effectively drops and recreates your indexes with a specified fill factor.
Index Tuning Wizard
SQL Server provides a wonderful facility known as the Index Tuning Wizard which greatly enhances the index selection process. To use this tool, first use SQL Profiler to capture a trace of the activity for which you wish to optimize performance. You may wish to run the trace for an extended period of time to capture a wide range of activity. Then, using Enterprise Manager, start the Index Tuning Wizard and instruct it to recommend indexes based upon the captured trace. It will not only suggest appropriate columns for queries but also provide you with an estimate of the performance increase you’ll experience after making those changes!