Some SQL-Server Performance Enhancement Tips

February 25, 2013 | balvvant2006 | Application Performance, Database

Choose the Appropriate Data Types
Use Triggers Cautiously

Use views and stored procedures instead of heavy queries.

This reduces network traffic, because the client sends the server only the stored procedure or view name instead of the full text of a large, heavy-duty query (in certain cases, heavy-duty queries might degrade performance by up to 70%). It also simplifies permission management, because you can restrict user access to specific table columns.
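As a rough sketch of the idea, assuming the Northwind sample database that appears later in this post: the procedure name usp_GetOrdersByCustomer is made up for illustration, while the Orders table and its columns come from Northwind.

```sql
USE Northwind
GO
-- Hypothetical procedure name; Orders and its columns are from the Northwind sample.
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerID nchar(5)
AS
    SELECT OrderID, OrderDate, ShippedDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
GO
-- The client now sends only this short call instead of the full query text.
EXEC dbo.usp_GetOrdersByCustomer @CustomerID = N'ALFKI'
```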

Use constraints instead of triggers

Constraints are much more efficient than triggers and can boost performance. So use constraints instead of triggers, whenever possible.
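For instance, a rule such as "price must not be negative" can be declared once as a CHECK constraint rather than enforced in trigger code. The table and constraint names below are hypothetical and exist only for this sketch.

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.Products_Demo (
    ProductID int   NOT NULL PRIMARY KEY,
    UnitPrice money NOT NULL,
    -- Declarative rule: enforced by the engine, with no trigger code to run.
    CONSTRAINT CK_Products_Demo_UnitPrice CHECK (UnitPrice >= 0)
)
GO
-- This insert is rejected by the CHECK constraint.
INSERT INTO dbo.Products_Demo (ProductID, UnitPrice) VALUES (1, -5)
```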

Use UNION ALL statement instead of UNION, whenever possible.

UNION ALL is much faster than UNION, because UNION ALL does not look for duplicate rows, whereas UNION always looks for duplicate rows, whether or not any exist.
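A small illustration, assuming the Northwind sample database (its Customers and Suppliers tables both have City and Country columns):

```sql
USE Northwind

-- UNION: combines both result sets, then does extra work to remove duplicates.
SELECT City, Country FROM Customers
UNION
SELECT City, Country FROM Suppliers

-- UNION ALL: simply appends the second result set to the first.
-- Use it when duplicates are impossible or acceptable.
SELECT City, Country FROM Customers
UNION ALL
SELECT City, Country FROM Suppliers
```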

Try to avoid using the DISTINCT clause, whenever possible.

Because using the DISTINCT clause will result in some performance degradation, you should use this clause only when it is necessary.

Try to avoid the HAVING clause, whenever possible.

The HAVING clause is used to restrict the result set returned by the GROUP BY clause. When you use GROUP BY with the HAVING clause, the GROUP BY clause divides the rows into sets of grouped rows and aggregates their values, and then the HAVING clause eliminates undesired aggregated groups. In many cases, you can write your SELECT statement so that it contains only WHERE and GROUP BY clauses, without a HAVING clause. This can improve the performance of your query.

Use an alternative to SELECT COUNT(*) to return a table's total row count.

A SELECT COUNT(*) statement makes a full table scan to return the table's total row count, which can take a very long time for a large table. There is another way to determine the total row count in a table: the sysindexes system table. Its ROWS column contains the total row count for each table in your database.

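So, you can use the following SELECT statement instead of SELECT COUNT(*):

SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2

The indid < 2 filter selects the single row that describes the table itself (its heap or clustered index); without it, the query returns one row per index. The value is maintained by SQL Server and may be slightly out of date, so treat it as an approximate count. This can make such queries several times faster.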

Use the SET NOCOUNT ON statement in your stored procedures to stop the message indicating the number of rows affected by a T-SQL statement. This can reduce network traffic, because the client will not receive that message for each statement.
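A minimal sketch, assuming the Northwind sample database; the procedure name is hypothetical.

```sql
USE Northwind
GO
CREATE PROCEDURE dbo.usp_DiscontinueProduct  -- hypothetical name
    @ProductID int
AS
    SET NOCOUNT ON  -- suppress the "(n row(s) affected)" message
    UPDATE dbo.Products
    SET Discontinued = 1
    WHERE ProductID = @ProductID
GO
```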

Use the WHERE clause.

This gives good performance benefits, because SQL Server returns only the matching rows to the client, not all rows from the table(s). This can reduce network traffic and boost the overall performance of the query.

Use SELECT statements with the TOP keyword or the SET ROWCOUNT statement if you need to return only the first n rows. This can improve the performance of your queries because a smaller result set is returned. It can also reduce the traffic between the server and the clients.
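Both forms might look like this against the Northwind sample database (assumed here for illustration):

```sql
USE Northwind

-- Return only the 10 most recent orders.
SELECT TOP 10 OrderID, CustomerID, OrderDate
FROM Orders
ORDER BY OrderDate DESC

-- The SET ROWCOUNT equivalent. Reset it to 0 afterwards,
-- otherwise the limit stays in effect for the rest of the session.
SET ROWCOUNT 10
SELECT OrderID, CustomerID, OrderDate
FROM Orders
ORDER BY OrderDate DESC
SET ROWCOUNT 0
```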

Return only the columns you need from the table, not all columns. Avoid using “SELECT *”.

This gives you good performance benefits, because SQL Server returns only the requested columns to the client, not all of the table’s columns. This can reduce network traffic and boost the overall performance of the query.

Create Index where appropriate

Table indexing will boost the performance of the queries a lot. SQL Server can perform a table scan, or it can use an index scan. When performing a table scan, SQL Server starts at the beginning of the table, goes row by row in the table, and extracts the rows that meet the criteria of the query. When SQL Server uses an index, it finds the storage location of the rows needed by the query and extracts only the needed rows.

Avoid creating indexes on small tables, since it can take SQL Server more time to perform an index scan than a simple table scan. If a table has many write operations (INSERT/UPDATE/DELETE), keep the number of indexes minimal. For each such operation, SQL Server must update the indexes on the table, which reduces performance.
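As a sketch of adding an index for a frequently searched column, assuming the Northwind sample database; the index name is hypothetical, and whether this particular index pays off depends on your workload.

```sql
USE Northwind

-- Hypothetical index name. Supports searches that filter on ShipCountry.
CREATE NONCLUSTERED INDEX IX_Orders_ShipCountry_Demo ON Orders (ShipCountry)

-- A query that can now locate matching rows through the index
-- instead of scanning the whole table.
SELECT OrderID, ShipCountry
FROM Orders
WHERE ShipCountry = 'Germany'
```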

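Index Tuning Wizard, which is available in the SQL Server Enterprise Manager (SQL Server 2000), is a good tool for creating optimized indexes. You can find it in Tools->Wizards->Management->Index Tuning Wizard. In SQL Server 2005 and later, the Database Engine Tuning Advisor takes its place.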

Avoid using Temp Tables inside stored procedures

If temporary tables are used inside stored procedures, SQL Server may not reuse the execution plan each time the stored procedure is called. So this will reduce performance.

Use table variables instead of temporary tables.

Table variables require less locking and logging resources than temporary tables, so table variables should be used whenever possible (available in SQL Server 2000 and later).
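A short sketch of a table variable holding an intermediate result, assuming the Northwind sample database; the variable name is arbitrary.

```sql
USE Northwind

-- Table variable: scoped to the batch or procedure and cleaned up automatically.
DECLARE @RecentOrders TABLE (
    OrderID   int      NOT NULL PRIMARY KEY,
    OrderDate datetime NULL
)

INSERT INTO @RecentOrders (OrderID, OrderDate)
SELECT OrderID, OrderDate
FROM Orders
WHERE OrderDate >= '19980101'

SELECT OrderID, OrderDate FROM @RecentOrders
```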

Avoid Using Cursors

SQL Server cursors can result in some performance degradation in comparison with SELECT statements. If you need to perform row-by-row operations, try to use a correlated sub-query or derived tables instead.
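To make the contrast concrete, here is a sketch of the same price update written both ways, assuming the Northwind sample database. Run one version or the other, not both, since each raises the prices by 10%.

```sql
USE Northwind

-- Row-by-row version (avoid): a cursor that updates each product individually.
DECLARE @ProductID int
DECLARE price_cursor CURSOR FOR
    SELECT ProductID FROM Products WHERE CategoryID = 1
OPEN price_cursor
FETCH NEXT FROM price_cursor INTO @ProductID
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE Products SET UnitPrice = UnitPrice * 1.10 WHERE ProductID = @ProductID
    FETCH NEXT FROM price_cursor INTO @ProductID
END
CLOSE price_cursor
DEALLOCATE price_cursor

-- Set-based version (prefer): one statement does the same work.
UPDATE Products SET UnitPrice = UnitPrice * 1.10 WHERE CategoryID = 1
```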

Use Joins Appropriately

One of the best ways to boost JOIN performance is to limit how many rows need to be joined. This is especially beneficial for the outer table in a JOIN. Return only those rows that need to be joined, and no more.

If you perform regular joins between two or more tables in your queries, performance will be optimized if each of the joined columns has its own index. This includes adding indexes to the columns in each table used to join the tables. Generally speaking, a clustered key is better than a non-clustered key for optimum JOIN performance.

If you have two or more tables that are frequently joined together, then the columns used for the joins on all tables should have an appropriate index. If the columns used for the joins are not naturally compact, then consider adding compact surrogate keys to the tables in order to reduce the size of the keys, thus decreasing read I/O during the join process and increasing overall performance.

JOIN performance has a lot to do with how many rows you can stuff in a data page. For example, let’s say you want to JOIN two tables. Most likely, one of these two tables will be smaller than the other, and SQL Server will most likely select the smaller of the two tables to be the inner table of the JOIN. When this happens, SQL Server tries to put the relevant contents of this table into the buffer cache for faster performance.

If there is not enough room to put all the relevant data into cache, then SQL Server will have to use additional resources in order to get data into and out of the cache as the JOIN is performed. If all of the data can be cached, the performance of the JOIN will be faster than if it is not. This comes back to the original statement, that the number of rows per data page can affect JOIN performance. In other words, if a table has no wasted space, it is much more likely that all of the relevant inner table data will fit into cache, boosting speed.

The moral of this story is to try to get as much data stuffed into a data page as possible. This can be done by using a high fill factor, rebuilding indexes often to get rid of empty space, and optimizing data types and widths when creating columns in tables.

Keep in mind that when you create foreign keys, an index is not automatically created at the same time. If you ever plan to join a table to the table with the foreign key, using the foreign key as the linking column, then you should consider adding an index to the foreign key column. An index on a foreign key column can substantially boost the performance of many joins.
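As an illustration, the following sketch creates a foreign key and then adds the index separately. All table, constraint, and index names here are hypothetical.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE dbo.Departments_Demo (
    DepartmentID int         NOT NULL PRIMARY KEY,
    Name         varchar(50) NOT NULL
)
CREATE TABLE dbo.Employees_Demo (
    EmployeeID   int NOT NULL PRIMARY KEY,
    DepartmentID int NOT NULL
        CONSTRAINT FK_Employees_Demo_Departments_Demo
        REFERENCES dbo.Departments_Demo (DepartmentID)
)
GO
-- The FOREIGN KEY constraint did not create an index, so add one explicitly.
CREATE NONCLUSTERED INDEX IX_Employees_Demo_DepartmentID
    ON dbo.Employees_Demo (DepartmentID)
```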

Avoid joining tables based on columns with few unique values. If columns used for joining aren’t mostly unique, then the SQL Server optimizer may not be able to use an existing index in order to speed up the join. Ideally, for best performance, joins should be done on columns that have unique indexes.

For best join performance, the indexes on the columns being joined should ideally be numeric data types, not CHAR or VARCHAR, or other non-numeric data types. The overhead is lower and join performance is faster.

For maximum performance when joining two or more tables, the indexes on the columns to be joined should have the same data type, and ideally, the same width. This also means that you shouldn’t mix non-Unicode and Unicode data types (e.g. VARCHAR and NVARCHAR) when using SQL Server 7.0 or later. If SQL Server has to implicitly convert the data types to perform the join, this not only slows the joining process, but it also could mean that SQL Server may not use available indexes, performing a table scan instead.

When you create joins using Transact-SQL, you can choose between two different types of syntax: either ANSI or Microsoft. ANSI refers to the ANSI standard for writing joins, and Microsoft refers to the old Microsoft style of writing joins. For example:

ANSI JOIN Syntax

SELECT fname, lname, department
FROM names INNER JOIN departments ON names.employeeid = departments.employeeid

Former Microsoft JOIN Syntax

SELECT fname, lname, department
FROM names, departments
WHERE names.employeeid = departments.employeeid

If written correctly, either format will produce identical results. But that is a big if. The older Microsoft join syntax lends itself to mistakes because the syntax is a little less obvious. On the other hand, the ANSI syntax is very explicit and there is little chance you can make a mistake.

If you have to regularly join four or more tables to get the recordset you need, consider denormalizing the tables so that the number of joined tables is reduced. Often, by adding one or two columns from one table to another, the number of joins can be reduced, boosting performance.

If your join is slow, and currently includes hints, remove the hints to see if the optimizer can do a better job on the join optimization than you can. This is especially important if your application has been upgraded from version 6.5 to 7.0, or from 7.0 to 2000.

One of the best ways to boost JOIN performance is to ensure that the JOINed tables include an appropriate WHERE clause to minimize the number of rows that need to be JOINed.

For example, I have seen many developers perform a simple JOIN on two tables, which is not all that unusual. The problem is that each table may contain over a million rows. Instead of just JOINing the tables, appropriate restrictive conditions on each table need to be added to the WHERE clause in order to reduce the total number of rows to be JOINed. This simple step can really boost the performance of a JOIN of two large tables.
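A sketch of what such restrictive conditions might look like, using the Northwind sample database as a stand-in for two large tables. The date range and quantity threshold are arbitrary values chosen for illustration.

```sql
USE Northwind

SELECT o.OrderID, o.OrderDate, od.ProductID, od.Quantity
FROM Orders AS o
INNER JOIN [Order Details] AS od ON od.OrderID = o.OrderID
WHERE o.OrderDate >= '19980101'
  AND o.OrderDate <  '19980201'   -- restricts the rows taken from Orders
  AND od.Quantity >= 50           -- restricts the rows taken from Order Details
```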

In the SELECT statement that creates your JOIN, don’t use an * (asterisk) to return all of the columns in both tables. This is bad form for a couple of reasons. First, you should only return those columns you need, as the less data you return, the faster your query will run. It would be rare that you would need all of the columns in all of the tables you have joined. Second, you will be returning two of each column used in your JOIN condition, which ends up returning way more data than you need, and hurting performance.

Take a look at these two queries:

USE Northwind
SELECT * FROM Orders INNER JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID

And

USE Northwind
SELECT Orders.OrderID, Orders.OrderDate, [Order Details].UnitPrice, [Order Details].Quantity, [Order Details].Discount
FROM Orders INNER JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID

Both of these queries perform essentially the same function. The problem with the first one is that it returns not only too many columns (they aren’t all needed by the application), but the OrderID column is returned twice, which doesn’t provide any useful benefits. Both of these problems contribute to unnecessary server overhead, hurting performance. The moral of this story is never to use the * in your joins.

While high index selectivity is generally an important factor that the Query Optimizer uses to determine whether or not to use an index, there is one special case where indexes with low selectivity can be useful for speeding up SQL Server: indexes on foreign keys. Whether an index on a foreign key has high or low selectivity, the Query Optimizer can use it to perform a merge join on the tables in question. A merge join occurs when a row from each table is taken and then they are compared to see if they match the specified join criteria. If the tables being joined have the appropriate indexes (no matter the selectivity), a merge join can be performed, which is often much faster than a join to a table with a foreign key that does not have an index.

For very large joins, consider placing the tables to be joined in separate physical files in the same filegroup. This allows SQL Server to spawn a separate thread for each file being accessed, boosting performance.

Don’t use CROSS JOINS, unless this is the only way to accomplish your goal. What some inexperienced developers do is to join two tables using a CROSS JOIN, and then they use either the DISTINCT or the GROUP BY clauses to “clean up” the mess they have created. This, as you might imagine, can be a huge waste of SQL Server resources.

If you have the choice of using a JOIN or a subquery to perform the same task, generally the JOIN (often an OUTER JOIN) is faster. But this is not always the case. For example, if the returned data is going to be small, or if there are no indexes on the joined columns, then a subquery may indeed be faster. The only way to really know for sure is to try both methods and then look at their query plans. If this operation is run often, you should seriously consider writing the code both ways, and then select the most efficient code. Joins should be used instead of sub-selects when the sub-select contains aggregate functions.

If you have a query with many joins, one alternative to de-normalizing a table to boost performance is to use an Indexed View to pre-join the tables. An Indexed View, which is only available from SQL Server 2000 Enterprise Edition, allows you to create a view that is actually a physical object that has its own clustered index. Whenever a base table of the Indexed View is updated, the Indexed View is also updated. As you can imagine, this can potentially reduce INSERT, UPDATE, and DELETE performance on the base tables. You will have to perform tests, comparing the pros and cons, in order to determine whether using an Indexed View to avoid joins in a query is worth the extra performance cost it causes.
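A minimal sketch of an Indexed View that pre-joins two tables, assuming the Northwind sample database. The view and index names are hypothetical. Indexed views also require specific session SET options and base tables that were created with ANSI_NULLS ON, so treat this as a starting point rather than a guaranteed recipe for your environment.

```sql
USE Northwind
GO
-- Session settings required when creating an index on a view.
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT,
    CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF
GO
-- SCHEMABINDING and two-part table names are mandatory for an Indexed View.
CREATE VIEW dbo.vOrderLines_Demo
WITH SCHEMABINDING
AS
    SELECT o.OrderID, od.ProductID, o.OrderDate, od.UnitPrice, od.Quantity
    FROM dbo.Orders AS o
    INNER JOIN dbo.[Order Details] AS od ON od.OrderID = o.OrderID
GO
-- The unique clustered index is what materializes the view on disk.
CREATE UNIQUE CLUSTERED INDEX IX_vOrderLines_Demo
    ON dbo.vOrderLines_Demo (OrderID, ProductID)
```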

If you have a query that uses a LEFT OUTER JOIN, check it carefully to be sure that is the type of join you really want to use. As you may know, a LEFT OUTER JOIN is used to create a result set that includes all of the rows from the left table specified in the clause, not just the ones in which the joined columns match. In addition, when a row in the left table has no matching rows in the right table, the result set row contains NULL values for all the selected columns coming from the right table. If this is what you want, then use this type of join. The problem is that in the real world, a LEFT OUTER JOIN is rarely needed, and many developers use them by mistake. While you may end up with the data you want, you may also end up with more than the data you want, which contributes to unnecessary overhead and poor performance. Because of this, always closely examine why you are using a LEFT OUTER JOIN in your queries, and only use them if they are exactly what you need. Otherwise, use a JOIN that is more appropriate to your needs.

If you are having difficulty tuning the performance of a poorly performing query that has one or more JOINs, check to see if the query plan created by the query optimizer is using a hash join. When the query optimizer is asked to join two tables that don’t have appropriate indexes, it will often perform a hash join. A hash join is resource intensive (especially CPU and I/O) and can slow the performance of your join. If the query in question is run often, you should consider adding appropriate indexes. For example, if you are joining column1 in table1 to column5 in table2, then column1 in table1 and column5 in table2 need to have indexes. Once indexes are added to the appropriate columns used in the joins in your query, the query optimizer will most likely be able to use these indexes, performing a nested-loop join instead of a hash join, and performance will improve.
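One way to check the join type is to ask SQL Server for the estimated plan as text and look for a "Hash Match" operator in the output. The sketch below assumes the Northwind sample database; substitute your own slow query.

```sql
USE Northwind
GO
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON
GO
-- Not executed: SQL Server returns the estimated plan text instead of rows.
SELECT o.OrderID, od.ProductID
FROM Orders AS o
INNER JOIN [Order Details] AS od ON od.OrderID = o.OrderID
GO
SET SHOWPLAN_TEXT OFF
GO
```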
