Simple • Straightforward

So easy a Caveman
Programmer can do it.

›› Let's Get Started

Indexes

Like In A Book Right?

What are these mysterious indexes that people always seem to use to solve performance problems? The basic answer is that they are like an index on a book. You can find out what page number you need to go to quickly, instead of thumbing through the entire book looking for a particular subject. In SQL terminology, indexes let us get to the data we need a lot faster. Let's look at a basic example below.

Imagine you have a list of fruits and prices as we see in the below table:

FruitLocationPrice
OrangeDowntown$1.00
GrapefruitDowntown$1.50
WatermelonBoondocks$5.00
OrangeBoondocks$.50
OrangeUptown$3.00
AppleBoondocks$3.50
WatermelonUptown$10.00

If I needed to return all of the prices for Apples, the computer would need to read through each line to determine if it were an Apple. This would cause 6 reads to return 1 line of data. If we were to put everything in order based on the Fruit name, this might save us some time.

FruitLocationPrice
AppleDowntown$3.50
GrapefruitBoondocks$1.50
OrangeDowntown$1.00
OrangeBoondocks$.50
OrangeUptown$3.00
WatermelonBoondocks$5.00
WatermelonUptown$10.00

Now that we know everything is in order, we can do a "binary search" to speed up our searching. If we look at the middle row, it is Orange. The letter A comes before the letter O, so let's only search on the top half of our results for the Apple data sets (rows 1 - 3) and look at the middle row again. It returns Grapefruit, so let's again only look in the top half of our search (row 1), which gives us the Apple row. This would have taken 3 reads (including the one that returned the Apple row) instead of 6.

With so few rows, this would not be very noticeable. Imagine if you were dealing with a million rows, how much faster would searching utilizing an index be. Now SQL does a much more logical searching outside of "binary searches" that speed up returning results, but I just wanted to get you in the right mindset to see how beneficial indexes can be. If you are interested in how it really works, hop over to Wikipedia a look at the Balanced Tree article.



Syntax

Indexes can be added in 2 ways, either through the GUI editor (SQL Management Studio/Query Analyzer) or through SQL code. Sometimes adding indexes can take awhile if the table is large, so I tend to write them all through SQL code because I get nervous when the GUI editor hangs for awhile. But if you would like to add them in an editor like SQL Management Studio, just right click on the table and go through the options to add it.

To add a basic index on our table above (lets call the table FruitPrices) to order the Fruit column, you would write the code as follows:

CREATE INDEX (Index Name) ON (Table name) (Columns separated by commas)
CREATE INDEX IDX_Fruit ON FruitPrices (Fruit)

Now you might have heard about multi-column indexes and clustered indexes. These are very important to be aware of because they add to your toolbox of optimizations.



Multi-Column Index

A multi-column index will do pretty much what it sounds like, order the index by multiple columns. In our above example, let's say we needed to get all of the Oranges that were less than $.75. We would want our index to be written as follows.

CREATE INDEX IDX_Fruit_Prices ON FruitPrices (Fruit, Prices)

FruitLocationPrice
AppleDowntown$3.50
GrapefruitBoondocks$1.50
OrangeDowntown$.50
OrangeBoondocks$1.00
OrangeUptown$3.00
WatermelonBoondocks$5.00
WatermelonUptown$10.00

Did you notice how everything is first in order by Fruit and then in order by Price? This will let us find all of the Oranges quickly, then search through that data set to find the prices we need quickly.



Clustered Indexes

The indexes we were previously talking about do not re-arrange the physical location of the table data, but add in another data set that is organized on we specified. If we want to re-arrange our physical data, we need to create a Clustered Index. With regular indexes, you can have an indefinite amount, but with Clustered Indexes there can be only 1. When you think about it you can only arrange physical data in 1 specific way, just like the pages in a book. This allows for even faster searching, because SQL can search right off the table instead of searching through an index that points to the data we need.

Here is the syntax to add a Clustered Index:

CREATE CLUSTERED INDEX (Index Name) ON (Table name) (Columns separated by commas)
CREATE CLUSTERED INDEX IDX_FRUIT ON FruitPrices (Fruit)





Comments

Name:
Title:   
Commenting Currently Disabled


Diandra
8/3/2011 8:14:02 PM



THX that's a great aswner!
Jason
7/16/2011 7:13:02 PM
New Chapter


Hi Arto - Thanks for the suggestion. I have a couple other chapters/tests I am working on right now, but I will note that as another article that needs to be written.
Arto Ahlstedt
7/13/2011 3:55:35 PM
Covering Index & Other Included Columns


How about a chapter (in section Advanced Concepts) on adding columns to indexes as an optimization tool, with execution plans which show how a key/bookmark lookup operation from the base table is missing after adding suitable columns to a suitable index?
Jason
7/7/2011 5:11:58 PM
Binary Trees


Hi Gail - Don't be sorry, that was a completely wrong link. I was searching for the Wiki link and copied it from the wrong window. Thanks for taking the time to look over the articles here. I really appreciate it! If there is a better link outside of the Wikipedia one, I would be happy to post that instead.
Gail Shaw
7/7/2011 2:14:02 PM
Binary trees


Sorry to keep going on about this... The wiki article you linked to describes a balanced binary tree. Index structures are not binary trees of any form. They are balanced trees (b+ trees if you want to get technical) Much better reference(if you insist on wikipedia) is http://en.wikipedia.org/wiki/B%2B_tree
Jason
7/5/2011 1:04:05 PM
Binary Searches


Hi Gail - Thanks for the comments. If it is not clear to one person, then I figure it is not clear to a lot of people. I have updated the section to include some bolding and a link to Wikipedia's article about Balanced Tree searches.
Gail Shaw
7/5/2011 5:35:08 AM
Binary Search


The problem is you're giving any reader a completely wrong impression as to how SQL uses indexes. There are ways to simplify things without stating something that is completely false.
Jason
7/4/2011 10:52:09 AM
How Indexes Search


Hi Gail, I agree that SQL does do much more advanced searching, but that is a bit hard to understand when you are first learning SQL. The binary search example is a bit more straightforward for someone new to optimizing. I want to make sure that anyone can understand that there are faster ways to search than scanning through tables line by line.
Gail Shaw
7/3/2011 6:27:00 PM



One main comment. SQL doesn't do binary searches for data when it has an index. It has a balanced tree structure (not a binary tree), starts at the root and walks down to the leaf. Because the fanout is so much higher than it would be for a binary tree, the search is more efficient than a binary search would be. See my article on indexes over at SQLServerCentral: http://www.sqlservercentral.com/articles/Indexing/68439/