Wednesday, 10 September 2014

Index Types

INDEX TYPES

Clustered Index
An index that imposes a sort order on the pages within the index.
A table can have only one clustered index.

Non clustered Index
An index that does not impose a sort order on the data pages
within the table. You can have up to 1,000 non clustered indexes in a table.

INDEX KEY
You can define an index by using one or more columns in the table, called the index key, with the following restrictions:
  • You can define an index with a maximum of 16 columns.
  • The maximum size of the index key is 900 bytes.

CLUSTERING KEY
The column(s) defined for  clustered index are referred to as clustering key.

CLUSTERED INDEX
    A clustered index is special because it causes SQL Server to arrange the data in the table according to the clustering key. Because a table cannot be sorted more than one way, you can define only one clustered index on a table.
       Clustered indexes provide a sort order for the storage of data within a table. However, a clustered index does not provide a physical sort order. A clustered index does not physically store the data on disk in a sorted order because doing so creates a large amount of disk input/output (I/O) for page split operations. Instead, a clustered index ensures that the page chain of the index is sorted logically, allowing SQL Server to traverse directly down the page chain to
locate data. As SQL Server traverses the clustered index page chain, each row of data is read in clustering key order.

The general syntax for creating a relational index is as follows:

CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
ON <object> ( column [ ASC | DESC ] [ ,...n ] )
[ INCLUDE ( column_name [ ,...n ] ) ]
[ WHERE <filter_predicate> ]
[ WITH ( <relational_index_option> [ ,...n ] ) ]
[ ON { partition_scheme_name ( column_name ) | filegroup_name | default } ]
[ FILESTREAM_ON { filestream_filegroup_name | partition_scheme_name | "NULL" } ][ ;

Nonclustered Indexes
The other type of relational index that you can create is a nonclustered index. Nonclustered indexes do not impose a sort order on the table, so you can create multiple nonclustered indexes on a table. Nonclustered indexes have the same restrictions as a clustered index.

RESTRICTIONS OF NONCLUSTERED INDEXES
They can have a maximum of 900 bytes in the index key and a maximum of 16 columns.
A table is limited to a maximum of 1,000 nonclustered indexes.


Difference Between Clustered And  Nonclustered Index
A clustered index imposes a sort order on the data pages in the table. 
A nonclustered index does not impose a sort order.


Covering Indexes
When an index is built, every value in the index key is loaded into the index. In effect, each index is a mini-table containing all the values corresponding to just the columns in the index key. Therefore, it is possible for a query to be entirely satisfied by using the data in the index.
An index that is constructed such that SQL Server can completely satisfy queries by reading only the index is called a covering index.


Included Columns
Indexes can be created using the optional INCLUDE clause. Included columns become part of the index at only the leaf level. 
Values from included columns do not appear in the root or intermediate levels of an index and do not count against the 900-byte limit for an index.
Therefore, you can construct covering indexes that can have more than 16 columns and 900 bytes by using the INCLUDE clause.

Filtered Indexes
      To handle the cases where significant skew exists in the data, SQL Server 2008 allows you to create filtered indexes. 
A filtered index is simply an index with a WHERE clause. Only the index keys matching the WHERE clause are added to the index, allowing you to build indexes that focus on the highly selective portions of the table while allowing SQL Server to choose another method for the less selective range.
 Filtered indexes have the following restrictions:
  • They must be a nonclustered index.
  • They cannot be created on computed columns.
  • Columns cannot undergo implicit or explicit data type conversion.

Index Management and Maintenance


Index Management and Maintenance

Because the data within an index is stored in sorted order, over time, values can move around within the index due to either page splits or changes in the values. To manage the fragmentation of an index over time, you need to perform periodic maintenance.

Index Fragmentation
            Files on an operating system can become fragmented over time due to repeated write operations. Although indexes can become fragmented, index fragmentation is a bit different from file fragmentation.
               When an index is built, all the values from the index key are written in sorted order onto pages within the index. If a row is removed from the table, SQL Server needs to remove the corresponding entry from the index. 
  • The removal of the value creates a “hole” on the index page. SQL Server does not reclaim the space left behind because the cost of finding and reusing a hole in an index is prohibitive.
  • If a value in the table that an index is based on changes, SQL Server must move the index entry to the appropriate location, which leaves behind another hole. 
When index pages fill up and require a page split, you get additional fragmentation of the index. Over time, a table that is undergoing large amounts of data changes has the indexes become fragmented.
               
To control the rate of fragmentation of an index: 
  • Use an index option called the fill factor. 
  • Use ALTER INDEX statement to remove the fragmentation.

FILLFACTOR
          The FILLFACTOR option for an index determines the percentage of free space that is reserved on each leaf-level page of the index when an index is created or rebuilt. The free space reserved leaves room on the page for additional values to be added, thereby reducing the rate at which page splits occur. 
The FILLFACTOR is represented as a percentage full. For example, a fill factor of 75 means that 25 percent of the space on each leaf-level page is left empty to accommodate future values.

Defragmenting an Index

ALTER INDEX { index_name | ALL }
ON <object>
{ REBUILD
[ [ WITH ( <rebuild_index_option> [ ,...n ] ) ]
| [ PARTITION = partition_number
[ WITH ( <single_partition_rebuild_index_option>
[ ,...n ] )] ] ]
| DISABLE | REORGANIZE
[ PARTITION = partition_number ]
[ WITH ( LOB_COMPACTION = { ON | OFF } ) ]
| SET ( <set_index_option> [ ,...n ] ) }[ ; ]
  
 You can defragment indexes using either the REBUILD or REORGANIZE options.

REBUILD
           The REBUILD option defragments all levels of an index. Unless the ONLINE option is specified, a REBUILD acquires a shared table lock and block any data modifications.

REORGANIZE
            The REORGANIZE option defragments only the leaf level of an index and does not cause blocking.

Disabling an index
   An index can be disabled by using the ALTER INDEX statement as follows:

ALTER INDEX { index_name | ALL }
ON <object>
DISABLE [ ; ]

      To enable an index, it must be rebuilt to regenerate and populate the B-tree structure. You can accomplish this by using the following command:

ALTER INDEX { index_name | ALL }
ON <object>
REBUILD [ ; ]

Pages and Extents


Pages and Extents

The fundamental unit of data storage in SQL Server is the page.  Disk I/O operations are performed at the page level. SQL Server reads or writes whole data pages.

Extents are a collection of eight physically contiguous pages and are used to efficiently manage the pages. 

Pages 

In SQL Server, the page size is 8 KB. Each page has number, starting from 0; the number of last page in database is determined by database file size.

1 Megabyte = 128 Pages

Each page begins with a 96-byte header that is used to store system information about the page. This information includes the page number, page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page.  

enter image description here

  • Page header stores information about the page like page type, next and previous page if it’s an index page, free space in the page etc.
  • After the page header data row section follows. This is where you data is actually stored.
  • Row offset information is stored at the end of the page i.e. after the data row section. Every data row has a row offset and the size of row offset is 2 bytes per row. Row offset stores information about how far the row is from the start of the page.
Putting in simple words the complete page equation comes as shown below.
Page (8192 bytes) = Page header(96 bytes) + Actual data(8060 bytes) + Row offset (32 bytes ).

 

 Youtube video which  demonstrates how 8 KB pages 


Extents
 

Extents are the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte. 

To make its space allocation efficient, SQL Server does not allocate whole extents to tables with small amounts of data. 

SQL Server has two types of extents:

  • Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.
  • Mixed extents are shared by up to eight objects. Each of the eight pages in the extent can be owned by a different object.
A new table or index is generally allocated pages from mixed extents. When the table or index grows to the point that it has eight pages, it then switches to use uniform extents for subsequent allocations. If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.











Mixed and uniform extents


REFERENCES:

WEBSITES

Page In SQL Server on StackOverFlow

SQL Server Storage Engine: Allocation Maps 

[TOADWORLD] Data Pages and Extents 






BOOKS: 

Exam 70-432 Microsoft SQL Server 2008 Implementation and Maintenance