Data and AI Knowledge: SQL Server ColumnStore Index

SQL Server Column Store Index

What is ColumnStore Index?
This is new type of Index.Introduced in SQL Server 2012.
New Query Processing Mode- Process Query in Batches.
Accelerates Workload- SQL Server relational data warehouse quires up to 10s of TB.

In simple words, ColumnStore Index store data column wise, each page stores data in a single column.Rather than Heap, B-tree store data row-wise.
Example of Row and Column Storage:

Columnstore indexes in the SQL Server Database Engine can be used to significantly speed-up the processing time of common data warehousing queries. Typical data warehousing workloads involve summarizing large amounts of data. The techniques typically used in data warehousing and decision support systems to improve performance are pre-computed summary tables, indexed views, OLAP cubes, and so on. Although these can greatly speed up query processing, these techniques can be inflexible, difficult to maintain, and must be designed specifically for each query problem.

The key characteristics of SQL Server column store technology are as follows:
1. Columnar data format – Unlike the traditional row based organization of data (called rowstore format), in columnar database systems such as SQL Server with columnstore indexes, data is grouped and stored one column at a time. SQL Server query processing can take advantage of the new data layout and significantly improve query execution time.
2. Faster query results – Columnstore indexes can produce faster results for the following reasons:
-Only the columns needed must be read. Therefore, less data is read from disk to memory and later moved from memory to processor cache.
-Columns are heavily compressed. This reduces the number of bytes that must be read and moved.
-Most queries do not touch all columns of the table. Therefore, many columns will never be brought into memory. This, combined with excellent compression, improves buffer pool usage, which reduces total I/O.
-Advanced query execution technology processes chunks of columns called batches in a streamlined manner, reducing CPU usage.
3.Key columns – There is no concept of key columns in a columnstore index so the limitation on the number of key columns in an index (16) does not apply to columnstore indexes.
4. Clustered index key – If a base table is a clustered index, all columns in the clustering key must be present in the nonclustered columnstore index. If a column in the clustering key is not listed in the create index statement, it will be added to the columnstore index automatically.
5. Partitioning – Columnstore indexes works with table partitioning. No change to the table partitioning syntax is required. A columnstore index on a partitioned table must be partition-aligned with the base table. Therefore a nonclustered columnstore index can only be created on a partitioned table if the partitioning column is one of the columns in the columnstore index.
6. Record Size – The index key record size limitation of 900 bytes also does not apply to columnstore indexes.
7. Query processing – Along with the columnstore index, SQL Server introduces batch processing to take advantage of the columnar orientation of the data. The columnstore structure and batch processing both contribute to performance gains, but investigating performance issues might be more complex than if only one factor were involved.
8. Table cannot be updated – For SQL Server 2012, a table with a columnstore index cannot be updated.

Data and AI Knowledge

Tiles

Tuesday, 4 August 2015

SQL Server ColumnStore Index

1 comment: