Explore Column Stored Index

What is a column store index?

A columnstore index is a type of data structure that’s used to store, manage and retrieve data that is stored in a columnar-style database. CSI01

Why Column Store Index is used ?

Best for queries that scan/aggregate large sets of data.
IO Statistics - dramatically reduces # of logical reads.
Smart IO and caching using aggressive read-ahead read strategy.
In a regular index, indexed data from each row kept together on single page – and the data in each column spread across all pages of index
Best for data warehouse/mart queries that scan/aggregate large amounts of data–might lower need for OLAP aggregation
In a regular index, indexed data from each row kept together on single page – and the data in each column spread across all pages of index
More and more functionality in DB engine (xVelocity, CDC)
Some queries might run at least 10x faster (or more)
Reduced storage costs and enhance performance.
In Data Warehouses/Data Marts (in a star schema) ,queries performance is gained
Load from Data Warehouses/Marts into OLAP Cubes
SSAS OLAP Databases that use the ROLAP methodology or pass-through mode “might” benefit (more so in SQL 2014)
New Analysis Services Tabular Model uses xVelocity engine

How to Create a column store index?

Expand the table. Under Indexes, right click and select New Index and then Clustered Column store Index as shown below :

Using SQL Server Management Studio :

[gallery ids=”1195,1196” type=”rectangular”]

Using T-SQL

CREATE CLUSTERED COLUMNSTORE INDEX [IXProductID] ON [dbo].[Product]

WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0, DATA_COMPRESSION = COLUMNSTORE) 
ON [PRIMARY]

GO

Difference Between Column store vs. Row store index

Row store :

Row store data is logically organized by rows and columns, and is physically stored in row-oriented data pages.
Row store indexes perform best on queries that seek data by searching for a particular value or retrieving a small range of values.

Column store :

The column store index is also logically organized as a table with rows and columns, but the data is physically stored in a column-wise data format.
Column store indexes work well for mostly read-only queries with large data sets, like data warehousing workloads.

CSI04

Row Store Index	Column Store Index
Row store indexes tend to be better for online transaction processing (OLTP) workloads, which use more update and seek operations.	Column store indexes tend to be better for online analytical processing (OLAP) workloads, which use more read operations.
Row store indexes tend to be better at performing random reads and writes.	Column store indexes tend to be better for performing sequential reads and writes.
Row store data is logically organized by rows and columns, and is physically stored in row-oriented data pages.	The column store index is also logically organized as a table with rows and columns, but the data is physically stored in a column-wise data format.

Limitations of Column Store Index

Cannot be clustered, cannot be created against a view
Cannot act as a PK or FK, cannot include sparse columns
Can’t work on tables with Change Data Capture/Change Tracking or FileStream, can’t participate in replication, nor when page/row compression exists
Cannot be used with certain data types, such as binary, text/image, rowversion/timestamp, CLR data types (hierarchyID/spatial), nor with data types created with MAX keyword…e.g. varchar(max)
Cannot be used with Unique Identifier
Cannot be used with decimal > 18
Cannot be modified with ALTER – must be dropped and recreated
Not optimized for certain statements (OUTER JOIN, UNION, NOT IN <subquery>)
Not optimized for certain scenarios (high selectivity, queries lacking any aggregations)
Issue OUTER JOIN: can’t use directly against table
Will “work”, but will use slower row execution mode
Must pre-aggregate separately and then do OUTER JOIN