Heaps get a bad rap but is it justified? Most of the time, I’d say the answer is “yes.” Let’s dive into some heap scenarios.
What is a Heap?
A table is considered a heap when it is created without a clustered index. Data isn’t in any type of ordered state. Some data is over here, some data is over there.
When you are inserting data into a heap, that data is tossed in wherever. Think of it like your junk drawer. It’s not organized into its own little sections. What do you do when you have something to add such as a pair of scissors or an old pen? You open the drawer, toss it in, and close it up without giving it a second thought.
That’s not necessarily a bad thing. If you’re inserting data into staging tables, this can be faster compared to inserting into a clustered index. When you need to frequently track that data down and dig through the junk drawer is when you may run into issues.
Check for Heaps
You can run the following script against sys.indexes to check for heaps in a database:
SELECT OBJECT_NAME(object_id) as 'table', type_desc
FROM sys.indexes
WHERE type_desc = 'HEAP';
GO
A Quick Fix?
You’ve done some testing in a non-prod environment and decided that you have a heap that would be better suited as a table with a clustered index. You can just write that CREATE INDEX statement, execute, and be good to go, right? Maybe…maybe not.
SQL Server is going to have to read the whole heap of data, sort it into the right order based on how you’re defining the clustered index, and write it back. Nonclustered indexes on the table are also going to be rebuilt.
If you’re performing this operation on an enormous table, that process could utilize a significant amount of your servers resources and there could be a bit of pain involved. I know “enormous” is relative so what’s slow on one server might be fine on another. Just don’t assume every clustered index created on a heap will take seconds. Do the right thing and thoroughly test so that you don’t create more problems while the clustered index is being created.
Make sure to test and monitor after adding the clustered index to make sure that change is helping performance.
Thanks for reading!
I wrote about this topic a while ago and provided a very helpful script that guestimates recommended clustered index for each table quite well, based on the metadata and statistics available within SQL server:
https://eitanblumin.com/2019/12/30/resolving-tables-without-clustered-indexes-heaps/
LikeLike