When a bug jumps out and surprises me, I like to share it so others do not run into the same unexpected result. I don’t think there is any reason for a bug to bite multiple people in the butt. Therefore, I want to show you why persist sample percent IS NOT PERSISTED!
Why should I use Persist Sample Percent?
When your table grows and the rows multiply, the default statistics sample percent used by SQL Server gets smaller and smaller. In theory, persist sample percent lets you update your statistic once, specifying the percent it should use going forward whenever a sample percent is not specified. Unfortunately, this feature is broken, IMHO.
Persist Sample Percent Setup
Let’s set the scene so you can reproduce this and learn. Below we create a single-column table whose column is an identity and also the primary key. The primary key creates an index, which in turn creates statistics on our column.
DROP TABLE IF EXISTS dbo.Test;
GO
CREATE TABLE dbo.Test (ID INT IDENTITY NOT NULL CONSTRAINT TestPK PRIMARY KEY);
GO
/* Insert one row per batch; GO repeats the batch 100 million times. */
INSERT INTO dbo.Test DEFAULT VALUES;
GO 100000000
CREATE OR ALTER PROCEDURE dbo.DemoStatsReview
AS
BEGIN
    /* Sampled rows as a percentage of total rows, plus every stats property. */
    SELECT CAST(((rows_sampled * 1.00) / [rows]) * 100.00 AS NUMERIC(5,2)) AS SamplePCT, sp.*
    FROM sys.stats AS s
    OUTER APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
    WHERE s.[name] = N'TestPK';
END
GO
First, we will rebuild our index, which updates the statistics behind it with a fullscan. This is expected and normal activity for an index rebuild.
/* Index Rebuild uses 100% rows for sampling */
ALTER INDEX TestPK ON dbo.Test REBUILD WITH (STATISTICS_NORECOMPUTE = OFF);
EXEC dbo.DemoStatsReview;
Now, we will update statistics utilizing the new Persist Sample Percent feature. This should give us two benefits. One, auto stats updates will use this sample rate going forward on this statistic. Two, we no longer need to supply a sample percent if we update statistics manually or with our maintenance jobs.
/* Now let's update stats by using PERSIST_SAMPLE_PERCENT */
UPDATE STATISTICS dbo.Test TestPK WITH SAMPLE 60 PERCENT, PERSIST_SAMPLE_PERCENT = ON;
EXEC dbo.DemoStatsReview;
Let’s go ahead and update statistics now without any sample percent specified. We will see that the persist sample percent is applied as expected.
/* Update stats to validate sample size is persisted */
UPDATE STATISTICS dbo.Test TestPK;
EXEC dbo.DemoStatsReview;
Let’s see what happens when we rebuild the index. We expect that a FULLSCAN is used to update the statistics behind the index. Did anything else change? OH THE SUSPENSE!
/* What happens if we rebuild an index that has its stats persisted? Do we still use 100% rows for sample? */
ALTER INDEX TestPK ON dbo.Test REBUILD WITH (STATISTICS_NORECOMPUTE = OFF);
EXEC dbo.DemoStatsReview;
/* Let's update stats again. Remember, we persisted a sample size of 60%. */
UPDATE STATISTICS dbo.Test TestPK;
EXEC dbo.DemoStatsReview;
There you have it. The persist sample percent not only went away on the index rebuild, but because we then updated statistics without forcing a sample percent on 100 million rows, the sample percent went to 1%. I will add another blog post that focuses on this later. For now, if the late Notorious BIG was a DBA he would say, “More rows, more problems with stats you get!” If this didn’t make any sense: the more rows a table has, the lower the sample rate when statistics get updated.
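If you would rather check the setting itself than infer it from the sampled rows, sys.dm_db_stats_properties returns a persisted_sample_percent column. The query below is only a sketch that narrows the demo procedure's output to the columns that matter here. It assumes the dbo.Test table and TestPK statistic from the setup, and a build of SQL Server that supports PERSIST_SAMPLE_PERCENT.

```sql
/* Show the persisted sample setting next to the actual sample used. */
SELECT s.[name] AS StatName,
       sp.last_updated,
       sp.[rows],
       sp.rows_sampled,
       sp.persisted_sample_percent  -- zero means no sample percent is persisted
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE s.[object_id] = OBJECT_ID(N'dbo.Test')
  AND s.[name] = N'TestPK';
```

Run it after each step of the demo and compare persisted_sample_percent before and after the index rebuild.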
How do we fix this?
This is a bug inside of SQL Server. There is a feedback item that hasn’t received any response from Microsoft in the two years since the bug was reported. Please upvote it so it gets Microsoft’s attention and persist sample percent is actually persisted!
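In the meantime, one possible stopgap is to re-apply the persisted sample right after every rebuild of the index. This is my own sketch and not an official fix. It reuses the 60 percent rate and the object names from the demo, and it assumes your maintenance job lets you run a statement after the rebuild.

```sql
/* Rebuild the index; in the demo above this drops the persisted sample. */
ALTER INDEX TestPK ON dbo.Test REBUILD;

/* Re-apply the sample rate so later updates without a sample percent use it again. */
UPDATE STATISTICS dbo.Test TestPK
    WITH SAMPLE 60 PERCENT, PERSIST_SAMPLE_PERCENT = ON;

/* Confirm what the statistic looks like now. */
EXEC dbo.DemoStatsReview;
```

The trade-off is that the extra update reads the table again and replaces the fullscan statistics from the rebuild with a 60 percent sample, so weigh that cost against the risk of a later update falling back to the default sample rate.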