Tag Archives: Query Store

Query Store for Workload Replays

UPDATE: 3/16/2020
The first script below was updated to set QDS to “Read_Write” instead of “Read_Only”. Martin, thank you for pointing this out in your comment below.

This month’s T-SQL Tuesday is hosted by Tracy Boggiano. Tracy invites us all to write about adopting Query Store. Today, I want to share my favorite, and fairly unusual, way of using the Query Store: workload replays.

You can read more about the invite in detail by clicking on the T-SQL Tuesday logo in this post.

Let me start with my least favorite part of replaying workloads: having an Extended Events session or server-side trace running during the replay only so we can compare the results at the query level when the replay is finished. This might seem like a trivial thing, but when you have workloads over 10k batch requests/sec the capture can grow to terabytes of data quickly. The worst part is waiting to read all of that data and then slicing and dicing it for analysis.

Starting with SQL Server 2016 there is a better and faster way to go! You can replace your extended event or server-side trace with Query Store captured data. Today, I will show you how to use the Query Store for the same purpose.

Different Settings

Keep in mind our goal here is very different from the typical use case for using the Query Store. We want to capture metrics for all the queries executed during a workload replay. Nothing more and nothing less.

If we have the runtime results for multiple replays we can then easily compare the workload performance between the workload replays.

Most of our changes from the regular Query Store best practices are shown below:

  • Max Size (MB) – Make sure there is enough space to capture your whole workload. The size needed will vary with how much workload is being replayed.
  • Query Store Capture Mode set to All – Normally this is not ideal, but remember we want to capture metrics for the whole workload being replayed.
  • Size Based Cleanup Mode set to Off – Yup, we don’t want to lose any captured workload data until we persist it in our ideal form. More on this later.
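
Before starting a replay, it can help to confirm that the three settings above actually took effect. The following is a minimal sketch that reads the Query Store options catalog view; it assumes you run it in the context of the database being replayed against ([YourDatabase] is the placeholder name used throughout this post).

```sql
USE [YourDatabase];

-- One row per database: shows the state and the settings discussed above.
SELECT actual_state_desc,             -- expect READ_WRITE during the replay
       max_storage_size_mb,           -- Max Size (MB)
       current_storage_size_mb,       -- how much of that space is used so far
       query_capture_mode_desc,       -- expect ALL
       size_based_cleanup_mode_desc   -- expect OFF
FROM sys.database_query_store_options;
```

If current_storage_size_mb gets close to max_storage_size_mb during a replay, the Query Store can flip itself to read-only, so this is also worth checking once the replay finishes.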

The Capture Process

Now, this is where you would use Database Experimentation Assistant (DEA), Distributed Replay or some other process to replay a consistent workload in an isolated non-production environment. We will cover that subject in a future post. For now, we will just have two replays called “Baseline” and “Change”. This simulates a baseline replay with no schema changes, followed by a second replay with a change introduced in the schema.

To capture our workload, we enable the Query Store with the settings mentioned above and clear it right before the workload replay starts, which helps ensure we capture only our workload.

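USE [master]
ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON
    (OPERATION_MODE = READ_WRITE, QUERY_CAPTURE_MODE = ALL, SIZE_BASED_CLEANUP_MODE = OFF); /* also set MAX_STORAGE_SIZE_MB to fit your workload */
ALTER DATABASE [YourDatabase] SET QUERY_STORE CLEAR;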

Stop Capturing Query Store Data

Once your replay is finished, we want to stop the Query Store from capturing any more data by switching it to read-only. We want the least amount of non-workload data inside the Query Store when we are using it for the sole purpose of comparing workloads.

USE [master]
ALTER DATABASE [YourDatabase] SET QUERY_STORE (OPERATION_MODE = READ_ONLY);

Prepare Query Store Data for Long-Term Analysis

For smaller workloads, you might be happy using DBCC CLONEDATABASE to get a schema-only copy of the database with the Query Store data persisted. This is perfectly fine. With bigger workloads, I have noticed there are ways to improve performance when analyzing the Query Store data. For example, clustered columnstore indexes can be very helpful for performance and compression. Therefore, I like to have a schema for each replay and import the data. The following is a quick example of setting up a schema for a “baseline” replay and a “change” replay.

use [DBA]
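
The schema setup for the two replays can be sketched as follows. This is a minimal example rather than a definitive script: it assumes a utility database named DBA already exists, and it uses the schema names Baseline and Compare because those are the names the load scripts later in this post write into.

```sql
USE [DBA];
GO
-- CREATE SCHEMA must be the only statement in its batch, hence the GO separators.
CREATE SCHEMA Baseline;   -- holds the Query Store data from the baseline replay
GO
CREATE SCHEMA Compare;    -- holds the Query Store data from the "change" replay
GO
```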

Next, we will import our captured data from our baseline replay that’s in our read-only query store database. I also like to have a baked-in aggregate of metrics for reads, writes, duration and CPU at the query level.

use [YourDatabase]
/* Load Data */
SELECT * INTO DBA.Baseline.query_store_runtime_stats 
FROM sys.query_store_runtime_stats;

SELECT * INTO DBA.Baseline.query_store_runtime_stats_interval 
from sys.query_store_runtime_stats_interval;

select * INTO DBA.Baseline.query_store_plan 
from sys.query_store_plan;

select * INTO DBA.Baseline.query_store_query
from sys.query_store_query;

select * INTO DBA.Baseline.query_store_query_text
from sys.query_store_query_text;
/* Addition for SQL 2017 */
select * INTO DBA.Baseline.query_store_wait_stats
 from sys.query_store_wait_stats

use [DBA]

/* One row per query_hash; query_hash is kept so the replays can be joined later */
SELECT q.query_hash,
SUM(count_executions) AS TotalExecutions,
SUM(count_executions*avg_duration) AS TotalDuration,
SUM(count_executions*avg_logical_io_reads) AS TotalReads,
SUM(count_executions*avg_logical_io_writes) AS TotalWrites,
SUM(count_executions*avg_cpu_time) AS TotalCPU
INTO Baseline.QueryResults
FROM Baseline.query_store_runtime_stats rs
JOIN Baseline.query_store_plan p ON rs.plan_id = p.plan_id
JOIN Baseline.query_store_query q ON p.query_id = q.query_id
GROUP BY q.query_hash;
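
Once the data is loaded, the clustered columnstore index mentioned earlier could be added to the largest imported table. The following is a sketch under a few assumptions: the index name is made up for illustration, the table is the heap created by the SELECT ... INTO above, and I am only indexing the runtime stats table because it holds numeric and date columns. Tables with nvarchar(max) columns, such as the plan and query text tables, need SQL Server 2017 or later for a clustered columnstore index.

```sql
USE [DBA];
GO
-- Hypothetical index name; converts the imported heap into a clustered columnstore.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Baseline_query_store_runtime_stats
    ON Baseline.query_store_runtime_stats;
GO
```

The same statement would be repeated for the Compare schema after the second replay is imported.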

Next, we reset the database to our starting position, apply the Query Store settings mentioned earlier in this post, and replay our workload again. This time, we dump our data into the “change” schema, which is named Compare in the scripts below.

use [YourDatabase]
/* Load Data */
SELECT * INTO DBA.Compare.query_store_runtime_stats 
FROM sys.query_store_runtime_stats;

SELECT * INTO DBA.Compare.query_store_runtime_stats_interval 
from sys.query_store_runtime_stats_interval;

select * INTO DBA.Compare.query_store_plan 
from sys.query_store_plan;

select * INTO DBA.Compare.query_store_query
from sys.query_store_query;

select * INTO DBA.Compare.query_store_query_text
from sys.query_store_query_text;

select * INTO DBA.Compare.query_store_wait_stats
 from sys.query_store_wait_stats

use [DBA]

/* One row per query_hash; query_hash is kept so the replays can be joined later */
SELECT q.query_hash,
SUM(count_executions) AS TotalExecutions,
SUM(count_executions*avg_duration) AS TotalDuration,
SUM(count_executions*avg_logical_io_reads) AS TotalReads,
SUM(count_executions*avg_logical_io_writes) AS TotalWrites,
SUM(count_executions*avg_cpu_time) AS TotalCPU
INTO Compare.QueryResults
FROM Compare.query_store_runtime_stats rs
JOIN Compare.query_store_plan p ON rs.plan_id = p.plan_id
JOIN Compare.query_store_query q ON p.query_id = q.query_id
GROUP BY q.query_hash;

Comparing Workload Results

Now that we have our two workloads imported, we can compare them to see how the workload changed per query. I will break this down into two quick steps. First, get the deltas per query. Second, get totals for how many different entries a query might have in the Query Store. More on this a little later in the post.

/* Query Store Results */
use [DBA]

/* Step 1: deltas per query_hash, saved to #CTE for the final query below */
SELECT b.query_hash,
c.TotalExecutions - b.TotalExecutions AS ExecutionDelta,
c.TotalExecutions AS CompareExecutions,
b.TotalExecutions AS BaselineExecutions,
c.TotalDuration - b.TotalDuration AS DurationDelta,
c.TotalCPU - b.TotalCPU AS CPUDelta,
c.TotalReads - b.TotalReads AS ReadDelta,
c.TotalWrites - b.TotalWrites AS WriteDelta,
c.TotalReads AS CompareReads,
b.TotalReads AS BaselineReads,
c.TotalCPU AS CompareCPU,
b.TotalCPU AS BaselineCPU,
c.TotalDuration AS CompareDuration,
b.TotalDuration AS BaselineDuration
INTO #CTE
FROM Baseline.QueryResults b
JOIN Compare.QueryResults c ON b.query_hash = c.query_hash

/* Step 2: how many Query Store entries share each query_hash, per replay */
select query_hash,
COUNT(query_sql_text) AS QueryCount, 
MAX(query_sql_text) query_sql_text, 
MIN(query_id) MinQueryID
INTO #Compare
from Compare.query_store_query qsq
JOIN Compare.query_store_query_text q 
	ON qsq.query_text_id = q.query_text_id 
where qsq.is_internal_query = 0
GROUP BY query_hash 

select query_hash,
COUNT(query_sql_text) AS QueryCount, 
MAX(query_sql_text) query_sql_text, 
MIN(query_id) MinQueryID
INTO #Baseline
from Baseline.query_store_query qsq
JOIN Baseline.query_store_query_text q 
	ON qsq.query_text_id = q.query_text_id 
where qsq.is_internal_query = 0
GROUP BY query_hash 

select cte.*
, a.QueryCount AS Compare_QueryCount
, b.QueryCount AS Baseline_QueryCount
, a.MinQueryID AS Compare_MinQueryID
, b.MinQueryID AS Baseline_MinQueryID
, a.query_sql_text
FROM #CTE cte 
JOIN #Compare a on cte.query_hash = a.query_hash
JOIN #Baseline b on cte.query_hash = b.query_hash
/* Only keep queries that ran the same number of times in both replays */
WHERE cte.ExecutionDelta = 0
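
To surface the biggest regressions first, the two aggregate tables can also be ranked directly. The following is a sketch rather than part of the original scripts: the TOP (20) cutoff and the percent-change column are my own additions, and it assumes both QueryResults tables carry query_hash, as the join above already requires. Query Store reports duration and CPU time in microseconds, so the totals here are in microseconds too.

```sql
USE [DBA];

SELECT TOP (20)
       b.query_hash,
       b.TotalExecutions,
       b.TotalDuration                   AS BaselineDuration,
       c.TotalDuration                   AS CompareDuration,
       c.TotalDuration - b.TotalDuration AS DurationDelta,
       -- NULLIF avoids a divide-by-zero when the baseline total is 0.
       100.0 * (c.TotalDuration - b.TotalDuration)
             / NULLIF(b.TotalDuration, 0) AS DurationPctChange
FROM Baseline.QueryResults AS b
JOIN Compare.QueryResults  AS c ON b.query_hash = c.query_hash
WHERE c.TotalExecutions = b.TotalExecutions   -- same execution count in both replays
ORDER BY DurationDelta DESC;                  -- largest slowdowns first
```

Flipping the ORDER BY to ascending shows the queries that improved the most after the change.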

Query Store for Workload Replays gives you performance metrics down to the query level.
Being able to compare workload replays down to the query execution level is priceless!

Lessons Learned Along the Way!

Initially, working with the Query Store, I thought query_id was going to be my best friend. I quickly learned that my old friend query_hash is more helpful, for multiple reasons. One, I can easily compare queries between different replays. That’s right, not all workload replays get you the same query_id, even when the exact same workload is being replayed. Two, I can compare queries across different databases as well. Finally, query_hash is very helpful with ad-hoc workloads, as I can aggregate all the different query_ids that share the same query hash.
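
To see how much an ad-hoc workload fans out, you can count the query_ids behind each query_hash. This is a small sketch against the imported baseline data; it assumes the DBA database and Baseline schema used earlier in this post.

```sql
USE [DBA];

-- Each row is one query shape (query_hash) that Query Store tracked
-- under more than one query_id during the baseline replay.
SELECT q.query_hash,
       COUNT(DISTINCT q.query_id) AS QueryIdCount
FROM Baseline.query_store_query AS q
WHERE q.is_internal_query = 0
GROUP BY q.query_hash
HAVING COUNT(DISTINCT q.query_id) > 1
ORDER BY QueryIdCount DESC;
```

A high QueryIdCount usually points at unparameterized ad-hoc statements, which is exactly where grouping by query_id would have split one logical query into many rows.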

SQL Server Automatic Tuning in the Real-World

In SQL Server 2016 we saw Query Store. Query Store was a game changer for helping database administrators identify troublesome queries and make those queries run faster. Microsoft’s marketing team even jumped on to help coin the phrase, “SQL Server It Just Runs Faster.” With SQL Server 2017, this started to get even better with automatic tuning. Don’t worry, database administrators. Automatic Tuning will just enhance your career, not replace it.

SQL Server 2017 Automatic Tuning looks for queries where execution plans change and performance regresses. This feature depends on Query Store being enabled. Note that even if you don’t turn on Automatic Tuning, you still get the benefit of having access to the data. That is right: Automatic Tuning will tell you what it would do if it were enabled. Think of this as free performance tuning training. Go look at the DMVs and try to understand why the optimizer would want to lock in an execution plan. We will actually go through a real-world example:

Automatic Tuning with SQL Server 2017

First, let’s take a quick look at the output of the data. You can find the query and results we will focus on below.

SELECT reason, score,
      script = JSON_VALUE(details, '$.implementationDetails.script'),
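      /* CPU time is reported in microseconds; this follows the estimated gain formula in Microsoft's documentation for this DMV */
      estimated_gain = (regressedPlanExecutionCount+recommendedPlanExecutionCount)
                  * (regressedPlanCpuTimeAverage-recommendedPlanCpuTimeAverage)/1000000,
      error_prone = IIF(regressedPlanErrorCount>recommendedPlanErrorCount, 'YES','NO')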
--INTO DBA.Compare.Tunning_Recommendations
FROM sys.dm_db_tuning_recommendations
  CROSS APPLY OPENJSON (Details, '$.planForceDetails')
    WITH (  [query_id] int '$.queryId',
            [current plan_id] int '$.regressedPlanId',
            [recommended plan_id] int '$.recommendedPlanId',

            regressedPlanErrorCount int,
            recommendedPlanErrorCount int,

            regressedPlanExecutionCount int,
            regressedPlanCpuTimeAverage float,
            recommendedPlanExecutionCount int,
            recommendedPlanCpuTimeAverage float

          ) as planForceDetails;

I will break the results down into two screenshots to make them fit well in this blog post.

[Figure: Free Tuning Recommendations with SQL Server 2017 (1/2)]

[Figure: Free Tuning Recommendations with SQL Server 2017 (2/2)]

Now we know in the query store query_id 2271 has CPU time changing from 7,235ms to 26ms. That’s a big difference. Let’s take that query and look at its findings by using the tracked query report inside SSMS.

[Figure: Find my Changed Query. Did the plans change?]

Here we can see the major difference between the two execution plans. One is averaging over 14 seconds in duration while the other is under a second.

[Figure: Query Store showing the performance difference between the two plans]

Now we can select both plans on the screen above and look at the execution plans side by side inside of SSMS. When doing so, we see the common example of the optimizer determining if it is better to scan an index vs a seek with a key lookup.

[Figure: Using SSMS to compare auto tuning recommended query.]

To complete the example, I want to point out that automatic tuning would lock in the index seek plan (Plan 2392). In SQL Server 2016 you can do this manually inside Query Store. With SQL Server 2017 it can be done automatically for you with Automatic Tuning. If you have ever woken up to slow performance because an execution plan changed and performance went down the drain, this might be a life saver.
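
For reference, here is a sketch of both options in T-SQL. The IDs are the ones from this example (query_id 2271 and plan 2392), so substitute your own, and in practice you would pick one approach rather than run both.

```sql
-- Option 1: manual plan forcing (SQL Server 2016 and later).
EXEC sys.sp_query_store_force_plan @query_id = 2271, @plan_id = 2392;

-- Option 2: let Automatic Tuning force the last known good plan
-- (SQL Server 2017 and later; requires Query Store to be enabled).
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- Confirm what is requested versus what is actually in effect.
SELECT name, desired_state_desc, actual_state_desc, reason_desc
FROM sys.database_automatic_tuning_options;
```

A manually forced plan stays forced until you remove it with sys.sp_query_store_unforce_plan, so it is worth revisiting after index or schema changes.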

If you would like to learn about performance tuning changes in SQL Server 2016 and 2017 sign up for our newsletter or catch me talking about these features at SQL Saturday Denver and SQL Saturday Pittsburgh this month.  If you need any help with tuning or upgrading contact us. We would love to chat with you!


Calculating DTU in Azure SQL Database

A few months ago, I posted a question over on ask.sqlservercentral.com. In a nutshell, it was: how do you measure DTU? How could you pull the data shown in the Azure Portal graphs with T-SQL? No one answered, so this motivated me to get off my butt and answer my own question. In doing so, I wanted to share the results with all my readers as well.

I started thinking like a detective. If I were DTU, how would I want to be found with T-SQL?

It was the Azure Portal in the Query Store

Knowing that both “sys.dm_db_resource_stats” and “sys.resource_stats” hold the data used to calculate DTU, I decided to leverage Query Store on an Azure SQL Database to see if I could quickly find out how DTU is calculated. Behold, I was right.


[Figure: How DTU is calculated in Azure Portal]

The Secret Sauce

The whole query is below. Right now, let’s just focus on the secret sauce: how the DTU percentage gets calculated. In a nutshell, the maximum of CPU percent, Data IO percent and Log Write percent determines your DTU percentage. What does this mean to you? Your max consumer limits you. So, you can be using 1% of your IO but still be slowed down because CPU is your max consumer resource.

ISNULL((SELECT MAX(v) FROM (VALUES (avg_cpu_percent), (avg_data_io_percent), (avg_log_write_percent)) AS value(v)), 0) as dtu_consumption_percent
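
To see the expression in isolation, here is a small worked example. The three sample percentages are made up for illustration; the VALUES constructor turns the three columns into three rows so that MAX can pick the largest.

```sql
-- Made-up sample readings for one interval.
DECLARE @avg_cpu_percent       DECIMAL(5,2) = 42.50;
DECLARE @avg_data_io_percent   DECIMAL(5,2) = 1.00;
DECLARE @avg_log_write_percent DECIMAL(5,2) = 3.25;

SELECT ISNULL((SELECT MAX(v)
               FROM (VALUES (@avg_cpu_percent),
                            (@avg_data_io_percent),
                            (@avg_log_write_percent)) AS value(v)), 0)
       AS dtu_consumption_percent;
-- Returns 42.50: CPU is the max consumer, even though IO sits at 1%.
```

The ISNULL wrapper only matters when all three inputs are NULL, in which case the result is 0.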

DTU Calculating Query

/* The portal supplies these parameters; the values below are example assumptions */
DECLARE @timeGrain INT = 300;  /* bucket size in seconds */
DECLARE @startTime DATETIME = DATEADD(hour, -1, GETUTCDATE());
DECLARE @endTime DATETIME = GETUTCDATE();

SELECT ((CONVERT(BIGINT, DATEDIFF(day, 0, [end_time])) * 24 * 3600 + DATEDIFF(second, DATEADD(day, DATEDIFF(day, 0, [end_time]), 0), [end_time])) / @timeGrain) * @timeGrain as start_time_interval
                , MAX(cpu_percent) as cpu_percent
                , MAX(physical_data_read_percent) as physical_data_read_percent
                , MAX(log_write_percent) as log_write_percent
                , MAX(memory_usage_percent) as memory_usage_percent
                , MAX(xtp_storage_percent) as xtp_storage_percent
                , MAX(dtu_consumption_percent) as dtu_consumption_percent
                , MAX(workers_percent) as workers_percent
                , MAX(sessions_percent) as sessions_percent
                , MAX(dtu_limit) as dtu_limit
                , MAX(dtu_used) as dtu_used
             FROM
                 (SELECT [end_time]
                    , ISNULL(avg_cpu_percent, 0) as cpu_percent
                    , ISNULL(avg_data_io_percent, 0) as physical_data_read_percent
                    , ISNULL(avg_log_write_percent, 0) as log_write_percent
                    , ISNULL(avg_memory_usage_percent, 0) as [memory_usage_percent]
                    , ISNULL(xtp_storage_percent, 0) as xtp_storage_percent
                    , ISNULL((SELECT MAX(v) FROM (VALUES (avg_cpu_percent), (avg_data_io_percent), (avg_log_write_percent)) AS value(v)), 0) as dtu_consumption_percent
                    , ISNULL(max_worker_percent, 0) as workers_percent
                    , ISNULL(max_session_percent, 0) as sessions_percent
                    , ISNULL(dtu_limit, 0) as dtu_limit
                    , ISNULL(dtu_limit, 0) * ISNULL((SELECT MAX(v) FROM (VALUES (avg_cpu_percent), (avg_data_io_percent), (avg_log_write_percent)) AS value(v)), 0) / 100.0 as dtu_used
                 FROM sys.dm_db_resource_stats 
                 WHERE [end_time] >= @startTime AND [end_time] <= @endTime
                 ) t
             GROUP BY ((CONVERT(BIGINT, DATEDIFF(day, 0, [end_time])) * 24 * 3600 + DATEDIFF(second, DATEADD(day, DATEDIFF(day, 0, [end_time]), 0), [end_time])) / @timeGrain) * @timeGrain

A Future Blog Post

Now that we can calculate DTU we could trend this data and automatically make changes like moving up and down from the current performance tier.  I would love to wrap this into an Azure SQL Database Alert but a process in PowerShell might be a great starting point.

John Sterrett is a Microsoft Data Platform MVP and a Group Principal for Procure SQL. If you need any help with your on-premises or cloud SQL Server databases, he would love to chat with you. You can contact him directly at john AT ProcureSQL dot com.

My SQL Saturday Houston Story..

SQL Saturday Houston was Awesome!

This past weekend I had the privilege of speaking at SQL Saturday Houston. I enjoyed catching up with some friends while I also took the time to make some new ones as well.

The Talks

This year I gave two different sessions.  The first one in the morning was Azure Databases for DBA’s.  The second one was Why Did My Plan Change, Intro to Query Store. 

This was the first time I presented my Azure Databases for DBA’s talk at a conference. I got some great feedback and it was a lot of fun. I ended up using videos for the demos to make sure they would fit within the timeline of the presentation. Running the High Availability and Disaster Recovery Virtual Chapter, I am always thinking of backup strategies, and this even includes my own demos 😉

At the end of the day I gave my “Why Did My Plan Change: Intro to Query Store” presentation. I was able to tell some real-world stories about lessons learned while monitoring plan changes and explain how upgrades can greatly improve performance when you find the few queries that run slower after an upgrade and fix them.


Leaving the event, I thought I was on my way to the after party when I realized my 1990 Volvo wouldn’t start. Some people would be upset and frustrated, but looking back a few days later, I was blessed. Let me explain. Instead of being stranded by myself, Jamey Johnston stayed to help. He made sure I would be okay. The bonus in this for me was that I was able to chat with Jamey and get to know him better. He is an amazing guy. I wish we had a lot more people like Jamey in our SQL Server Community!