DB Options: – AUTO UPDATE STATS / ASYNC

https://youtu.be/8ugpVfPzvz4

In this video we are going to talk about auto update statistics. This is a feature in the database engine that automatically triggers a stat update when data has been updated beyond a threshold. In previous versions of SQL server this thresh hold was 500 rows followed by every 20% increase in the count of rows. Naturally this logic is flawed for large tables since it will increase the lag between stats updates. A more aggressive trigger has been implemented since 2016 where the square root for the count of rows multiplied by 1000 is used. This makes sure that the delays are fewer because the stats are updated in regular intervals.

Before we continue let’s look at why the stats need to be updated. The execution plan needs to know how many rows it is likely to fetch for a query in order to size memory , use the correct join hint , order decide to perform scan vs seek and many more. So it’s critical to have up to date counts. Think of it like money in your wallet. It makes sense to know exactly how much you have in it before you decide to go shopping.

Auto Update Stats makes sure that before executing the query SQL Checks if the stats are up to date and if not it decides to update the stats before it runs the query. Think of this like running to ATM after checking the wallet and realizing you don’t have enough cash. A side effect of this is the fact that query gets delayed. Imagine if there was a queue at the ATM. Now your query takes longer to execute simply because of an additional step however you are committed now to withdrawing money at the ATM and can’t change your mind anymore.

Often in SQL Server keeping Auto update stats ON is better simply because you know that once stats are up to date they remain usable for a long duration and therefore the initial cost is balanced out in subsequent runs. In some cases you might want to disable this because the stats update can make your query performance unpredictable. For example if the know the ATM never has a queue you might prefer to rush to the ATM every time you need cash. On the other had when the ATM (stats) often has a queue you might defer you withdrawal until another time.

This brings us to AUTO UPDATE STATISTCIS ASYNCHRONOUSLY in this case you decide to visit the ATM see the queue and your partner decides to stand in the queue while you went ahead to complete what tasks you can in the meantime. This way the current query performance remains predictable even if not ideal. In other words when AUTO UPADTE STATISTICS ASYNCHRONOUSLY is enabled SQL Server decides the run the query with out of date stats for the current execution but simultaneously triggers a Stats Update anyway so that they next execution can benefit from it. This feature made sense of older versions of SQL Server where the delay between rebuilds were significant for large tables. In newer versions of SQL Server we have more frequent runs and therefore it might not be as impactful since we might as well just wait.

Having said that we still need to analyze the impact of running a suboptimal plan until then.

So let’s analyze what values we should set these properties to. The answer is actually weird because almost all DBAs rebuild indexes and update stats using a threshold of 25-30% and in that case the stats should be sufficiently up to date to never need to use Auto Update Stats anyway. If you feel that your index maintenance plan is not as reliable then enabling the AUTO UPDATE is a good backup to half. So in that context leaving it ON is a good idea. Similar to way you pre-size the database files but still don’t disable AUTO growth.

When it comes to ASYNC update of stats the answer is a bit more straightforward. If you have a lot of AdHoc queries then ASYNC doesn’t really help you much because the chances the exact same query is executed might be less and as a result there might not be subsequent execution that might benefit from waiting. Similarly if the query or execution plan is very complicated then it makes sense to invest in a better plan from the get go than count on it performing better at a later point in time.

For example say I have a query that performs 50% slower if the stats aren’t up to date. I will want to know how frequently the data changes in the underlying tables and how frequently the query runs. If the answer to both these questions is “Often” then I will prefer to Set AUTO UPDATE STATS ON and ASYNC ON. Simply because the frequent data change might cause the execution plan to take time as the scans are synchronous.

If the answer to question is “Not Often” then ASYNC doesn’t really improve performance anyway so it’s set OFF.

The main point I want to stress here is you shouldn’t be relying on these options and focus on a good Index rebuild strategy using partitions , Filtered indexes and other options available in SQL Server instead.