How familiar are you with the SQL Server buffer pool and operations like page faults? Do you know what value is “good” when checking page life expectancy? Let’s take a look at some of these areas.
Life is Cool by the Pool
The buffer pool in SQL Server is an area in memory for caching data. Once data is read from disk, it can be kept in the buffer pool and SQL Server can check here for data when it needs to be found in the future. If requested data is not found in the buffer pool, a hard page fault can occur, meaning data needs retrieved from disk. It’s possible that data is found somewhere else still in memory which is called a soft page fault.
Time and resources can be saved if data is retrieved from the buffer pool instead of having to go to disk each and every time. Hard page faults are going to happen but you’ll want them to be as minimized as possible. What is the acceptable amount to have? As with many SQL Server metrics, it depends on the environment. Have a baseline documented for your own environment to compare against.
What about Page Life Expectancy?
You may have heard of the term page life expectancy. This is how long you can expect pages to survive in the buffer pool before needing to be removed so that new data can come in. This is also another metric with no one-size-fits-all value to be happy or concerned with. A threshold of 300 has been suggested for over a decade but a quick Google search will show you that 300 is vastly outdated in relation to today’s more common server environments.
In general, a higher number is better but it’s up to you if you want to know your baseline and your own “good” value. If you are monitoring page life expectancy and see a sudden shift compared to the baseline, you may have (or already had) some issues. One way to check is with the following query:
SELECT * FROM sys.dm_os_performance_counters WHERE [counter_name] = 'Page life expectancy'; GO
The cntr_value will be in noted in seconds.
What to Watch For
While keeping an eye on page life expectancy isn’t a complete waste of time, values may not be the most reliable. As an alternative, put more effort into focusing on memory related waits like PAGEIOLATCH. Use Brent Ozar’s First Responder Kit to run:
sp_BlitzCache @SortOrder = 'reads';
This will show queries with the most total reads and a higher chance of needing to go to disk for data that’s not in the buffer pool. See if these top queries can be optimized for better performance.
Thanks for reading!