Microsoft SQL Server has a much-maligned performance monitoring (perfmon) counter called Page Life Expectancy (PLE). What this counter monitors is how long SQL Server will expect to keep your data in memory.
Now it’s true that looking at this monitor right after the SQL Server has been restarted is kind of pointless. This is because when SQL Server starts up, the PLE should increase one value for every second that passes until the cache is completely full. This what we would expect it to do, as the PLE counter doesn’t know anything from before SQL Server was started. And since it has no historical information to work off of. So given that, it knows that SQL Server was started 10 seconds ago, so the PLE needs to be 10 seconds. One second after that, it’ll be 11 seconds, because again that’s all the information that the SQL Server instance has to work with. So when should we be looking at PLE?
The first time that we want to be monitoring PLE is to get a baseline for what we expect this counter should be reporting. If we don’t know what the baseline data is then we don’t know if a problem happens when the value changes.
When the value changes dramatically
The PLE value doesn’t have a target number that we watch for. What we are looking for with changes to PLE are changes to the baseline (see above). When PLE drastically changes is that drop normal, or is that something that is out of the ordinary? If PLE for the server is normally 5,000 and it drops to 65, is that drop normal or is it something that we need to look at?
One of the things we can do with PLE is look for trends. If the PLE drops every morning at 3am, then we need to see what’s happening at 3am that is causing the SQL Serer to flush the buffer pool each night at 3am.
NUMA Node Imbalance
PLE is tracked at the instance level, but this is just the result of a math formula. The actual raw PLE number is gathered at the NUMA node level. If we look at the NUMA node level PLE values and compare them to each other we can see potential imbalances between the NUMA nodes, and we can dive into those imbalances and see why SQL Server is doing that. Maybe we have a server configuration issue that needs to be addressed? Maybe we need to change the soft-NUMA settings for the server?
PLE Doesn’t Give Answers
As we’ve seen PLE doesn’t give answers. Instead, PLE points you in a direction and is one of the diagnostic tools that can be used to see that there is a problem that needs to be investigated and identified. PLE doesn’t specifically tell you what the problem is. It tells you that there is a problem and that the problem needs to be investigated.
If you have SQL Server performance problems, or if you think you do, but you aren’t sure, we can help with that.