One of the very misunderstood values that you can monitor in Microsoft SQL Server is Page Life Expectancy (PLE). I’ve read online that many people think that the value of Page Life Expectancy is worthless to monitor. And that really isn’t true. While Page Life Expectancy won’t tell you a specific problem by itself, it can be used to point you in the correct direction so that you know where to look.
The first thing to do in order to make use of the Page Life Expectancy value is to measure a baseline and keep that baseline where you can refer to it. So what can we learn from the Page Life Expectancy?
- If PLE is randomly dropping and then recovering this means that someone is running a one-off query that is either hitting a bunch of rows (pages) that aren’t in the buffer pool. If this is happening every once and a while it probably doesn’t mean anything.
- If PLE has dropped and is staying down, then either the load of the server has increased, or indexes have probably been dropped that are needed by the system.
- If PLE is climbing at the rate of 1 per second and PLE is low, then the server was probably just restarted.
- If PLE is climbing at the rate of 1 per second and PLE is high, and the PLE stays high, then the server probably has more RAM than is usable by the SQL Server.
- If PLE is stable and it is near the value of the baseline, then there is probably no memory pressure on the server.
One thing to remember is that when a server has multiple NUMA nodes things get a little more difficult to measure as SQL Server can use the nodes differently at each SQL Server restart. So if the SQL Server has multiple NUMA nodes care is going to be needed when monitoring and reviewing the PLE numbers.