Having too many NUMA nodes can lead to some really strange CPU behavior. What you’ll probably see is that the CPU workload of the machine isn’t evenly balanced across the CPUs. If for example you have 32 cores on the VM, and you have 16 NUMA nodes, with two cores per NUMA node you’ll probably see two cores (might be four, or six it depends on a lot of factors) running very hot compared to the other cores on the server.
At one client I was working with in 2015 they had a server which was configured very similarly to what I’ve described above. They had 32 cores, with 11 NUMA nodes. One NUMA node has 4 cores, the next 9 had three cores each, while the 11th NUMA node had one core in it.
When monitoring the performance on the server what we saw was that three cores were running at 100% (or close to it) while the other cores were running at about 4%.
The reason for this, is that SQL Server is NUMA aware, and it was trying to get the best performance possible out of the configuration that it had access to.
The solution in this case was to reconfigure the virtual machine so that it had only two vNUMA nodes. We did this by changing the number of virtual sockets to 2, and put 16 virtual cores on each virtual socket. Once this was done CPU load on the server was much more balanced across the cores. Now instead of 3 cores running at 100% 16 cores were handing the same workload (that workload was still isolated to a single NUMA node, but now the node had 16 cores in it) but the CPUs were bouncing around the 15-20% range. Suddenly the box was happier, and more importantly the application which the end users was using was behaving much better and queries were responding much faster as they no longer needed to wait for a CPU to become available.
Now you may be asking yourself how a system could end up with a screwy NUMA configuration like this. And the answer is pretty simple. The system grew over time, and more and more cores were added to the server little by little as the years went by. As the system needed more CPU resources more cores were added, but no attention was paid to how those cores were presented to Windows. So we ended up with Windows seeing 32 cores in a very strange configuration.
Because most SQL Servers are running inside VMs these days, we need to be careful how we configure those VMs to ensure that we get the proper configuration of the virtual machine. If we don’t have the correct configuration of the virtual hardware there’s little hope of getting good performance out of the VM.
Check your NUMA node configuration to make sure that it’s correct and if it isn’t correct it. Especially if the configuration is very odd looking. And if you aren’t sure, check with us, we’re happy to help.
The post How is performance impacted by having too many NUMA nodes? appeared first on SQL Server with Mr. Denny.