This page covers tuning and monitoring CPU core usage. This tuning is mainly needed to achieve speeds above 10Gbps.
Determining CPU Limitations
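If you are not getting the performance you expect, use 'mpstat -P ALL 1' to determine if you are CPU limited. A single core showing close to 0% idle while a test is running indicates that the core is the bottleneck.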
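For scripted monitoring, the same per-core check can be approximated by sampling /proc/stat twice and comparing the counters. The following is a minimal Python sketch for Linux, not a replacement for mpstat. The function names and the 95% threshold are illustrative assumptions.

```python
"""Per-core CPU utilization from two /proc/stat snapshots (Linux only)."""
import time


def parse_proc_stat(text):
    """Return {core_id: (busy_ticks, total_ticks)} for each 'cpuN' line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate 'cpu' line and non-CPU lines such as 'intr'.
        if not fields or not fields[0].startswith("cpu"):
            continue
        core_id = fields[0][3:]
        if not core_id.isdigit():
            continue
        # Columns: user nice system idle iowait irq softirq steal.
        # Guest time is already counted in user, so later columns are ignored.
        ticks = [int(v) for v in fields[1:9]]
        total = sum(ticks)
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        cores[int(core_id)] = (total - idle, total)
    return cores


def busy_percent(before, after):
    """Percent of non-idle time per core between two parsed snapshots."""
    result = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before.get(core, (0, 0))
        elapsed = total1 - total0
        result[core] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots 'interval' seconds apart and return busy percentages."""
    with open(path) as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_proc_stat(f.read())
    return busy_percent(first, second)


def report(threshold=95.0):
    """Print one line per core; the threshold is an assumed cutoff, not a standard."""
    for core, pct in sorted(sample().items()):
        flag = "  <-- likely saturated" if pct >= threshold else ""
        print(f"core {core:3d}: {pct:5.1f}% busy{flag}")
```

Calling report() on each host while a test is running prints one line per core and flags any core above the assumed threshold.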
For example, here is a nuttcp UDP test using the suggested command line options, and the result is 5.9 Gbps:
Note that nuttcp reports 99% CPU on the transmit host. mpstat on the receiving host confirms that core 6 is not saturated:
mpstat on the sending host confirms that core 6 is saturated:
For these hosts, running multiple nuttcp clients on different cores will increase total throughput.
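One way to run several clients on different cores is to wrap each one in 'taskset -c', which pins a process to a given core. The Python sketch below only builds and launches the pinned command lines. The helper names are hypothetical, and the test command itself (including whatever options keep the clients from colliding with each other) is left as a placeholder for the reader to supply.

```python
"""Launch one copy of a command per core, each pinned with taskset (Linux)."""
import subprocess


def pinned_commands(base_cmd, cores):
    """Prefix base_cmd with 'taskset -c <core>' once for each core."""
    return [["taskset", "-c", str(core)] + list(base_cmd) for core in cores]


def launch_all(commands):
    """Start every command concurrently, then return exit codes in order."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


# Example (placeholder command; replace with the actual test client):
#   cmds = pinned_commands(["my-test-client", "receiver.example.org"], [2, 4, 6])
#   exit_codes = launch_all(cmds)
```

Summing the throughput reported by each client then gives the aggregate rate for the host.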
Not all cores deliver the same throughput, because memory copy speed varies from core to core, as shown in the diagram below. The diagram shows single-stream throughput on a host with a 40G NIC, as a function of which core the application was using and which core the NIC was using. In this case throughput ranged from 21Gbps to 40Gbps, depending on core selection. Typically you'll want the application and the IRQ on 'nearby' cores, but not on the same core.
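As a starting point for core selection, the sketch below lists candidate application cores for a given IRQ core. It assumes that 'nearby' means cores on the same NUMA node as the NIC, which is an interpretation rather than something stated above, and that the NIC is a PCI device exposing a local_cpulist file in sysfs. The function names are hypothetical, and the candidates still need to be benchmarked individually.

```python
"""List candidate application cores near a NIC's IRQ core (Linux sysfs)."""


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8,10-11' into a sorted list."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cores.update(range(int(low), int(high) + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def nearby_cores(local_cores, irq_core):
    """Cores local to the NIC, excluding the core that handles the IRQ."""
    return [core for core in sorted(local_cores) if core != irq_core]


def nic_local_cores(iface):
    """Read the cores local to the NIC (assumes a PCI device under sysfs)."""
    path = f"/sys/class/net/{iface}/device/local_cpulist"
    with open(path) as f:
        return parse_cpulist(f.read())


# Example (interface name and IRQ core are placeholders):
#   candidates = nearby_cores(nic_local_cores("eth0"), irq_core=6)
```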
For more information: