100G Benchmarking
When benchmarking at 100G or higher, you'll want to generate consistent results. At these speeds you will likely be CPU limited on the receiver, and every core gives slightly different performance (as described in this paper). It is therefore necessary to explicitly map not only the application cores but also the IRQ cores. If you don't, there is a good chance that the IRQs for two of your streams will end up on the same core, greatly reducing performance.
To optimize single-stream performance, you can explicitly map the IRQs and the application to adjacent cores on the appropriate NUMA node. For example:
cat /sys/class/net/$ETH/device/numa_node # get NUMA node to use
numactl --hardware # find adjacent cores on that NUMA node
set_irq_affinity_cpulist.sh 8 $ETH
iperf3 -c hostname -A 9,9 -Z
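The kernel reports core lists in a compact range format (for example 0-5,12-17 in /sys/class/net/$ETH/device/local_cpulist). When scripting the core selection above, it can help to expand that format into individual core IDs. The following is a minimal sketch in POSIX shell; the function name expand_cpulist is hypothetical and not part of any tool mentioned here.

```shell
# Expand a kernel cpulist string such as "0-5,12-17" into individual core IDs.
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
expand_cpulist() (
    out=""
    # Split the list on commas, then walk each entry.
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        start=${part%-*}   # text before the dash (or the whole entry if no dash)
        end=${part#*-}     # text after the dash (or the whole entry if no dash)
        core=$start
        while [ "$core" -le "$end" ]; do
            out="${out:+$out }$core"
            core=$((core + 1))
        done
    done
    printf '%s\n' "$out"
)

expand_cpulist 0-5,12-17
```

Running expand_cpulist "$(cat /sys/class/net/$ETH/device/local_cpulist)" would list the cores local to the NIC, from which an IRQ core and a neighboring application core can be picked.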
Another strategy that works well is to separate IRQ cores from application cores on the correct NUMA node. For example:
set_irq_affinity_cpulist.sh 0-7 $ETH
numactl -C 8-15 iperf3 # (client and server)
Using the following commands, we were able to get 180 Gbps between hosts with 200G NICs:
set_irq_affinity_cpulist.sh 0-7 $ETH # do this on both client and server
numactl -C 8-15 /usr/local/bin/iperf3 -s -D # on the server
numactl -C 8-15 /usr/local/bin/iperf3 -P 8 -c hostname --fq-rate 23G -t 60 # on the client
Note the use of pacing (--fq-rate) in this example. Pacing keeps the sender from overwhelming the receive host, which would otherwise lead to TCP retransmits.
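The pacing value in that example follows from simple arithmetic: 8 streams at 23G each is 184 Gbps in aggregate, or 92% of the 200G line rate. The sketch below does the same calculation for other link speeds and stream counts. The function name per_stream_rate is hypothetical, and the 92% figure is only what this example works out to, not a general recommendation.

```shell
# Compute an integer per-stream pacing rate in Gbps.
# Arguments: link speed in Gbps, number of parallel streams,
# and the percentage of line rate to target in aggregate.
per_stream_rate() (
    link_gbps=$1
    streams=$2
    percent=$3
    # Integer arithmetic; any fractional part is truncated.
    echo $(( link_gbps * percent / 100 / streams ))
)

per_stream_rate 200 8 92   # prints 23, matching --fq-rate 23G above
```

The result can be passed to iperf3 as --fq-rate "$(per_stream_rate 200 8 92)G". If retransmits still appear, lowering the percentage is a reasonable first thing to try.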
For parallel streams you can also assign the IRQs explicitly using ethtool to configure receive network flow classification rules.
For example, if this is your list of cores connected to your NIC (e.g.: eth100):
> cat /sys/class/net/eth100/device/local_cpulist
0-5,12-17
And you want to place your receive IRQs on cores 1-4, and run iperf3 on cores 12-15, do the following:
Map ports 5002-5005 to network queues (cores) 1-4:
ethtool -U eth100 flow-type tcp4 dst-port 5002 action 1
ethtool -U eth100 flow-type tcp4 dst-port 5003 action 2
ethtool -U eth100 flow-type tcp4 dst-port 5004 action 3
ethtool -U eth100 flow-type tcp4 dst-port 5005 action 4
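With more streams, typing one rule per port gets tedious. The sketch below generates the same kind of ethtool rule for a run of consecutive ports and queues. It only prints the commands so they can be reviewed first; the function name print_flow_rules is hypothetical.

```shell
# Print (do not run) one ethtool flow rule per port, steering consecutive
# destination ports to consecutive receive queues.
# Arguments: device, first port, first queue, number of rules.
print_flow_rules() (
    dev=$1
    first_port=$2
    first_queue=$3
    count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'ethtool -U %s flow-type tcp4 dst-port %s action %s\n' \
            "$dev" "$((first_port + i))" "$((first_queue + i))"
        i=$((i + 1))
    done
)

print_flow_rules eth100 5002 1 4
```

Once the printed commands look right, they can be applied by piping the output to sh as root.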
Then use the iperf3 "-A" flag to specify the cores for both the client and server. For example:
Start servers:
iperf3 -s -p 5002 &
iperf3 -s -p 5003 &
iperf3 -s -p 5004 &
iperf3 -s -p 5005 &
Run clients:
iperf3 -T "core 12" -c $host -t60 -A 12,12 -p 5002 -b 25G &
iperf3 -T "core 13" -c $host -t60 -A 13,13 -p 5003 -b 25G &
iperf3 -T "core 14" -c $host -t60 -A 14,14 -p 5004 -b 25G &
iperf3 -T "core 15" -c $host -t60 -A 15,15 -p 5005 -b 25G &
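The client commands follow a regular pattern (port N pairs with core M), so they can also be generated in a loop. The sketch below prints one client command per stream, using the same flags as above. The function name print_iperf3_clients and the host name rcvhost are hypothetical placeholders.

```shell
# Print (do not run) one iperf3 client command per stream, pairing
# consecutive ports with consecutive cores.
# Arguments: server host, first port, first core, stream count, per-stream rate.
print_iperf3_clients() (
    host=$1
    first_port=$2
    first_core=$3
    count=$4
    rate=$5
    i=0
    while [ "$i" -lt "$count" ]; do
        core=$((first_core + i))
        port=$((first_port + i))
        printf 'iperf3 -T "core %s" -c %s -t60 -A %s,%s -p %s -b %s &\n' \
            "$core" "$host" "$core" "$core" "$port" "$rate"
        i=$((i + 1))
    done
)

print_iperf3_clients rcvhost 5002 12 4 25G
```

When running the generated commands from a script, adding a final wait line keeps the script from exiting before the background clients finish.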
This should give you consistent, repeatable results on a clean network.
Other useful ethtool commands for managing flow filter rules:
- To view the current rules: ethtool -u eth100
- To delete a rule: ethtool -U eth100 delete N (where N is the rule ID shown by ethtool -u; repeat for each rule)
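Because rules are deleted one ID at a time, cleaning up after a test can be scripted. The sketch below reads ethtool -u output on standard input and prints one delete command per rule. It assumes each rule appears on a line of the form "Filter: ID", which is how ethtool commonly prints them; check the output on your own system before relying on it. The function name print_rule_deletes is hypothetical.

```shell
# Read `ethtool -u DEV` output on stdin and print (do not run) one
# delete command per rule.
# ASSUMPTION: each rule is introduced by a line such as "Filter: 1023".
print_rule_deletes() (
    dev=$1
    awk -v dev="$dev" '$1 == "Filter:" { print "ethtool -U " dev " delete " $2 }'
)

# Example with canned input in the assumed format:
printf 'Total 1 rules\n\nFilter: 1023\n\tRule Type: TCP over IPv4\n' | print_rule_deletes eth100
```

On a live system this would be used as ethtool -u eth100 | print_rule_deletes eth100, with the printed commands reviewed before they are run.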