When benchmarking at 100G, you will want to be able to generate consistent results. At 100G speeds, you will likely be CPU limited on the receiver, and every core gives slightly different performance (as described in this paper). It is therefore necessary to explicitly map not only the application cores, but also the IRQ cores. If you don't do this, there is a good chance that the IRQs for two of your streams will end up on the same core, greatly reducing performance.
Fortunately there is a way to assign the IRQs explicitly using ethtool to configure receive network flow classification rules.
For example, list the cores that are local to your NIC (e.g., eth100):
> cat /sys/class/net/eth100/device/local_cpulist
If you want to place your receive IRQs on cores 1-4 and run iperf3 on cores 12-15, do the following:
Map destination ports 5002-5005 to receive queues 1-4 (the "action" value is the queue number; this assumes each queue's IRQ is handled by the core with the same number):
> ethtool -U eth100 flow-type tcp4 dst-port 5002 action 1
> ethtool -U eth100 flow-type tcp4 dst-port 5003 action 2
> ethtool -U eth100 flow-type tcp4 dst-port 5004 action 3
> ethtool -U eth100 flow-type tcp4 dst-port 5005 action 4
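The rules above follow a simple pattern (consecutive ports to consecutive queues), so they can be generated in a loop. The following is a minimal sketch, not a definitive script: the function name print_flow_rules and its arguments are hypothetical, and it only prints the ethtool commands so they can be reviewed before being run as root. It also assumes the NIC driver supports these rules; on many NICs ntuple filtering must first be enabled with "ethtool -K eth100 ntuple on".

```shell
#!/bin/sh
# Hypothetical helper: print one ethtool flow rule per port (dry run only).
# Usage: print_flow_rules DEV FIRST_PORT FIRST_QUEUE COUNT
print_flow_rules() {
    pfr_dev=$1
    pfr_port=$2
    pfr_queue=$3
    pfr_count=$4
    pfr_i=0
    while [ "$pfr_i" -lt "$pfr_count" ]; do
        # Steer TCP/IPv4 traffic for this destination port to one RX queue.
        echo "ethtool -U $pfr_dev flow-type tcp4 dst-port $((pfr_port + pfr_i)) action $((pfr_queue + pfr_i))"
        pfr_i=$((pfr_i + 1))
    done
}

# Reproduce the example: ports 5002-5005 -> queues 1-4 on eth100.
print_flow_rules eth100 5002 1 4
```

Once the printed commands look right, they can be pasted into a root shell or piped to "sh" to apply them.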
Then use the iperf3 "-A" flag to specify the cores for both the client and server. In the client's "-A n,m" form, n is the client core and m is the server core. For example, on the server:
> iperf3 -s -p 5002 &
> iperf3 -s -p 5003 &
> iperf3 -s -p 5004 &
> iperf3 -s -p 5005 &
And on the client:
> iperf3 -T "core 12" -c $host -t60 -A 12,12 -p 5002 -b 25G -Z &
> iperf3 -T "core 13" -c $host -t60 -A 13,13 -p 5003 -b 25G -Z &
> iperf3 -T "core 14" -c $host -t60 -A 14,14 -p 5004 -b 25G -Z &
> iperf3 -T "core 15" -c $host -t60 -A 15,15 -p 5005 -b 25G -Z &
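The client commands above differ only in the core and port, so they can also be produced by a loop. This is a sketch under the same assumptions as the example (one stream per core, consecutive cores and ports, the same core number on client and server). The function name print_client_cmds and the host name "receiver.example.org" are hypothetical placeholders, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print one iperf3 client command per stream (dry run only).
# Usage: print_client_cmds HOST FIRST_CORE FIRST_PORT COUNT RATE
print_client_cmds() {
    pcc_host=$1
    pcc_core=$2
    pcc_port=$3
    pcc_count=$4
    pcc_rate=$5
    pcc_i=0
    while [ "$pcc_i" -lt "$pcc_count" ]; do
        pcc_c=$((pcc_core + pcc_i))
        pcc_p=$((pcc_port + pcc_i))
        # -A c,c pins the client and the server process to the same core number.
        echo "iperf3 -T \"core $pcc_c\" -c $pcc_host -t60 -A $pcc_c,$pcc_c -p $pcc_p -b $pcc_rate -Z &"
        pcc_i=$((pcc_i + 1))
    done
}

# Placeholder host name; four 25G streams on cores 12-15, ports 5002-5005.
print_client_cmds receiver.example.org 12 5002 4 25G
```

Printing first makes it easy to check that each port lines up with the flow rule and core intended for it before starting the test.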
This should give you consistent, repeatable results on a clean network.
Other useful ethtool commands for managing flow filter rules:
- To view the current rules: ethtool -u eth100
- To delete these rules: ethtool -U eth100 delete <rule number>
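When choosing the IRQ and iperf3 cores for the example above, it can help to confirm that each one actually appears in the NIC's local_cpulist, which is a comma-separated list of core numbers and ranges. The following is a minimal sketch of such a check. The function name core_in_cpulist is hypothetical, and the cpulist value shown is an invented sample, not output from a real host.

```shell
#!/bin/sh
# Hypothetical helper: succeed if CORE appears in a cpulist string
# such as "0-13,28-41" (the format used by local_cpulist).
# Usage: core_in_cpulist CORE CPULIST
core_in_cpulist() {
    cil_core=$1
    cil_list=$2
    # Split the list on commas; each item is either "N" or "LO-HI".
    cil_old_ifs=$IFS
    IFS=,
    set -- $cil_list
    IFS=$cil_old_ifs
    for cil_item in "$@"; do
        cil_lo=${cil_item%-*}   # text before the dash (or the whole item)
        cil_hi=${cil_item#*-}   # text after the dash (or the whole item)
        if [ "$cil_core" -ge "$cil_lo" ] && [ "$cil_core" -le "$cil_hi" ]; then
            return 0
        fi
    done
    return 1
}

# Invented sample value; on a real host read it with:
#   cpulist=$(cat /sys/class/net/eth100/device/local_cpulist)
cpulist="0-13,28-41"
for core in 1 2 3 4 12 13 14 15; do
    core_in_cpulist "$core" "$cpulist" || echo "core $core is not local to the NIC"
done
```

Any core reported by the loop is a candidate to swap for one that is local to the NIC.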