Recent network performance enhancements in the Linux kernel
The 6.x Linux kernels introduce a number of new options for improving network performance. They are covered in detail in our 2024 INDIS workshop paper, "Recent Linux Improvements that Impact TCP Throughput: Insights from R&E Networks".
A short summary of these improvements is below.
Kernel upgrades
Using the newest Linux kernel (6.8 as of September 2024), we see up to 38% better performance on the WAN, and a 30% improvement on a LAN, compared with the 5.15 kernel.
See this page for instructions on installing the latest supported kernels.
Receiver HW GRO
New receiver side optimizations are available for Nvidia ConnectX-7 network cards with firmware >= 28.42.1000 on Linux 6.11, which include receiver side hardware accelerated GRO and header-data split. Other new NICs from Intel and Broadcom might support this as well.
Preliminary results from the developer suggest up to 60% throughput improvement for single stream tests.
Our initial results show a 33% improvement on AMD hosts (40 Gbps vs 53 Gbps), and a 5% improvement (62 Gbps vs 65 Gbps) on Intel hosts after enabling hardware GRO on the receiver for single stream tests with a 9K MTU.
For tests with a 1500B MTU on Intel hosts we saw an impressive 160% improvement in throughput (24 Gbps vs 62 Gbps).
To enable HW GRO, do the following:
# Note: can't use ring buffers > 4K with HW GRO. 2K buffers seem to work well.
/usr/sbin/ethtool -G eth100 tx 2048 rx 2048
/usr/sbin/ethtool -K eth100 rx-gro-hw on
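To confirm that the settings above took effect, you can read them back with ethtool. The helper below is a minimal sketch, not part of our test procedure: feature_state is a hypothetical function name, and it assumes the usual "name: on/off" layout of `ethtool -k` output.

```shell
# Print the state ("on" or "off") of one offload feature, reading
# `ethtool -k <dev>` output on stdin. Prints "unknown" if the feature
# is not listed at all.
feature_state() {
    awk -v feat="$1:" '
        $1 == feat { print $2; found = 1 }
        END        { if (!found) print "unknown" }'
}

# Typical use on the receiver (eth100 as in the commands above):
#   /usr/sbin/ethtool -k eth100 | feature_state rx-gro-hw
#   /usr/sbin/ethtool -g eth100    # shows the current ring buffer sizes
```

If the NIC, firmware, or kernel does not support hardware GRO, ethtool typically reports the feature as "off [fixed]", which this helper prints as "off".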
Also note that when doing receiver HW GRO, using an MSS of 8K was around 5% faster than using the iperf3 default of MTU-40.
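The MSS comparison can be reproduced with iperf3's -M (--set-mss) option. The sketch below is only illustrative: it assumes "8K" means 8192 bytes, default_mss is a hypothetical helper, and receiver.example.org is a placeholder hostname rather than one of our test hosts.

```shell
# Default MSS that iperf3 uses for a given MTU: the MTU minus 40 bytes.
default_mss() {
    echo $(( $1 - 40 ))
}

# For a 9K MTU, the default MSS is given by:
#   default_mss 9000
#
# To run a single stream test with an 8K MSS instead (assumed here to
# mean 8192 bytes; the hostname is a placeholder):
#   iperf3 -c receiver.example.org -M 8192
```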
More details coming soon.
BIG TCP
(more details coming soon. See the INDIS paper for results)
To enable BIG TCP, run the following commands for IPv4 and IPv6, respectively:
/usr/sbin/ip link set dev ethN gso_ipv4_max_size 150000 gro_ipv4_max_size 150000
/usr/sbin/ip link set dev ethN gro_max_size 185000 gso_max_size 185000
If you are using VLANs, you'll need to run these commands for each VLAN interface as well.
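Applying the same limits to a parent interface and its VLANs can be scripted with a small loop. This is a sketch under stated assumptions: set_big_tcp and the DRY_RUN variable are hypothetical names, and eth100.100 and eth100.200 are placeholder VLAN device names.

```shell
# Apply the BIG TCP limits shown above to every device named on the
# command line. Set DRY_RUN=1 to print the commands instead of running them.
set_big_tcp() {
    for dev in "$@"; do
        for args in \
            "gso_ipv4_max_size 150000 gro_ipv4_max_size 150000" \
            "gro_max_size 185000 gso_max_size 185000"; do
            # $args is intentionally unquoted so it splits into words.
            ${DRY_RUN:+echo} /usr/sbin/ip link set dev "$dev" $args
        done
    done
}

# Typical use (placeholder VLAN device names):
#   DRY_RUN=1 set_big_tcp eth100 eth100.100 eth100.200   # preview only
#   set_big_tcp eth100 eth100.100 eth100.200             # apply, as root
```

Afterwards, `ip -d link show dev ethN` should list the new gso and gro maximum sizes for each device.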
More information at: https://lwn.net/Articles/884104/ and https://netdevconf.info/0x15/slides/35/BIG%20TCP.pdf
Some recommend setting this when doing BIG TCP:
/usr/sbin/ethtool --set-priv-flags eth100 rx_striding_rq off
Overall we have not seen big throughput improvements with BIG TCP, but CPU load is somewhat lower. A problem with BIG TCP is that it requires a custom-built kernel with a larger value for MAX_SKB_FRAGS to see the full advantage.
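One way to see what the running kernel was built with is to look for CONFIG_MAX_SKB_FRAGS in its build configuration. This is a rough sketch: max_skb_frags is a hypothetical helper, it assumes a kernel recent enough to expose that build option, and it assumes your distribution installs the kernel config under /boot, which is not always the case.

```shell
# Print the CONFIG_MAX_SKB_FRAGS value from kernel config text on stdin,
# or "unset" if the option does not appear.
max_skb_frags() {
    awk -F= '
        $1 == "CONFIG_MAX_SKB_FRAGS" { print $2; found = 1 }
        END { if (!found) print "unset" }'
}

# Typical use (the config path is an assumption and varies by distribution):
#   max_skb_frags < "/boot/config-$(uname -r)"
```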