40G/100G Network Tuning

March 26, 2024

For hosts with 40G/100G Ethernet NICs, there are several settings beyond the changes to sysctl.conf that you'll want to tune to maximize throughput.

The most important things to configure are:

  • Increase the NIC ring buffer sizes:
    /usr/sbin/ethtool -G ethN rx 8192 tx 8192
  • Enable adaptive interrupt coalescing:
    /usr/sbin/ethtool -C ethN adaptive-rx on adaptive-tx on
  • Disable Simultaneous Multithreading (SMT), also known as Hyper-Threading, in the BIOS. We've seen SMT lead to very inconsistent results, especially on AMD-based hosts.
    • To confirm SMT is off, this command should print 0:
      cat /sys/devices/system/cpu/smt/active
    • To temporarily turn SMT off for testing, run this as root (use "on" to turn it back on):
      echo off > /sys/devices/system/cpu/smt/control
  • Make sure that 'fair queuing' (fq) is enabled, and set a good pacing rate for your environment. Most newer versions of Linux set net.core.default_qdisc to fq_codel, which seems to work fine. Some older versions default to pfifo_fast and do not support fq_codel; on those systems, change the default to fq.
  • Enable IOMMU if your hardware supports it. This is a very important setting, and can improve performance by up to 40%.
  • Make sure that flow control (pause frames) is turned on, as not all NIC drivers enable it by default (e.g., the Intel ice driver):
    /usr/sbin/ethtool -A ethN rx on tx on
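
The SMT and queuing-discipline items in the list above can be checked with a short script. The following is only a minimal sketch: the helper names smt_state and is_fair_queuing are made up for illustration, and it assumes the kernel exposes /sys/devices/system/cpu/smt/active and /proc/sys/net/core/default_qdisc. It only reads settings and changes nothing.

```shell
#!/bin/sh
# Sketch: report SMT state and the default qdisc. Read-only, no root needed.
# smt_state and is_fair_queuing are hypothetical helper names.

# Translate the contents of .../smt/active into a readable state.
smt_state() {
    case "$1" in
        0) echo "off" ;;
        1) echo "on" ;;
        *) echo "unknown" ;;
    esac
}

# Succeed if the given qdisc is one of the fair-queuing disciplines
# mentioned in the list above (fq or fq_codel).
is_fair_queuing() {
    case "$1" in
        fq|fq_codel) return 0 ;;
        *) return 1 ;;
    esac
}

SMT_FILE=/sys/devices/system/cpu/smt/active
if [ -r "$SMT_FILE" ]; then
    echo "SMT: $(smt_state "$(cat "$SMT_FILE")")"
else
    echo "SMT: unknown ($SMT_FILE not present)"
fi

QDISC_FILE=/proc/sys/net/core/default_qdisc
if [ -r "$QDISC_FILE" ]; then
    qdisc=$(cat "$QDISC_FILE")
    if is_fair_queuing "$qdisc"; then
        echo "default qdisc: $qdisc (ok)"
    else
        echo "default qdisc: $qdisc (consider changing to fq)"
    fi
else
    echo "default qdisc: unknown ($QDISC_FILE not present)"
fi
```

The NIC settings can be inspected the same way with ethtool's lowercase query options: ethtool -g ethN shows ring sizes, ethtool -c ethN shows coalescing, and ethtool -a ethN shows pause-frame settings.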

No other tuning should be needed for modern Linux systems (those with a 5.x kernel).

For more details on 100G tuning on older systems, see this presentation from September 2016.

CPU clock rate matters a lot for 40G/100G flows. If you care about the throughput of single flows, a higher CPU clock rate is important. In general, you need a CPU clock rate of at least 3 GHz to achieve 30 Gbps per flow.
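
To see how a host compares with the 3 GHz figure, one option is to read the clock rates the kernel reports. This is a rough sketch with hypothetical helper names (max_cpu_mhz, meets_clock_target). It assumes an x86 Linux host whose /proc/cpuinfo has "cpu MHz" lines. These are instantaneous values, so an idle host with frequency scaling may report less than it can reach under load.

```shell
#!/bin/sh
# Sketch: compare the highest reported core clock with the 3 GHz guideline.
# max_cpu_mhz and meets_clock_target are hypothetical helper names.

# Print the highest "cpu MHz" value in a cpuinfo-style file
# (defaults to /proc/cpuinfo). Prints nothing if no such line exists.
max_cpu_mhz() {
    file=${1:-/proc/cpuinfo}
    [ -r "$file" ] || return 1
    awk -F: '/^cpu MHz/ { if ($2 + 0 > max) max = $2 + 0 }
             END { if (max) print max }' "$file"
}

# Succeed if the given clock rate in MHz is at least 3000.
meets_clock_target() {
    awk -v mhz="$1" 'BEGIN { exit !(mhz + 0 >= 3000) }'
}

mhz=$(max_cpu_mhz) || mhz=""
if [ -z "$mhz" ]; then
    echo "CPU clock: not reported on this host"
elif meets_clock_target "$mhz"; then
    echo "CPU clock: ${mhz} MHz (at or above 3 GHz)"
else
    echo "CPU clock: ${mhz} MHz (below 3 GHz; single-flow throughput may be limited)"
fi
```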

On the ESnet 100G perfSONAR nodes we typically see around 30 Gbps with a single stream, and can easily get over 95 Gbps using 8 streams with both iperf2 and the new threaded version of iperf3. Pacing is helpful so that the streams do not step on each other.
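
As an illustration of a paced multi-stream test, the sketch below builds an iperf3 command line that splits a target aggregate rate evenly across the streams. The host name receiver.example.org, the 96 Gbps aggregate target, and the helper names are assumptions for the example, not values from the measurements above. It relies on iperf3's -P (parallel streams) and --fq-rate (socket-level pacing, Linux only) options, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build a paced, multi-stream iperf3 command line.
# per_stream_rate and iperf3_cmd are hypothetical helper names.

# per_stream_rate AGGREGATE_GBPS STREAMS
# Integer division: the pacing rate in Gbps to give each stream.
per_stream_rate() {
    echo $(( $1 / $2 ))
}

# iperf3_cmd HOST STREAMS AGGREGATE_GBPS
# Print (do not run) an iperf3 client command with per-stream pacing.
iperf3_cmd() {
    host=$1
    streams=$2
    aggregate=$3
    rate=$(per_stream_rate "$aggregate" "$streams")
    echo "iperf3 -c $host -P $streams --fq-rate ${rate}G"
}

# Example values are assumptions: 8 streams sharing 96 Gbps.
iperf3_cmd receiver.example.org 8 96
```

Because --fq-rate applies to each socket, the value passed is the per-stream rate, not the aggregate. A good rate depends on the path and the receiver, so treat the even split as a starting point.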

For information on DTN file system tuning, see DTN Tuning.

More information on NIC vendor-specific tuning recommendations: