Interface Speed Mismatch Issues
For some network paths, sending from a fast host to a slow host can cause problems, for example a 10G host sending to a 1G host, or a 100G host sending to a 10G host. Here is an example showing tests between a 10G host and a 1G host:
A network interface card is designed to operate in two modes:
- If data is present, send at line rate
- If data is not present, send nothing
This binary operation means that a smooth "average" speed is never possible, so devices in the path must be ready to accept a line-rate burst of data or risk dropping some of it for lack of buffer space. The speed mismatch problem arises when a device in the path lacks enough buffering to absorb the influx of data and drops packets, as shown by the blue line above. In the reverse direction, the "slower" host sending to the "faster" host does not see this issue, and performance is fine. Using the tools tcpdump, tcptrace, and XPlot, we can see the reasons for this performance difference by observing the behavior of TCP:
The bursts of red indicate TCP retransmissions, and the purple marks TCP SACK blocks. Each round of loss and retransmission stalls the transfer, which leads to the often unpredictable graphs that show a low average throughput.
This behavior has been observed in several environments:
- 10G to 1G host
- 10G host to a 2G virtual circuit
- 100G to 10G host
- Fast host to slower host (based on CPU speed)
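The buffering requirement behind these cases can be estimated with simple arithmetic. The sketch below is a simplified model, not a measurement: it assumes a single burst arriving at the sender's line rate and draining at the slower egress rate, with no other traffic in the queue. The function name and the 1 MB burst size are illustrative assumptions.

```python
def peak_queue_bytes(burst_bytes: int, in_rate_bps: float, out_rate_bps: float) -> float:
    """Estimate the peak backlog (bytes) at a device that receives a burst
    at in_rate_bps and forwards it at out_rate_bps.

    Simplified model (assumption): one burst, empty queue at the start,
    no competing traffic.
    """
    if out_rate_bps >= in_rate_bps:
        # The egress keeps up with the ingress, so no backlog builds.
        return 0.0
    # While the burst is arriving, the queue grows at (in - out), so the
    # backlog when the last byte arrives is this fraction of the burst.
    return burst_bytes * (in_rate_bps - out_rate_bps) / in_rate_bps


if __name__ == "__main__":
    burst = 1_000_000  # hypothetical 1 MB burst from the sender
    for label, fast, slow in [("10G -> 1G", 10e9, 1e9), ("100G -> 10G", 100e9, 10e9)]:
        backlog = peak_queue_bytes(burst, fast, slow)
        print(f"{label}: peak backlog ~{backlog:,.0f} bytes")
```

Under this model, a 10:1 speed ratio leaves about 90% of each burst sitting in a buffer somewhere along the path. If the device has less buffer than that available, the excess is dropped.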
Note that not all 10G to 1G mismatches will produce this behavior. If there is sufficient buffering in the network (e.g. enough to handle the 10Gbps bursts that the faster network card may produce), the problem will not exist. Consider this example for a 10G to 1G test on a network path with sufficient buffering:
The tcpdump for this path shows a typical TCP linear growth pattern, indicating that packet loss caused no stalls during the transfer:
Many testing tools can "pace" their traffic (e.g. nuttcp and iperf3 allow this behavior). While this works well for UDP, it is hard to pace TCP accurately from the application because of the way the kernel buffers data. The Linux tc command does a much better job of pacing TCP flows, and is recommended to help with speed mismatch issues.
For information on using tc for pacing, see: https://fasterdata.es.net/performance-testing/packet-pacing/.
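To illustrate what pacing means at the packet level, the sketch below computes the spacing a paced sender would need between packets. It is only the arithmetic behind pacing, not how tc implements it; the function name and the 1500-byte packet size are assumptions made for the example.

```python
def wire_time_seconds(packet_bytes: int, rate_bps: float) -> float:
    """Time to transmit one packet of packet_bytes at rate_bps."""
    return packet_bytes * 8 / rate_bps


if __name__ == "__main__":
    packet_bytes = 1500   # assumed packet size (typical Ethernet MTU)
    line_rate = 10e9      # sender NIC speed: 10 Gbps
    target_rate = 1e9     # pace to the receiver's speed: 1 Gbps

    # Each packet still leaves the NIC at line rate...
    on_wire = wire_time_seconds(packet_bytes, line_rate)
    # ...but packets must start this far apart to average the target rate.
    spacing = wire_time_seconds(packet_bytes, target_rate)
    idle = spacing - on_wire

    print(f"time on the wire per packet: {on_wire * 1e6:.1f} us")
    print(f"required start-to-start gap: {spacing * 1e6:.1f} us")
    print(f"idle time between packets:   {idle * 1e6:.1f} us")
```

With these numbers, the interface transmits for roughly 1.2 microseconds and then stays idle for roughly 10.8 microseconds per packet. Spreading packets out this way avoids the back-to-back line-rate bursts that overflow small buffers.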
A final graph shows tcpdump results when using tc to pace the sender to the receiving host's speed, which delivers much better performance:
Newer versions of the Linux kernel are better at handling speed mismatch issues. The following graph shows results for sending from a 40G host to a 10G host across a ~50 ms path, using both Linux kernel 3.10 (CentOS 7) and Linux kernel 4.2: