Cisco Configuration Hints
There are some common configuration tasks for enabling high-performance data transfers through Cisco routers, in particular the Catalyst 6500/7600 series. Unless otherwise stated, the information on this page that is specific to a particular router platform refers to Cisco Catalyst 6500/7600 routers based on the Supervisor 720, a model of supervisor and forwarding plane for those platforms commonly found in university, national laboratory, and science facility networks. Some users have reported that these settings will not work on the Nexus 7K/9K platform that uses NX-OS; more information on this platform can be found at http://people.ucsc.edu/~warner/Bufs/nexus-7K and http://people.ucsc.edu/~warner/Bufs/nexus-9300.
The Cisco Catalyst 6500/7600 switch-routers have default settings that result in underutilization of the packet memory on their interfaces. Specifically, IOS defaults to a 40-packet output queue. This means that it is very easy to make the router drop packets when it is carrying wide area TCP flows, such as flows that result from high-performance bulk transfers of large science data sets. For example, when high-speed packet bursts arrive at the router from a 10G interface and exit the router via a 1G interface, it is possible for packets to enter the router from the 10G interface faster than the 1G interface can send them on. This means that the 1G interface needs to buffer packets or drop them. A similar situation occurs when traffic enters the router from two different interfaces and the two traffic streams are forwarded out a common egress interface – momentary oversubscription of the output interface can result, and if the output queue is too shallow then packets are dropped. A 40-packet output queue is typically inadequate under these circumstances.
We have found that adding the following configuration to the 10G interfaces on the router helps a lot:
hold-queue 4096 out
1G interfaces can benefit from “hold-queue 1024 out” as well. This can be configured on input as well, e.g. “hold-queue 1024 in” in the same place in the configuration. Looking at the output of “show interface” can tell you the size of the interface queues. Check before you make changes, since some interfaces default to a 2000-packet input queue.
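As a rough sketch of where these commands sit in the configuration, the fragment below applies the queue depths suggested above to one 10G and one 1G interface. The interface names are hypothetical placeholders, and the input-queue line is optional.

```
! Hypothetical interface names; substitute your own.
interface TenGigabitEthernet1/1
 hold-queue 4096 out
!
interface GigabitEthernet2/1
 hold-queue 1024 in
 hold-queue 1024 out
```

After the change, “show interfaces TenGigabitEthernet1/1 | include queue” should report the new maximum on its output queue line, which makes for a quick before-and-after comparison.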
The Catalyst 6500 has very little in the way of queuing resources on its line cards, but increasing the output queue depth will merely use the hardware that exists on the card - the default config does not even use the hardware resources that are available. There is no danger of resource exhaustion due to applying hold-queue commands to interfaces. We have been through this with many sites, both general lab sites and LHC-specific infrastructures, and we have yet to hear of adverse effects from increasing the queue depth on Catalyst 6500s. ESnet ran all our WAN 6500s and 7600s (back when we had them) with deep output queues, and the SCinet network (the supercomputing conference network) has run their Catalyst 6500s this way for years.
On a Cisco Catalyst 6500, when the number of routing entries exceeds the available CEF memory space, an exception is generated, which will force some traffic to be process switched. If the routing table is subsequently reduced to free up memory space, the CEF exception is not cleared and process switching will continue. A reboot is required to clear the exception and restore normal hardware forwarding.
To check CEF exception status:
show mls cef exception status
To check available CEF memory allocation:
show mls cef maximum-routes
There are configuration commands to change the allocation between IPv4 and IPv6 routes.
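For illustration, on the Supervisor 720 the allocation is adjusted in global configuration mode with the “mls cef maximum-routes” command. The fragment below is only a sketch under stated assumptions: the values (expressed in units of 1K entries, to our understanding) are arbitrary and must fit the FIB TCAM of the installed PFC, and on the releases we are aware of the new allocation takes effect only after a reload. Check the command reference for your software version before relying on it.

```
! Illustrative values only, not a recommendation.
! Assumed units: thousands of entries; a reload is assumed to be required.
mls cef maximum-routes ip 200
mls cef maximum-routes ipv6 24
```

Running “show mls cef maximum-routes” before and after the reload is a simple way to confirm which allocation is actually in effect.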
The Nexus 7K/9K platform uses NX-OS, and more information on this platform can be found at http://people.ucsc.edu/~warner/Bufs/nexus-7K. The performance of this platform varies depending on the family of line card that is used. Buffering on the board is 1.5MB per VoQ/port, configurable to 12MB shared (across a number of ports on an ASIC), with 144MB total for all interfaces. In general, performance hinges on the model used:
- The F3 blades are designed for low latency/high throughput, but the buffering is not as good for outward facing, longer RTT connections.
- M2 blades have better buffer capabilities, as long as attention is paid to not oversubscribing port groups.
These devices are generally intended for use as data center switches. For impedance matching (between unlike speeds or for long RTTs), they are a poorer fit.
The Cisco Aggregation Services Routers (ASR) function as Edge and Carrier Ethernet devices. More information on the buffering design of these, in particular the ASR9xxx series, can be found here:
Some users have reported issues on 100G hardware that limit the per-flow data rate to between 12Gbps and 13Gbps due to channelization on the backplane. It is also possible that backpressure messages are not delivered across the ASIC between the fabric and the network processor (see: https://quickview.cloudapps.cisco.com/quickview/bug/CSCuv70838), a problem that is then exacerbated by hash collisions between flows.
If you suspect that packets from a flow are being dropped or rate-limited, it's best to issue 'show drops' commands from the CLI. Note that there are many variants of this command, depending on the portion of the router you want to examine. Checking the fabric and NPU counters is also a good idea. There are several references available on this platform:
- https://www.scribd.com/doc/165357532/ASR-9000-Hardware-Architecture-QOS-EVC-IOS-XR-Configuration-and-Troubleshooting-BRKSPG-2904 (pages 54-57)
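As a sketch of the drop checks described above for the ASR 9000, the IOS XR commands below look at overall drops, the network processor counters, and the fabric interface drops for one line card. The location 0/0/CPU0 is a hypothetical placeholder, and the exact syntax and availability of these commands may vary with the IOS XR release and line card type, so treat this as a starting point rather than a definitive procedure.

```
! Hypothetical line card location; substitute your own.
show drops all location 0/0/CPU0
show controllers np counters all location 0/0/CPU0
show controllers fabric fia drops ingress location 0/0/CPU0
show controllers fabric fia drops egress location 0/0/CPU0
```

Taking two snapshots a few seconds apart while the affected flow is running makes it easier to tell which counters are actually incrementing.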