fasterdata.es.net ESnet Network Performance Knowledge Base

Interrupt Binding

On a system with multiple 10G NICs, a 20-30% performance increase can be obtained by ensuring that the NIC driver interrupts are handled by a core 'nearby' the one running the reading process/thread. On a host with 40G NICs on a "Sandy Bridge" motherboard, performance on the best core can be almost double that on the worst core. This is even more important for UDP-based applications, where performance improvements of up to 3x have been observed.

On Linux, you can use the sched_setaffinity() system call or the numactl command line tool to bind a process to a core. To specify which core handles the NIC interrupts, you need to disable irqbalance, and then bind the interrupts to a specific core.
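Before binding anything, it helps to know which cores are 'nearby' a given NIC. The following is a minimal sketch, assuming the NIC is a PCI device whose sysfs entry exposes numa_node and local_cpulist. The function name nic_local_cores and the SYSFS_ROOT override are hypothetical, introduced only for illustration.

```shell
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the NUMA node and the cores the kernel reports as local to a NIC.
# Assumes the interface is backed by a PCI device that exposes these files.
nic_local_cores() {
    dev="$SYSFS_ROOT/class/net/$1/device"
    echo "node=$(cat "$dev/numa_node") cores=$(cat "$dev/local_cpulist")"
}
```

For example, nic_local_cores eth2 might report a range of cores. A process could then be started on one of them with numactl --physcpubind=CORE or taskset -c CORE, followed by the program to run. On systems without NUMA information, numa_node may read -1.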

Some vendors provide scripts to do this IRQ binding at boot time.

  • Mellanox: /usr/sbin/set_irq_affinity_bynode.sh 1 ethN
  • Chelsio: /sbin/t4_perftune.sh

Binding interrupts by hand

If your NIC does not come with a script for this, you can bind the interrupts by hand as follows.

First, identify the IRQs for the receive queues of each interface:

  grep eth2 /proc/interrupts
   76:         23         50        245         66         20        125         10          0          0          0          5          0       PCI-MSI-X  eth2-TxRx-0
   84:         90        135         45        123         70         50          5          0          0          0          5          0       PCI-MSI-X  eth2-TxRx-1
   92:        165         65         55         65        128         35          0          0          0          5          0          5       PCI-MSI-X  eth2-TxRx-2
  100:         85        123         40         45         70        150          0          5          0          5          0          0       PCI-MSI-X  eth2-TxRx-3
  108:        105         40         20        153        110         80          0          0          0          0         10          5       PCI-MSI-X  eth2-TxRx-4
  116:        170        125         55         35         70         53          0          0         10          5          0          0       PCI-MSI-X  eth2-TxRx-5
  124:         85        115         43         45         70        150          0         15          0          0          0          0       PCI-MSI-X  eth2-TxRx-6
  132:        100         35         90        140         63         80         10          0          0          0          0          5       PCI-MSI-X  eth2-TxRx-7
  140:        165        108         50         95         55         35          0          0          0          5          0         10       PCI-MSI-X  eth2-TxRx-8
  172:          2          0          0          0          0          0          0          0          0          0          0          0       PCI-MSI-X  eth2:lsc

The IRQ is the first column.
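To pull just the IRQ numbers out of that listing, a small filter can help. This is a sketch that assumes the queue names follow the ethN-TxRx-M pattern shown above (other drivers may name their queues differently); list_queue_irqs is a hypothetical helper name.

```shell
# Hypothetical helper: print the IRQ numbers of an interface's TxRx queues.
# Reads /proc/interrupts-style text on stdin.
list_queue_irqs() {
    awk -v pat="$1-TxRx-" '
        # The queue name is the last field; the IRQ is the first, with a trailing colon.
        index($NF, pat) == 1 { sub(/:$/, "", $1); print $1 }
    '
}
```

Usage would be list_queue_irqs eth2 < /proc/interrupts, which prints one IRQ number per line and skips non-queue entries such as eth2:lsc.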

Then bind those IRQs to a given processor:

  echo "proc_number" > /proc/irq/irq_number/smp_affinity

where proc_number is a hexadecimal bit mask with the bit for the chosen core set (e.g., core 2 = 04, core 3 = 08, core 4 = 10, etc.) and irq_number is the IRQ (76, 84, ...).

  echo 04 > /proc/irq/76/smp_affinity

This will bind eth2-TxRx-0 (IRQ 76) to core 2.
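The value written to smp_affinity is read as hexadecimal, with bit N set for core N. As a minimal sketch, the hypothetical helper core_to_mask below computes the mask with shell arithmetic. It assumes core numbers below 32, since larger masks are written as comma-separated 32-bit groups.

```shell
# Hypothetical helper: print the hexadecimal smp_affinity mask for one core.
# Assumes core numbers below 32.
core_to_mask() {
    printf '%x\n' "$((1 << $1))"
}

core_to_mask 2   # prints 4
core_to_mask 4   # prints 10 (hexadecimal, i.e. decimal 16)
```

The leading zero in values such as 04 is optional, so the output can be written to smp_affinity as-is.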

A sample boot script can be used to assign NIC interrupts to cores at boot time, e.g.:

# Bind eth2/eth3 Myricom IRQs to cores 2 and 3
/usr/local/bin/myri-irq-bind.sh eth2 4
/usr/local/bin/myri-irq-bind.sh eth3 8
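For NICs without a vendor script, the manual steps can be combined into a loop. The sketch below is hypothetical and is not the myri-irq-bind.sh script referenced above. It assumes queue names of the form ethN-..., takes the mask as a hexadecimal string, and uses an illustrative PROC_ROOT override so it can be tried against a copy of /proc first.

```shell
# PROC_ROOT is a hypothetical override for dry runs; it defaults to the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Hypothetical helper: write one hexadecimal mask to every queue IRQ of an interface.
bind_iface_irqs() {
    iface="$1"
    mask="$2"
    # Match queue names such as eth2-TxRx-0 (last field), but not eth2:lsc,
    # and print the IRQ number from the first field without its colon.
    awk -v pat="${iface}-" '
        index($NF, pat) == 1 { sub(/:$/, "", $1); print $1 }
    ' "$PROC_ROOT/interrupts" |
    while read -r irq; do
        echo "$mask" > "$PROC_ROOT/irq/$irq/smp_affinity"
    done
}
```

Run as root with irqbalance disabled, bind_iface_irqs eth2 4 would send all eth2 queue interrupts to core 2. Spreading queues across several cores would need a per-queue mask instead of a single one.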

You will also want to bind interrupts for your RAID controllers to other, unused, cores.

To check the effect of IRQ binding, use mpstat to watch the per-core load, e.g.:

    mpstat -P ALL 1

For more information, see the Red Hat documentation and the Mellanox Performance Tuning Guide.