Hardware
Note: This page is under construction, and meant to replace the old hardware selection page, which is out of date.
Please provide feedback on this new content. What is missing?
---------------------------
If you are buying a DTN (Data Transfer Node) today (2025 as of this writing) and want the server to have a 4-5 year lifespan, plan for a system that will eventually have a 100G+ NIC, even if you have no immediate plans for 100G at your site, because the cost of 100G networking continues to fall.
As with any hardware purchase, there is no "best configuration". It all depends on your exact use case and budget. The following are some general guidelines.
Essential Hardware Features
CPU Recommendations
Choose a CPU optimized for high-throughput workloads:
- AMD Options:
- Best: EPYC 9004 series (Genoa). Supports PCIe Gen 5 and AVX-512.
- Cost-effective: EPYC 7003 series (Milan). Supports PCIe Gen 4, but lacks AVX-512.
- Intel Options:
- Best: Xeon 4th Gen (Sapphire Rapids). Supports PCIe Gen 5 and AVX-512.
- Cost-effective: Xeon 3rd Gen (Ice Lake). Supports PCIe Gen 4 and AVX-512.
- Clock Speed: Get the fastest clock speed you can afford, at least 3.0 GHz.
- Number of Cores: At least 16 cores. Note that on these processors each PCIe slot is wired directly to one CPU socket, so multi-socket systems can take a performance hit when a process runs on one CPU while its I/O interrupts are handled on another. Consider a single-socket motherboard rather than a dual-socket one to avoid the need to tune IRQs.
- AVX-512 matters: The AVX-512 instruction set improves network throughput and is strongly recommended for DTNs. Note that the AMD Milan processors do not support AVX-512, but the others all do.
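Both CPU concerns above (AVX-512 support, and which socket a NIC hangs off) can be checked on an existing Linux host. The following is a minimal sketch, assuming an x86 system where `/proc/cpuinfo` has a `flags` line and the NIC exposes `device/numa_node` under `/sys/class/net`. The function names and the `eth0` interface name are placeholders, not part of any tool.

```python
from pathlib import Path


def has_avx512(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo text lists avx512f."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            if "avx512f" in flags:  # avx512f is the AVX-512 foundation subset
                return True
    return False


def nic_numa_node(iface: str, sysfs_root: str = "/sys/class/net"):
    """Return the NUMA node a NIC is attached to, or None if unknown."""
    path = Path(sysfs_root) / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    # The kernel reports -1 when no NUMA information is available.
    return node if node >= 0 else None


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("AVX-512F:", has_avx512(cpuinfo.read_text()))
    # "eth0" is a placeholder; substitute the name of your 100G interface.
    print("NIC NUMA node:", nic_numa_node("eth0"))
```

On a dual-socket host, the reported node indicates which CPU the transfer processes and NIC interrupts would ideally share. On a single-socket host the value is typically 0 or unavailable, and no pinning is needed.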
PCIe Recommendations
Fast PCIe lanes are critical for supporting high-performance NICs and storage:
- PCIe Gen 4: Sufficient for 100G or 200G networking.
- PCIe Gen 5: Necessary for 400G networking and beyond.
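The arithmetic behind these two guidelines can be sketched as follows. It assumes an x16 slot and counts only the 128b/130b line encoding used by PCIe Gen 3 and later. Real throughput is somewhat lower once packet headers and flow control are included, so treat the results as upper bounds. The function names are illustrative.

```python
# Per-lane signalling rate in GT/s; Gen 3 onward uses 128b/130b encoding.
GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_gbps(gen: int, lanes: int = 16) -> float:
    """Approximate one-direction slot bandwidth in Gbit/s, before packet overhead."""
    return GT_PER_LANE[gen] * lanes * ENCODING_EFFICIENCY


def slot_can_feed(gen: int, nic_gbps: float, lanes: int = 16) -> bool:
    """True if the slot's upper-bound bandwidth covers the NIC's line rate."""
    return pcie_gbps(gen, lanes) >= nic_gbps


for gen in (4, 5):
    print(f"Gen {gen} x16: ~{pcie_gbps(gen):.0f} Gbit/s")
# Gen 4 x16: ~252 Gbit/s
# Gen 5 x16: ~504 Gbit/s
```

A Gen 4 x16 slot therefore has headroom for a 200G NIC but cannot feed a 400G NIC, which is why Gen 5 is needed at that speed.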
Memory Recommendations
Fast memory is needed for high-performance DTNs:
- DDR4: Adequate for most use cases.
- DDR5: Recommended if you expect to scale to 400G networking someday, or for workloads requiring extreme scalability.
- Capacity: 128 GB to 256 GB is recommended for high-performance DTNs.
Storage Recommendations
Achieving high disk bandwidth is critical for 100G+ data transfers. Be sure your storage bandwidth matches your network bandwidth:
- Use NVMe SSDs with read speeds of 6,000 MB/s or more. Recommended models include:
- Micron 9400
- Intel P5510
- Samsung PM9A3
- Combine multiple NVMe drives in a RAID configuration to further increase throughput. Be sure to get a motherboard that supports NVMe RAID.
- The ideal storage configuration for a DTN depends on your application and needs. For a short-term cache, RAID0 (no redundancy, high performance) is suitable, since the DTN isn't responsible for archival storage; data can be reloaded from the authoritative source if the filesystem is lost. For reliable storage, consider RAID5, RAID6, or RAID10 (a RAID0 stripe across multiple RAID1 mirrors).
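To make "storage bandwidth matches network bandwidth" concrete, here is a rough sizing sketch. It assumes striped drives scale linearly and that each drive sustains the quoted 6,000 MB/s. Sustained write rates are often lower than quoted read rates, so the optional `efficiency` factor (a hypothetical knob, not a measured value) lets you derate. The capacity helper uses the standard parity and mirror overheads for equal-size drives.

```python
import math


def drives_needed(link_gbps: float, drive_mb_per_s: float, efficiency: float = 1.0) -> int:
    """Minimum number of striped drives whose combined rate covers the link."""
    link_mb_per_s = link_gbps * 1000 / 8  # 1 Gbit/s = 125 MB/s (decimal units)
    return math.ceil(link_mb_per_s / (drive_mb_per_s * efficiency))


def usable_drives(level: str, n: int) -> int:
    """Drives' worth of capacity left after redundancy, for n equal-size drives."""
    overhead = {"raid0": 0, "raid5": 1, "raid6": 2}
    if level == "raid10":
        return n // 2  # every drive is mirrored once
    return n - overhead[level]


for gbps in (100, 200, 400):
    print(f"{gbps}G link: at least {drives_needed(gbps, 6000)} drives at 6,000 MB/s")
```

For a 100G link this gives 12,500 MB/s to sustain, so at least three 6,000 MB/s drives in a stripe; redundancy (RAID5, RAID6, RAID10) adds drives on top of that for the same usable capacity.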
NIC Recommendations
Options for 100G+ NICs include:
- NVIDIA/Mellanox:
- ConnectX-7 (400G/100G): Ideal for PCIe Gen 5 systems; supports hardware GRO (Generic Receive Offload) starting with Linux kernel 6.11.
- ConnectX-6 (200G/100G): Suitable for PCIe Gen 4 systems, but less future-proof.
- Intel: E810 series.
- Broadcom: P2100G and P2200G series.
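Since the ConnectX-7 hardware GRO feature depends on the kernel version, a quick check of the running kernel may be useful. This is a minimal sketch that compares only the leading major.minor of the release string against 6.11. Distribution kernels sometimes backport driver features, so a "too old" result is a hint rather than proof. The function name is illustrative.

```python
import platform
import re


def kernel_at_least(release: str, minimum=(6, 11)) -> bool:
    """Compare the leading 'major.minor' of a kernel release string to a minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


if __name__ == "__main__":
    release = platform.release()  # for example "6.8.0-45-generic" on Linux
    print(release, "->", kernel_at_least(release))
```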
Notes:
- Currently we only have experience with NVIDIA/Mellanox NICs.
- If you have a PCIe Gen 5 host, prioritize the ConnectX-7 over older models like the ConnectX-6.
- Some NICs offer dual-port models, but using both ports simultaneously does not double performance; the second port is often intended as a backup.
Additional Considerations
- Chipsets and Expandability:
- Choose server-grade chipsets with extensive memory support, including ECC memory for error correction.
- Ensure sufficient PCIe slots for future upgrades.
- Thermal Management:
- High-speed servers generate significant heat. Opt for systems with robust cooling solutions.
- Remote Management:
- Features like IPMI (Intelligent Platform Management Interface) simplify maintenance and monitoring.
More Information
- external links go here