Science DMZ: Data Transfer Nodes
The computer systems used for wide area data transfers perform far better if they are purpose-built and dedicated to the function of wide area data transfer. These systems, which we call Data Transfer Nodes (DTNs), are typically PC-based Linux servers built with high-quality components and configured specifically for wide area data transfer. The DTN also has access to local storage, whether that is a local high-speed disk subsystem, a connection to a local storage infrastructure such as a SAN, a direct mount of a high-speed parallel filesystem such as Lustre or GPFS, or a combination of these. The DTN runs the software tools designed for high-speed data transfer to remote systems: typical software packages include Globus, discipline-specific tools such as XRootd, and versions of default toolsets such as SSH/SCP with high-performance patches applied.
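As an illustration of the kind of transfer tooling a DTN runs, the sketch below submits a directory copy between two endpoints using the Globus Python SDK. The endpoint UUIDs, paths, label, and access token are placeholders, and token acquisition (Globus authentication) is assumed to have been handled elsewhere; treat this as a minimal sketch following the SDK's documented pattern, not a complete recipe.

```python
# Minimal sketch: submitting a DTN-to-DTN directory transfer with the Globus
# Python SDK (globus_sdk). Endpoint UUIDs, paths, and the access token are
# placeholders; authentication and token acquisition are out of scope here.
import globus_sdk

SRC_ENDPOINT = "SOURCE-ENDPOINT-UUID"  # placeholder: source DTN endpoint ID
DST_ENDPOINT = "DEST-ENDPOINT-UUID"    # placeholder: destination DTN endpoint ID

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer("TRANSFER-ACCESS-TOKEN")
)

# Describe the transfer task and add one directory to it.
task = globus_sdk.TransferData(tc, SRC_ENDPOINT, DST_ENDPOINT,
                               label="example DTN transfer")
task.add_item("/data/run42/", "/ingest/run42/", recursive=True)

result = tc.submit_transfer(task)
print("Submitted Globus transfer task:", result["task_id"])
```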
DTNs typically have high-speed network interfaces, but the key is to match the DTN to the capabilities of the site and wide area network infrastructure. So, for example, if the network connection from the site to the WAN is one gigabit Ethernet, a 10 gigabit Ethernet interface on the DTN may be counterproductive. Similar arguments can be made for other impedance mismatch situations (100Gbps on a 10Gbps WAN link, etc.).
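To make the impedance-mismatch point concrete, here is a trivial Python sketch (illustrative speeds only) that compares a DTN's NIC capacity against the site's WAN capacity and flags the mismatch cases described above.

```python
# Sketch: sanity-check a DTN's NIC speed against the site's WAN capacity.
# The speeds passed in below are illustrative; substitute your own values.
def check_nic_vs_wan(nic_gbps: float, wan_gbps: float) -> str:
    """Flag the impedance-mismatch cases described above."""
    if nic_gbps > wan_gbps:
        return (f"{nic_gbps}G NIC behind a {wan_gbps}G WAN link: the DTN can "
                "overdrive the uplink and cause local congestion and loss.")
    if nic_gbps < wan_gbps:
        return (f"{nic_gbps}G NIC on a {wan_gbps}G WAN link: workable, but the "
                "DTN can never use the full WAN capacity.")
    return f"{nic_gbps}G NIC on a {wan_gbps}G WAN link: well matched."

print(check_nic_vs_wan(10, 1))    # the 10GE-on-1GE case above
print(check_nic_vs_wan(100, 10))  # the 100Gbps-on-10Gbps case
```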
To mitigate security risks, no general-purpose computing tasks are allowed on the DTN: no web browsers, no media players, no business productivity tools such as document and spreadsheet editors, no email clients, and so on, all of which require the security controls necessary to ensure the safety of general-purpose computing in today's environment.
We include information on these pages to help with hardware selection and tuning of a DTN. We also have information on DTN software and performance testing.
More information on DTN design is available in the DTN section of the tutorial "Achieving a Science DMZ," given by engineers from ESnet and the University of Utah at the ESnet/Internet2 Joint Techs conference in January 2012 in Baton Rouge, LA.
Deployment Strategy - Quantity and Capability
Figuring out 'how large' a DTN should be is a function of several inputs; a back-of-the-envelope sizing sketch follows this list:
- How many users are expected for the DTN resources?
- Will the DTN have all storage 'local', or be integrated with external storage resources?
- What is the available wide area connectivity for the site?
- What is the available local area connectivity?
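To turn these inputs into a number, a useful starting point is the sustained throughput needed to move an expected dataset within a given time window. The Python sketch below uses made-up data volumes and user counts purely for illustration; the resulting aggregate rate should then be compared against the WAN and LAN capacities from the questions above.

```python
# Back-of-the-envelope DTN sizing: what sustained rate is needed to move a
# dataset of a given size within a given window? Numbers are illustrative.
def required_gbps(dataset_tb: float, window_hours: float) -> float:
    """Sustained throughput (Gbit/s) needed to move dataset_tb in window_hours."""
    bits = dataset_tb * 1e12 * 8     # terabytes -> bits
    seconds = window_hours * 3600
    return bits / seconds / 1e9      # bits/s -> Gbit/s

# Example: 100 TB per week for each of 3 concurrent user groups.
per_group = required_gbps(dataset_tb=100, window_hours=7 * 24)
print(f"per group: {per_group:.2f} Gbit/s, aggregate: {3 * per_group:.2f} Gbit/s")
# Compare the aggregate against the site WAN capacity, the LAN path, and the
# DTN's NIC and storage throughput.
```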
When building a DTN, it is tempting to build it 'as big as possible', e.g. maxing out all available technology. This can mean buying a 100Gbps network connection, or as much RAM and storage as will fit inside the machine, all at once. There are downsides to this approach:
- Configuring a 100Gbps DTN on a network that can only support 10Gbps (or less) outbound can cause local congestion, which is counterproductive to operations
- Spending a large budget on a single resource may cause scalability problems, particularly as the number of users for the DTN increases
There are two possible deployment strategies for DTNs to consider; a rough sketch of how they scale with demand follows the list:
- Create performant, yet 'smaller', pools of DTN resources. The deployment model is to add more resources as required (to scale with load, or as new users with different usage requirements emerge) without overtaxing network capabilities:
  - 10Gbps-capable networking
  - CPU scaling to support multiple flows between 1-10Gbps (e.g. 6-8 cores, 3.2GHz or greater)
  - Sufficient local storage, plus options to connect external storage
- Create a single, high-performing, 'future-proof' DTN resource. This deployment model assumes a single resource using all available bandwidth, with scalability coming from within the machine (e.g. Docker images on board, or shared application space):
  - Largest possible NIC
  - Fast CPU (3.6GHz or greater to support higher speeds)
  - Multiple CPU cores (e.g. 12+ to scale with concurrent usage patterns)
  - Sufficient local storage, plus options to connect external storage
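As a rough illustration of how the two strategies behave as demand grows, the sketch below compares a pool of 10Gbps-class DTNs (added as needed) against a single large DTN. The per-node, single-DTN, and WAN figures are assumptions chosen for the example, not hardware recommendations.

```python
# Illustrative comparison of the two deployment strategies as user demand grows.
# Per-node and WAN figures are assumptions for the sketch, not recommendations.
POOL_NODE_GBPS = 10     # strategy 1: add 10Gbps-class DTNs as demand grows
SINGLE_DTN_GBPS = 100   # strategy 2: one large DTN with the largest possible NIC
WAN_GBPS = 100          # site WAN capacity shared by all DTNs

for demand_gbps in (5, 20, 60, 120):
    pool_nodes = -(-demand_gbps // POOL_NODE_GBPS)  # ceiling division
    pool_capacity = min(pool_nodes * POOL_NODE_GBPS, WAN_GBPS)
    single_capacity = min(SINGLE_DTN_GBPS, WAN_GBPS)
    print(f"demand {demand_gbps:>3}G: pool of {pool_nodes} nodes -> "
          f"{pool_capacity}G, single DTN -> {single_capacity}G")
```

In the pooled model, capacity (and spending) tracks demand in increments; in the single-DTN model, the full capacity is available up front but is capped by the WAN link and concentrates all load on one machine.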