Science DMZ: Data Transfer Nodes
Data Transfer Nodes
Computer systems used for wide area data transfers perform far better when they are purpose-built and dedicated to that function. These systems, which we call Data Transfer Nodes (DTNs), are typically PC-based Linux servers built with high-quality components and configured specifically for wide area data transfer. The DTN also has access to local storage: a local high-speed disk subsystem, a connection to a local storage infrastructure such as a SAN, a direct mount of a high-speed parallel filesystem such as Lustre or GPFS, or a combination of these. The DTN runs software tools designed for high-speed data transfer to remote systems. Typical software packages include GridFTP and its service-oriented descendant Globus Online, discipline-specific tools such as XRootd, and versions of default toolsets such as SSH/SCP with high-performance patches applied.
DTNs typically have high-speed network interfaces (currently 10 Gbps, though experiments with 40 Gbps DTNs are already underway), but the key is to match the DTN to the capabilities of the site and wide area network infrastructure. For example, if the network connection from the site to the WAN is Gigabit Ethernet, a 10 Gigabit Ethernet interface on the DTN may be counterproductive: the DTN can send bursts faster than the 1 Gbps link can carry, which can overflow buffers along the path and cause packet loss.
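The matching rule above can be illustrated with a little arithmetic. The following is a minimal sketch, not a sizing tool: the function names and the `efficiency` parameter are assumptions introduced for illustration, and it treats the slower of the DTN interface and the site uplink as the upper bound on transfer rate, ignoring storage, protocol, and host limits.

```python
def bottleneck_gbps(dtn_nic_gbps: float, site_uplink_gbps: float) -> float:
    """Upper bound on transfer rate: the slower of the two links."""
    return min(dtn_nic_gbps, site_uplink_gbps)


def is_oversized(dtn_nic_gbps: float, site_uplink_gbps: float) -> bool:
    """True when the DTN interface is faster than the site's WAN connection."""
    return dtn_nic_gbps > site_uplink_gbps


def transfer_hours(dataset_tb: float, rate_gbps: float, efficiency: float = 1.0) -> float:
    """Idealized time to move dataset_tb terabytes (10**12 bytes) at rate_gbps.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead; the default of 1.0 assumes a perfectly used link.
    """
    bits = dataset_tb * 1e12 * 8
    seconds = bits / (rate_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    nic, uplink = 10.0, 1.0  # 10 GE interface on the DTN, 1 GE site uplink
    rate = bottleneck_gbps(nic, uplink)
    print(f"bottleneck: {rate} Gbps, oversized NIC: {is_oversized(nic, uplink)}")
    print(f"1 TB at the bottleneck rate: {transfer_hours(1, rate):.2f} hours")
```

In this sketch the 10 Gbps interface buys no extra throughput over a 1 Gbps uplink; the best case is still bounded by the slower link.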
To mitigate security risks, no general-purpose computing tasks are allowed on the DTN: no web browsers, no media players, no business productivity tools such as document and spreadsheet editors, no email clients, and so on. Such applications require all the security controls necessary to ensure the safety of general-purpose computing in today’s environment, and keeping them off the DTN avoids the need for those controls.
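One way to spot-check this policy on a host is to look for general-purpose applications on the executable search path. The sketch below is only an illustration: the `DISALLOWED` list and the function name `find_disallowed` are hypothetical, the executable names are examples rather than a complete inventory, and a real audit would more likely query the system's package manager.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional

# Hypothetical denylist: example executable names for browsers, mail
# clients, office suites, and media players. Adjust for the actual site.
DISALLOWED = ("firefox", "chromium", "thunderbird", "libreoffice", "vlc")


def find_disallowed(
    names: Iterable[str] = DISALLOWED,
    lookup: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, str]:
    """Return {name: path} for each listed executable found by lookup.

    lookup defaults to shutil.which, which searches PATH; it is a
    parameter so the check can be exercised without touching the host.
    """
    found = {}
    for name in names:
        path = lookup(name)
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    hits = find_disallowed()
    if hits:
        for name, path in sorted(hits.items()):
            print(f"general-purpose application present: {name} ({path})")
    else:
        print("no listed general-purpose applications found on PATH")
```

An empty result only means none of the listed names were found on the search path; it does not prove the host is free of general-purpose software.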
More information on DTN design is available in the DTN section of the tutorial "Achieving a Science DMZ," given by engineers from ESnet and the University of Utah at the ESnet/Internet2 Joint Techs conference in January 2012 in Baton Rouge, LA.