The Science DMZ is an element of network architecture. The intent of the Science DMZ is to simplify the deployment and support of high-performance and data-intensive science applications that rely on high-speed networking for success. The Science DMZ is a dedicated portion of a site or campus network, located as close to the network perimeter as possible, that serves only high-performance science applications. The equipment, configuration, and security policy of the Science DMZ are tailored specifically for science applications – not for general-purpose or “enterprise” computing.
The diagram below shows the essential components of an example Science DMZ (a Data Transfer Node and a test and measurement host):
In the example above, the Data Transfer Node (DTN) is connected directly to a high-performance switch or router, which is connected directly to the border router. Security policy for the DTN is enforced using access lists on the Science DMZ switch or router, not on a separate firewall. To mitigate other risks, no general-purpose computing tasks are allowed on the DTN – no web browsers, no media players, no business productivity tools such as document and spreadsheet editors, no email clients, etc. The DTN’s job is to move science data to and from remote sites and facilities efficiently and effectively, and the infrastructure accommodates it. In return, the DTN must not run general-purpose desktop software, which would demand all the security controls necessary to ensure the safety of general-purpose computing in today’s environment.
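The access-list approach described above can be illustrated with a minimal sketch. It models a default-deny list that permits only named collaborator networks to reach the DTN on a single transfer port. Every name, address, and port here is an assumption for illustration (the addresses come from the RFC 5737 documentation ranges, and the port is a placeholder for whatever the site's transfer tool uses); real enforcement would be written in the ACL syntax of the Science DMZ switch or router, not in Python.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src: str    # CIDR block of a permitted remote collaborator (hypothetical)
    dst: str    # CIDR block covering the DTN (hypothetical)
    dport: int  # destination TCP port for the transfer service (placeholder)


# Hypothetical values, chosen only for illustration.
DTN = "192.0.2.10/32"
TRANSFER_PORT = 2811
ACL = [
    Rule("198.51.100.0/24", DTN, TRANSFER_PORT),
    Rule("203.0.113.0/24", DTN, TRANSFER_PORT),
]


def permitted(src_ip: str, dst_ip: str, dport: int, acl=ACL) -> bool:
    """Return True if some rule permits the flow; otherwise deny by default."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in acl:
        if (src in ipaddress.ip_network(rule.src)
                and dst in ipaddress.ip_network(rule.dst)
                and dport == rule.dport):
            return True
    return False  # implicit deny: anything not listed is dropped


if __name__ == "__main__":
    print(permitted("198.51.100.7", "192.0.2.10", TRANSFER_PORT))  # listed collaborator
    print(permitted("198.51.100.7", "192.0.2.10", 22))             # unlisted port
```

Because the DTN runs only a small set of data transfer services, a short list like this can describe its entire policy, which is part of why a separate firewall is not required in front of it.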
The Science DMZ is connected directly to the border router in order to minimize the number of devices that must be configured to support high-performance data transfer and other scientific applications. Achieving high performance is very difficult to do with system and network device configuration defaults, and the location of the Science DMZ at the site perimeter simplifies the system and network tuning processes. Also, if there is a performance problem, it is much easier to troubleshoot a handful of devices rather than a large-scale LAN infrastructure.
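One reason defaults tend to fall short is the bandwidth-delay product: to keep a path full, a sender needs roughly bandwidth × round-trip time of data in flight, and host buffers must be able to hold that much. The sketch below works through the arithmetic for a few hypothetical paths; the link speed and round-trip times are assumed values, not measurements from any particular site.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bps * rtt_seconds / 8


if __name__ == "__main__":
    link_bps = 10e9  # assumed 10 Gbps path
    # Assumed round-trip times, in milliseconds, for three hypothetical paths.
    for label, rtt_ms in [("campus LAN", 1), ("regional", 10), ("cross-country", 50)]:
        needed = bdp_bytes(link_bps, rtt_ms / 1000)
        print(f"{label:>14}: {needed / 1e6:.2f} MB in flight to fill the path")
```

At 10 Gbps and a 50 ms round trip, the result is 62.5 MB, which is why buffer sizes on every host and device along the path matter, and why having only a handful of devices to tune is an advantage.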
In addition, the Science DMZ has a test and measurement host (labeled perfSONAR) that allows easy fault diagnosis on the Science DMZ. The perfSONAR host can run background checks for latency changes and packet loss using OWAMP, as well as periodic throughput tests to remote locations using pscheduler. If a problem arises that requires a network engineer to troubleshoot the routing and switching infrastructure, the tools necessary to work the problem are already deployed; they need not be installed before troubleshooting can begin.
Note that users at the site must still access the resources on the Science DMZ through the site perimeter firewall. Because the latency between the Science DMZ and on-site users is so low, the issues caused by the site perimeter firewall are typically much less of a problem in practical terms. TCP recovers quickly at low latencies, and short-distance TCP dynamics differ enough from long-distance TCP dynamics that the packet loss a firewall would cause on wide area data transfers may not occur at all when local users access Science DMZ resources.
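The effect of latency on loss tolerance can be made concrete with the widely used Mathis approximation, which bounds steady-state TCP throughput by roughly MSS / (RTT × √p), where p is the packet loss rate. The sketch below is a simplified model, not a prediction for any real network: it takes the constant factor in the formula as 1, and the segment size, loss rate, and round-trip times are assumed values.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Approximate ceiling on steady-state TCP throughput (Mathis model, constant taken as 1)."""
    return (mss_bytes * 8 / rtt_seconds) / math.sqrt(loss_rate)


if __name__ == "__main__":
    mss = 1460     # assumed segment size in bytes (standard 1500-byte MTU)
    loss = 0.0001  # assumed loss rate of 0.01%
    # Same loss rate, two assumed round-trip times: on-site versus wide area.
    for label, rtt_ms in [("on-site user", 1), ("remote site", 50)]:
        bps = mathis_throughput_bps(mss, rtt_ms / 1000, loss)
        print(f"{label:>12} (RTT {rtt_ms:>2} ms): about {bps / 1e6:.1f} Mbps")
```

In this model the ceiling scales inversely with round-trip time, so the same loss rate that is tolerable for an on-site user at 1 ms cuts the achievable rate by a factor of 50 on a 50 ms wide area path.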
The Science DMZ architecture has its roots in operational experience in several aspects of networking – design, operations, and security. In fact, the term “Science DMZ” comes from the “DMZ networks” that are a common element in network security architectures. The traditional DMZ is a special-purpose part of the network, at or near the network perimeter, which is specifically designed to host the site services facing the outside world (e.g. web, email, and authoritative DNS servers). The security policies, network device configuration, and so forth are tailored for the DMZ, and are not conflated with the security policies and configurations of the internal LAN infrastructure. The Science DMZ adapts this notion to the task of supporting high-performance science applications, including bulk data movement and data-intensive experimental paradigms. The Science DMZ is a dedicated portion of a site or campus network, located as close to the network perimeter as possible, which is designed and configured to provide optimal support for high-performance science applications.
The design of high-performance networks for science starts with the capabilities required to effectively deploy and support high-performance science applications. These include high bandwidth, advanced features, and capable gear that does not compromise on performance to get those features. Operational requirements drive the need for simplicity, accountability, accuracy, and the easy integration of test and measurement services. Security requirements come from the need to ensure correctness, prevent misuse, and avoid embarrassment or other negative publicity that can compromise the reputation of the site or the science. The Science DMZ architecture meets these needs by instantiating a simple, scalable network enclave that explicitly accommodates high-performance science applications while explicitly excluding general-purpose computing and the additional complexities that go with it.
For More Information
See our Achieving the Science DMZ tutorial for more details on how to design and build a Science DMZ.