Globus Online and the Science DMZ as Scalable Research Data Management Infrastructure for HPC Facilities

This tutorial was presented at the SC13 conference in Denver, CO, by Globus Online's Steve Tuecke, Raj Kettimuthu, and Vas Vasiliadis, together with ESnet's Eli Dart.

Summary of the Tutorial:

The rapid growth of data in scientific research is placing massive demands on campus computing centers and high-performance computing (HPC) facilities. These facilities must provide robust data services built on high-performance infrastructure, and must continue to scale as needs increase. Traditional research data management (RDM) solutions are typically difficult to use and error-prone, and the underlying networking and security infrastructure is often complex and inflexible. The result is user frustration and sub-optimal use of resources.

An increasingly common solution in HPC facilities is Globus Online deployed in a network environment built on the Science DMZ model. Globus Online is software-as-a-service for moving, syncing, and sharing large data sets. The Science DMZ model is a set of design patterns for network equipment, configuration, and security policy for high-performance scientific infrastructure. Combining user-friendly, high-performance data transfer tools with optimally configured underlying infrastructure yields enhanced RDM services that increase user productivity and lower support overhead.

Guided by two case studies from national supercomputing centers (NERSC and NCSA), attendees will explore the challenges such facilities face in delivering scalable RDM solutions. Attendees will be introduced to Globus Online and the Science DMZ, and will learn how to deploy and manage these systems.

