Guide to Bulk Data Transfer over a WAN

Firewall Issues

Many sites run firewalls that prevent file transfer tools from working. In particular, protocols such as FTP, which use dynamically assigned ports, are often blocked by the firewall.

Firewalls are often configured to block only incoming connections, not outgoing ones. In this case you may be able to solve the firewall problem by initiating the transfer from inside the site with the firewall. FTP 'passive' mode, in which the client rather than the server opens the data connection, can also be used in this case.
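As an illustration of a client-initiated transfer, the following minimal sketch uses Python's standard ftplib module to fetch a file in passive mode, so that both the control and the data connection are opened from inside the firewall. The function names, host name, and paths are hypothetical placeholders, not part of any tool described in this guide. Note that ftplib already defaults to passive mode; the explicit set_pasv call only makes the intent visible.

```python
from ftplib import FTP


def make_passive_client(timeout: float = 30.0) -> FTP:
    """Return an unconnected FTP client configured for passive mode."""
    ftp = FTP(timeout=timeout)  # no host given, so nothing is connected yet
    ftp.set_pasv(True)          # client opens the data connection (PASV)
    return ftp


def fetch(host: str, remote_path: str, local_path: str,
          user: str = "anonymous", passwd: str = "") -> None:
    """Download remote_path from host to local_path using passive FTP."""
    ftp = make_passive_client()
    ftp.connect(host)
    ftp.login(user, passwd)
    try:
        with open(local_path, "wb") as out:
            ftp.retrbinary(f"RETR {remote_path}", out.write)
    finally:
        ftp.quit()


# Hypothetical usage, run from the host behind the firewall:
# fetch("ftp.example.org", "/pub/data.tar", "data.tar")
```

Passive mode only helps when the firewall in front of the server permits incoming connections to the data ports that the server hands out.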

If both sites have a firewall that blocks incoming connections, things are trickier. You will probably have to talk to your firewall administrator about opening up a set of ports for your data transfer connections. You might also consider placing your data server outside the firewall. This has the added benefit of avoiding potential performance issues caused by the firewall (see below).

Below is specific information on how to configure port usage for GridFTP and hsi.


How to specify port ranges for GridFTP

You can specify the ports that the GridFTP server uses by editing the file:

      /etc/grid-security/sshftp

Set GLOBUS_TCP_PORT_RANGE to the minimum and maximum port you want to use, separated by a comma. For example:

      GLOBUS_TCP_PORT_RANGE=50000,50050
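Once a range has been chosen, it can be useful to check how ports in that range respond from the remote site. The sketch below is a hypothetical helper, not part of GridFTP: it parses a min,max string in the format shown above and attempts a TCP connection to a port. A refused connection usually means the packet reached the host but nothing was listening, while a timeout often indicates that a firewall silently dropped it. This is a heuristic only, since some firewalls actively reject connections and GridFTP listens on data ports only while a transfer is in progress.

```python
import socket


def parse_port_range(value: str) -> range:
    """Parse a 'min,max' string such as '50000,50050' into a range of ports."""
    lo_str, hi_str = value.split(",")
    lo, hi = int(lo_str), int(hi_str)
    if not (0 < lo <= hi <= 65535):
        raise ValueError(f"invalid port range: {value!r}")
    return range(lo, hi + 1)  # inclusive of the upper port


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP port as 'open', 'closed', or 'no-response'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # something accepted the connection
    except ConnectionRefusedError:
        return "closed"            # reached the host, nothing listening
    except OSError:
        return "no-response"       # timeout or unreachable, possibly filtered


if __name__ == "__main__":
    # "127.0.0.1" is a placeholder; substitute the data server's address.
    for port in parse_port_range("50000,50050"):
        print(port, probe("127.0.0.1", port, timeout=0.5))
```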

To specify which ports are used by the client, you can modify the file:

      $GLOBUS_LOCATION/libexec/gridftp-ssh

Look for the line:

      /usr/bin/ssh $port_str $remote_host $remote_program 

and use something like this instead, where x,y are the minimum and maximum port (env sets the variable for the remote program):

      /usr/bin/ssh $port_str $remote_host env GLOBUS_TCP_PORT_RANGE=x,y $remote_program 

Firewalls and hsi/HPSS

Try using the "firewall -on" option to hsi/htar. More information is available from NERSC.


Firewall Performance Issues

Firewalls are often slower than the link speed of their network interfaces (e.g. many firewalls with Gigabit Ethernet interfaces have a maximum throughput of 800 Mbps). This causes a problem when a host sends data through the firewall faster than the firewall can process it, because TCP bursts typically occur at or near the maximum data rate of the sending host's interface. The firewall must buffer each burst until it can process all the packets in it, so input buffer size is critical. Unfortunately, firewalls often have small input buffers, since they are typically designed to scale to large numbers of low-speed flows rather than a few high-speed data flows. If the firewall's input buffers are too small to hold the bursts from the data transfer host, packets are lost, which often causes severe performance problems.
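To give a rough sense of the buffer sizes involved, the following sketch uses a simple fluid model: a burst arrives at the sender's line rate while the firewall drains it at its own throughput, so the queue grows at the difference between the two rates. The 1 MB burst size is an assumption chosen for illustration, and the model ignores packetization and competing flows, so treat the result as an estimate rather than a measurement.

```python
def peak_queue_bytes(burst_bytes: int, in_bps: int, out_bps: int) -> float:
    """Peak queue depth when a burst arrives at in_bps and drains at out_bps."""
    if in_bps <= 0 or out_bps <= 0:
        raise ValueError("rates must be positive")
    if out_bps >= in_bps:
        return 0.0  # the firewall keeps up, so no queue builds
    # While the burst arrives, only the fraction out_bps/in_bps is forwarded;
    # the remainder has to wait in the input buffer.
    return burst_bytes * (in_bps - out_bps) / in_bps


def burst_is_dropped(burst_bytes: int, in_bps: int, out_bps: int,
                     buffer_bytes: int) -> bool:
    """True if the burst overflows the buffer and some packets are lost."""
    return peak_queue_bytes(burst_bytes, in_bps, out_bps) > buffer_bytes


if __name__ == "__main__":
    # Assumed 1 MB burst from a 1 Gbps host through an 800 Mbps firewall.
    need = peak_queue_bytes(1_000_000, 1_000_000_000, 800_000_000)
    print(f"Peak queue: {need:.0f} bytes")  # prints "Peak queue: 200000 bytes"
```

Under these assumptions, a firewall with less than about 200 KB of input buffer available to this flow would drop part of the burst.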


© 2007, ESnet