
GridFTP Quick Start Guide

Overview 

We recommend GridFTP for transferring large files across high-speed WANs.

If one or both of your endpoints already have a regular GridFTP server set up, you should look into using Globus Online.

Here is a 'quick start guide' to installing GridFTP with ssh support only (i.e., no X.509 support). Do these steps on both the client and server hosts.

First install the Globus repo:

rpm -hUv http://www.globus.org/ftppub/gt6/installers/repo/globus-toolkit-repo-latest.noarch.rpm

Then do a 'yum install':

yum install yum-plugin-priorities  # globus installer needs this
yum install globus-data-management-client globus-data-management-server

Next, enable sshftp using this command:

/etc/init.d/globus-gridftp-sshftp reconfigure

The GridFTP server is now automatically launched via sshd. Make sure you can ssh to both the source and destination hosts. Here are some sample commands (note: you should not be root when trying these):

# directory listing
globus-url-copy -list sshftp://gridhost.foo.gov/tmp/
# copy file /etc/group
globus-url-copy sshftp://gridhost.foo.gov/etc/group file:/tmp/group
# parallel transfer of file /tmp/mybigdatafile
globus-url-copy -p 4 sshftp://gridhost.foo.gov/tmp/mybigdatafile \
file:/tmp/myfile
# test network throughput
globus-url-copy -vb -p 4 sshftp://gridhost.foo.gov/dev/zero file:///dev/null
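
Because sshftp transfers ride on an ordinary ssh login, it can help to confirm ssh access before trying the commands above. The following is a minimal sketch, not part of the Globus tooling: the function name check_ssh_hosts and the SSH override variable are hypothetical, and BatchMode=yes means only non-interactive (key-based) logins count as a success.

```shell
# Check that a non-interactive ssh login works for each host given.
# SSH is a hypothetical override (defaults to "ssh"), handy for dry runs.
check_ssh_hosts() {
  rc=0
  for host in "$@"; do
    # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
    if "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
      echo "ok: $host"
    else
      echo "FAILED: $host"
      rc=1
    fi
  done
  return $rc
}

# Example (as a regular user, not root):
# check_ssh_hosts gridhost.foo.gov
```

If a host is reported as FAILED, fix the ssh login first; globus-url-copy with an sshftp:// URL will not work until plain ssh does.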

Once you have GridFTP installed, you can test it using our GridFTP test Data Transfer Nodes (you must be connected to a research and education network).

NOTE: there appears to be a bug in the GT6.0 version for ssh access. You may need to edit /usr/share/globus/gridftp-ssh and replace @SSH_BIN@ with the full path to ssh (e.g., "/usr/bin/ssh").
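
The edit described in this note can also be scripted. This is a sketch under assumptions: the helper name fix_ssh_bin is hypothetical, and /usr/bin/ssh is only a default, so check where ssh lives on your system (for example with "command -v ssh") before running it.

```shell
# Replace the unexpanded @SSH_BIN@ placeholder with a real ssh path.
# Usage: fix_ssh_bin FILE [SSH_PATH]   (SSH_PATH defaults to /usr/bin/ssh)
fix_ssh_bin() {
  file=$1
  ssh_path=${2:-/usr/bin/ssh}
  # Nothing to do if the placeholder is absent.
  grep -q '@SSH_BIN@' "$file" || return 0
  # Keep a backup, then rewrite the file in place (its mode is preserved).
  cp -p "$file" "$file.bak" || return 1
  sed "s|@SSH_BIN@|$ssh_path|g" "$file.bak" > "$file"
}

# Example (as root, on a host that shows the bug):
# fix_ssh_bin /usr/share/globus/gridftp-ssh
```

The function leaves the file untouched when the placeholder is not present, so it is safe to run on hosts that do not have the problem.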


For More Information

Recommended: Configuring and running GridFTP
Useful: Explanation of globus-url-copy command line options



Server Logging Options

We recommend enabling GridFTP's additional logging capabilities, which make performance analysis and troubleshooting much easier. These more detailed logs can be used to identify your fastest and slowest endpoints, and see settings such as number of parallel streams and TCP buffer sizes your remote users are requesting.
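
As a rough illustration of that kind of analysis, the sketch below summarizes a transfer log per remote address. It assumes each log entry is a single line of KEY=value fields that includes NBYTES, STREAMS and DEST; verify those field names against your own log before relying on it. The function name summarize_transfers is hypothetical.

```shell
# Summarize a GridFTP transfer log per remote address (the DEST field):
# number of transfers, total bytes, and the largest stream count seen.
# ASSUMPTION: one entry per line, made of space-separated KEY=value fields.
summarize_transfers() {
  awk '
  {
    dest = ""; nbytes = 0; streams = 0
    for (i = 1; i <= NF; i++) {
      eq = index($i, "=")
      if (eq == 0) continue
      key = substr($i, 1, eq - 1)
      val = substr($i, eq + 1)
      if (key == "DEST") dest = val
      else if (key == "NBYTES") nbytes = val + 0
      else if (key == "STREAMS") streams = val + 0
    }
    if (dest == "") next            # skip lines without a DEST field
    count[dest]++
    bytes[dest] += nbytes
    if (streams > maxs[dest]) maxs[dest] = streams
  }
  END {
    for (d in count)
      printf "%s transfers=%d bytes=%.0f max_streams=%d\n", d, count[d], bytes[d], maxs[d]
  }' "$@" | sort
}

# Example: summarize_transfers /var/log/gridFTP/gridftp.log
```

The same loop could be extended to other fields, such as the requested TCP buffer size, once you have confirmed what your log entries contain.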

To enable this additional logging, add the following to /etc/gridftp.conf:

log_level ERROR,WARN,INFO
log_single /var/log/gridFTP/gridftp-auth.log
log_transfer /var/log/gridFTP/gridftp.log
log_module stdio_ng

Note that the log directory must be writable by all GridFTP users.
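
One way to meet that requirement is sketched below. The helper name make_gridftp_log_dir is hypothetical, and mode 1777 (world-writable with the sticky bit, as on /tmp) is an assumption; a group-writable directory may suit sites with a dedicated GridFTP user group better.

```shell
# Create a GridFTP log directory that every local user can write to.
# ASSUMPTION: mode 1777 (like /tmp) is acceptable at your site.
# Usage: make_gridftp_log_dir [DIR]   (DIR defaults to /var/log/gridFTP)
make_gridftp_log_dir() {
  dir=${1:-/var/log/gridFTP}
  mkdir -p "$dir" && chmod 1777 "$dir"
}

# Example (as root):
# make_gridftp_log_dir
```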

For More Information

Recommended: GridFTP server documentation




Firewall Issues

Many sites run firewalls that prevent GridFTP from working. Protocols such as FTP, which use dynamically assigned ports, often get blocked by the firewall. Firewalls are often configured to block only incoming connections, not outgoing connections. In this case you may be able to solve the firewall problem by initiating the transfer from inside the site with the firewall.

If both sites have a firewall that blocks incoming connections, things are trickier. You will have to talk to your firewall administrator about opening up a set of ports for your data transfer connections. You might also consider placing your data server outside the firewall; an example of this is the Science DMZ architecture. This has the added benefit of avoiding potential performance issues caused by the firewall.

For More Information

Recommended: Globus GridFTP client firewall information

How to specify port ranges for a GridFTP server

You can specify the ports that the GridFTP server uses by editing these files:

   /etc/grid-security/sshftp
   /etc/gridftp.conf

Modify GLOBUS_TCP_PORT_RANGE to the ports you want. For example:

   GLOBUS_TCP_PORT_RANGE=50000,50050
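
The same range then has to be allowed through the firewall. As a sketch, assuming the host firewall is managed with iptables, the hypothetical helper below validates a range written in the GLOBUS_TCP_PORT_RANGE form and prints a matching rule. It only prints the rule; review it with your firewall administrator before applying anything.

```shell
# Print (do not apply) an iptables rule that admits a port range given in
# GLOBUS_TCP_PORT_RANGE form, e.g. "50000,50050".
# ASSUMPTION: the firewall is iptables; it writes ranges as "low:high".
port_range_rule() {
  range=$1
  case "$range" in
    *[!0-9,]* | *,*,* | ,* | *,)
      echo "bad port range: $range" >&2; return 1 ;;
    *,*) ;;
    *)
      echo "bad port range: $range" >&2; return 1 ;;
  esac
  low=${range%,*}
  high=${range#*,}
  if [ "$low" -gt "$high" ] || [ "$high" -gt 65535 ]; then
    echo "bad port range: $range" >&2
    return 1
  fi
  echo "iptables -A INPUT -p tcp --dport $low:$high -j ACCEPT"
}

# Example:
# port_range_rule 50000,50050
```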

To specify which ports are used by the client, you can modify the file:

   /usr/share/globus/gridftp-ssh

Look for the line:

   /usr/bin/ssh $port_str $remote_host $remote_program 

Use something like this instead:

   /usr/bin/ssh $port_str $remote_host GLOBUS_TCP_PORT_RANGE=x,y $remote_program
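
If you prefer to script this edit, the following sketch shows one way to do it. The helper name set_client_port_range is hypothetical, and it assumes the file contains the text "$remote_host $remote_program" exactly as shown above. A later package update may overwrite the file, so the change might need to be reapplied.

```shell
# Insert GLOBUS_TCP_PORT_RANGE=<range> between $remote_host and
# $remote_program in the ssh line of the gridftp-ssh script.
# Usage: set_client_port_range FILE RANGE   (RANGE like 50000,50050)
set_client_port_range() {
  file=$1
  range=$2
  # Keep a backup, then rewrite the file in place (its mode is preserved).
  cp -p "$file" "$file.bak" || return 1
  # \$ matches a literal dollar sign; the variable names are kept as text.
  sed 's|\$remote_host \$remote_program|$remote_host GLOBUS_TCP_PORT_RANGE='"$range"' $remote_program|' \
    "$file.bak" > "$file"
}

# Example (as root):
# set_client_port_range /usr/share/globus/gridftp-ssh 50000,50050
```

Running it a second time changes nothing, because the original pattern no longer appears once the variable has been inserted.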