scp and sftp

General Recommendations

In a Unix environment, scp, sftp, and rsync are commonly used to copy data between hosts. While these tools work fine in a local environment, they perform poorly on a WAN. The OpenSSH versions of scp and sftp have a built-in 1 MB buffer (only 64 KB in OpenSSH versions older than 4.7) that severely limits performance on a WAN, because a single stream can never have more than one buffer's worth of data in flight per round trip. Even though rsync is not part of the OpenSSH distribution, it typically uses ssh as its transport and is therefore subject to the limitations of the underlying ssh implementation. DO NOT USE THESE TOOLS if you need to transfer large data sets across a network path with an RTT of more than around 5 ms.
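To see why a fixed buffer hurts on long paths, here is a rough sketch of the usual window/RTT ceiling on single-stream throughput. It assumes an idealized path (no packet loss, no protocol overhead), and the function name and sample RTT values are illustrative only, not taken from any tool.

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Idealized upper bound for one stream: at most one window per round trip.

    Assumption: no loss and no protocol overhead, so real transfers will be slower.
    """
    bits_per_round_trip = window_bytes * 8
    rtt_seconds = rtt_ms / 1000.0
    return bits_per_round_trip / rtt_seconds / 1e6


if __name__ == "__main__":
    OLD_BUFFER = 64 * 1024        # 64 KB buffer (OpenSSH older than 4.7)
    NEW_BUFFER = 1024 * 1024      # 1 MB buffer (OpenSSH 4.7 and later)
    for rtt_ms in (1, 5, 50, 100):  # illustrative RTT values
        old = max_throughput_mbps(OLD_BUFFER, rtt_ms)
        new = max_throughput_mbps(NEW_BUFFER, rtt_ms)
        print(f"RTT {rtt_ms:>3} ms: 64 KB -> {old:8.1f} Mbit/s, 1 MB -> {new:8.1f} Mbit/s")
```

Under these assumptions, a 1 MB buffer on a 50 ms path tops out near 168 Mbit/s, and a 64 KB buffer near 10.5 Mbit/s, no matter how fast the underlying link is.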

A patch to fix this problem (the HPN-SSH patch) is available from the Pittsburgh Supercomputing Center (PSC). This patch makes it possible to optimize single-stream performance on a WAN. However, to fully optimize bulk data transfers over a WAN, we recommend using one of the parallel-stream tools such as GridFTP.

sftp: Secure File Transfer Program

As described above, don't even consider using this program for WAN transfers unless you have installed the HPN patch from PSC. But even with the patch, sftp has the annoying characteristic of layering yet another flow control mechanism on top of everything else. By default, sftp limits the number of outstanding messages to 16, each carrying up to 32 KB. Since each datagram is a distinct message, you end up with a 512 KB limit on outstanding data (16 x 32 KB). You can, however, increase both the number of outstanding messages ('-R') and the size of each message ('-B') from the command line.

Sample sftp command for a 128MB window:

sftp -R 512 -B 262144 user@host:/path/to/file outfile
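The arithmetic behind the '-R' and '-B' values can be sketched as follows. This is only an illustration: the function names are hypothetical, the defaults are the ones quoted in the text above, and the bandwidth and RTT in the last example are assumed values rather than a recommendation.

```python
import math


def sftp_window_bytes(num_requests: int, buffer_bytes: int) -> int:
    """Outstanding data limit: number of requests (-R) times message size (-B)."""
    return num_requests * buffer_bytes


def requests_needed(bandwidth_mbps: float, rtt_ms: float, buffer_bytes: int) -> int:
    """Smallest -R whose window covers the bandwidth-delay product.

    Assumption: an idealized path, so this is a lower bound, not a tuned value.
    """
    bdp_bytes = bandwidth_mbps * 1e6 / 8 * (rtt_ms / 1000.0)
    return math.ceil(bdp_bytes / buffer_bytes)


if __name__ == "__main__":
    # Defaults as stated in the text: 16 messages of 32 KB each.
    print(sftp_window_bytes(16, 32 * 1024))    # 524288 bytes = 512 KB
    # The sample command: -R 512 -B 262144.
    print(sftp_window_bytes(512, 262144))      # 134217728 bytes = 128 MB
    # Hypothetical path: 1 Gbit/s with 100 ms RTT, using 256 KB messages.
    print(requests_needed(1000, 100, 262144))  # 48
```

In other words, the sample command's 128 MB window is far larger than the roughly 12.5 MB that a 1 Gbit/s, 100 ms path needs in flight, which leaves headroom for faster or longer paths.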