Data Transfer Performance Expectations
Determining Performance Expectations
Knowing what performance to expect from your data transfer tool can be tricky, as the bottleneck might be the network, the disk, or the CPU. Start with memory-to-memory tests, which isolate network performance from disk performance.
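To make the idea of a memory-to-memory test concrete, here is a minimal Python sketch. It is an illustration only, not a replacement for perfSONAR or for your transfer tool: it sends zero-filled buffers over a single TCP stream and discards them on arrival, so no disk is involved. For simplicity it runs both ends on the loopback interface, which is an assumption of this sketch; in a real test the receiver would run on the remote host. The function names are hypothetical.

```python
import socket
import threading
import time

CHUNK = 1 << 20  # 1 MiB of zeros per send, analogous to reading /dev/zero


def _sink(server: socket.socket, result: dict) -> None:
    """Accept one connection and discard everything received (like /dev/null)."""
    conn, _ = server.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # sender closed the connection
                break
            total += len(data)
    result["bytes"] = total


def memory_to_memory_test(total_bytes: int, host: str = "127.0.0.1"):
    """Send total_bytes of zeros over one TCP stream.

    Returns (bytes_received, throughput_in_gbit_per_s).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result), daemon=True)
    receiver.start()

    payload = bytes(CHUNK)  # zero-filled buffer held in memory
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()  # wait until the receiver has drained the stream
    elapsed = time.perf_counter() - start
    server.close()

    received = result["bytes"]
    return received, (received * 8) / elapsed / 1e9


if __name__ == "__main__":
    received, gbps = memory_to_memory_test(256 * 1024 * 1024)
    print(f"received {received} bytes at {gbps:.2f} Gbit/s")
```

Because neither end touches a filesystem, the number this reports reflects only the hosts and the path between them. A loopback figure says nothing about a wide-area path, so treat the output as a demonstration of the method rather than a measurement of your network.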
Here are some steps to help guide this testing.
- Install a perfSONAR Toolkit, or one of the perfSONAR Bundles, and establish regular testing to the remote end of where you will be transferring data. If the other end does not have a perfSONAR node, ask them to set one up.
- Over the course of a week, evaluate the performance between the facilities. Note that if both ends are tuned and using perfSONAR tools, the numbers they report will represent the average network performance for a short test using a single TCP stream. This forms the baseline for what the network is capable of.
- Data transfer tools normally use parallel streams, so they could in fact use more of the available network resources, but they must also read from and write to physical media. Without tuning, this can become the bottleneck.
- Pick the tool you will be testing, and ensure it is installed and configured on both ends. It is recommended that you follow the guidelines for tuning your data transfer node.
- Invoke the tool so that it reads from /dev/zero on one side and writes to /dev/null on the other. This does not use the disk on either end, and gives us another important baseline: how well the tool can perform over the network using the hosts. We would expect this number to be the same as, or better than, the perfSONAR result. If it is worse, read the documentation for the tool to ensure it is tuned properly.
- Invoke the tool again and test read performance independent of write performance:
  - Test reading from the disk (e.g. a test file you select or make yourself) on one end, and write to /dev/null on the other. This will test the read performance from one of the hosts, but ignore the write performance on the other side.
  - Read from /dev/zero on one end, and write to the filesystem on the other. This will test the write performance from one of the hosts, and ignore the read performance on the other side.
- Finally, test reading and writing a file to and from each host.
- The results from the previous steps may all be different. By structuring the tests in this manner, you can identify the areas that need improvement, whether through tuning the software and hardware or because of other factors such as the network.
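One way to read the results of the steps above is to compare each stage against the stage before it. The following Python sketch does that under stated assumptions: the `Measurements` fields, the `diagnose` function, and the 10% tolerance are all hypothetical choices for illustration, and the throughput values you pass in would come from your own tests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Measurements:
    """Throughput in Gbit/s from each test stage (values supplied by you)."""
    perfsonar: float     # single-stream network baseline
    mem_to_mem: float    # tool: /dev/zero -> /dev/null
    disk_to_mem: float   # tool: source disk -> /dev/null
    mem_to_disk: float   # tool: /dev/zero -> destination filesystem
    disk_to_disk: float  # tool: file on source -> file on destination


def diagnose(m: Measurements, tolerance: float = 0.9) -> List[str]:
    """Flag each stage that falls noticeably below the stage it builds on.

    `tolerance` is an assumed threshold: a stage is flagged when it reaches
    less than that fraction of its reference value.
    """
    findings = []
    if m.mem_to_mem < tolerance * m.perfsonar:
        findings.append("tool or host tuning: memory-to-memory is below the network baseline")
    if m.disk_to_mem < tolerance * m.mem_to_mem:
        findings.append("read performance on the source host")
    if m.mem_to_disk < tolerance * m.mem_to_mem:
        findings.append("write performance on the destination host")
    if m.disk_to_disk < tolerance * min(m.disk_to_mem, m.mem_to_disk):
        findings.append("combined read and write is slower than either side alone")
    return findings


if __name__ == "__main__":
    # Made-up example numbers, not real measurements.
    example = Measurements(perfsonar=9.4, mem_to_mem=9.6, disk_to_mem=9.5,
                           mem_to_disk=4.0, disk_to_disk=3.9)
    for finding in diagnose(example):
        print("Investigate:", finding)
```

With the made-up numbers in the example, only the destination write stage is flagged, which would point at the receiving filesystem rather than the network or the tool.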