BitTorrent for Clusters
When staging large input files to the many nodes of a cluster, the traditional network copy approach (scp, rcp, NFS) quickly overwhelms the machine holding the original copy of the input file. BitTorrent, a peer-to-peer file distribution protocol, is naturally suited to file distribution when the file is large and many clients want it simultaneously: precisely the scenario when staging large input files for a distributed cluster job.
This infrastructure makes it possible to simultaneously distribute 40 copies of a 4Gb file to the nodes of a Linux cluster without bringing down the local network or the submit machine.
- Download the BitTorrent source from the SourceForge CVS
repository. The patch assumes version 3.4.2 (CVS tip as of December 2004).
cvs -d:pserver:email@example.com:/cvsroot/bittorrent login
cvs -z3 -d:pserver:firstname.lastname@example.org:/cvsroot/bittorrent co BitTorrent
- Download the patch and support files from edwardslab.bmcb.georgetown.edu.
cd BitTorrent
wget http://edwardslab.bmcb.georgetown.edu/software/downloads/script.tar.gz
- Unpack patch and support files, and apply patch.
gunzip -c script.tar.gz | tar xovf -
patch -p0 < script.patch
- Install in a suitable location. I use $HOME as the prefix argument, which installs files in $HOME/bin and $HOME/lib/python2.4/site-packages.
python setup.py install --prefix=$HOME
- Make sure your path contains the binary install location, $HOME/bin in my case; and the environment variable PYTHONPATH contains the library install location, $HOME/lib/python2.4/site-packages in my case.
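The last step can be sketched as a few lines for a shell startup file. This assumes the $HOME prefix and Python 2.4 paths used above; BT_PREFIX is a hypothetical helper variable introduced here for illustration and is not read by the scripts.

```shell
# Sketch only: assumes the install prefix was $HOME and Python 2.4.
# BT_PREFIX is a hypothetical convenience variable, not used by the scripts.
BT_PREFIX="$HOME"

# Put the installed wrapper scripts and BitTorrent programs on the path.
PATH="$BT_PREFIX/bin:$PATH"

# Prepend the library location, keeping any existing PYTHONPATH entries.
PYTHONPATH="$BT_PREFIX/lib/python2.4/site-packages${PYTHONPATH:+:$PYTHONPATH}"

export PATH PYTHONPATH
```

If a different prefix was passed to setup.py, only the BT_PREFIX line needs to change.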
The wrapper scripts are very simple. They make distributing files to a cluster as simple as using scp.
- btinit.sh [ -D n [ -E ] ] file1 ... fileN
- Starts the BitTorrent tracker server, if necessary, and a single seeding client per file. A 4-tuple of information is output for each input file: filename, server, port, and info-hash. Usually run on the cluster job submission machine. Computes a checksum of the file to avoid creating multiple torrents for the same file. Directories are also handled, in which case no checksum is computed: if a seeder serving a torrent with the same name is found, it is killed, and if a torrent with the same name is found, it is recomputed. The -D option instructs the seeding client to terminate after n clients have downloaded the file or directory. The -E option instructs the tracker to terminate when no clients have contacted the tracker in the last 10 minutes. Use these options with caution, as they have not been particularly well tested. The -E option, in particular, may terminate the tracker if only the seeding client is running, since the seeding client has a complete copy of the file or directory and doesn't need to contact the tracker to find out where to get torrent chunks.
- bt.sh filename server port info-hash
- Downloads the file tracked by the server server on port port with hash info-hash to the local file filename. The client waits until it has been idle for 1 minute before exiting, to give its peers a chance to ask for pieces of the file. Usually run by each node, before running its job, to download the required input files. Failure to download the file is indicated by a non-zero exit status.
- btfini.sh file1 ... fileN
- btfini.sh -a
- Kills the seeding client for each listed file (or all files with -a) and removes the torrent file from the tracker's directory. If no torrent files remain to be tracked, the tracker is killed and all tracker files are removed from /tmp. Run this on the same machine as btinit.sh.
- btstatus.sh
- Prints a table showing various statistics for each of the torrents being tracked by the tracker. Run this on the same machine as btinit.sh.
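The checksum test that btinit.sh uses to avoid remaking a torrent for an unchanged file could look roughly like the sketch below. This is not the code from btinit.sh: the function name and the idea of a separate stamp file are assumptions made for illustration; only the use of cksum comes from the list of required programs.

```shell
# Hypothetical helper: exit 0 when FILE differs from the checksum recorded in
# STAMP (and refresh STAMP), exit 1 when FILE is unchanged.
# The stamp file is an assumption; btinit.sh may record checksums differently.
file_changed() {
    file=$1
    stamp=$2
    # Reading from stdin makes cksum print "CRC SIZE" without the file name.
    new=$(cksum < "$file")
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$new" ]; then
        return 1    # unchanged: the existing torrent can be reused
    fi
    printf '%s\n' "$new" > "$stamp"
    return 0        # new or modified: the torrent must be (re)made
}
```

A caller would remake the torrent only when file_changed succeeds, which matches the "torrent already made and file hasn't changed" behaviour described above.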
- Initiate the torrent on genesub00 for the files nraa.fasta, nrdb.fasta and msdb.fasta.
genesub00> ls -l nraa.fasta nrdb.fasta msdb.fasta
-rw-rw-r-- 1 nedwards nedwards 651374714 Nov 25 19:41 msdb.fasta
-rw-rw-r-- 1 nedwards nedwards 1100528784 Nov 25 19:42 nraa.fasta
-rw-rw-r-- 1 nedwards nedwards 760915396 Nov 25 19:45 nrdb.fasta
genesub00> btinit.sh nraa.fasta nrdb.fasta
Started torrent tracker server...
Make torrent file for nraa.fasta...
Make torrent file for nrdb.fasta...
Started torrent uploader for nraa.fasta...
Started torrent uploader for nrdb.fasta...
nraa.fasta genesub00.umiacs.umd.edu 7878 3b89f9eb2b8ba2e464e7d2184b523f73896e5568
nrdb.fasta genesub00.umiacs.umd.edu 7878 538b7e44f5c53ef6806502589d0f291b9aa8efca
genesub00> btstatus.sh
name size complete dling dled transferr
nraa.fasta 1.02GB 1 0 0 0B 3b89f9eb2b8ba2e464e7d2184b523f73896e5568
nrdb.fasta 725MB 1 0 0 0B 538b7e44f5c53ef6806502589d0f291b9aa8efca
Total 1.73GB 1/2 0/0 0/0 0B
genesub00> btinit.sh msdb.fasta nrdb.fasta
Torrent tracker server already running...
Make torrent file for msdb.fasta...
File nrdb.fasta torrent already made and file hasn't changed
Started torrent uploader for msdb.fasta...
Torrent uploader already running for nrdb.fasta...
msdb.fasta genesub00 7878 26a9afd2c998e94bc25f8798394e4b8eb3845225
nrdb.fasta genesub00 7878 538b7e44f5c53ef6806502589d0f291b9aa8efca
genesub00> btstatus.sh
name size complete dling dled transferr
msdb.fasta 621MB 1 0 0 0B 26a9afd2c998e94bc25f8798394e4b8eb3845225
nraa.fasta 1.02GB 1 0 0 0B 3b89f9eb2b8ba2e464e7d2184b523f73896e5568
nrdb.fasta 725MB 1 0 0 0B 538b7e44f5c53ef6806502589d0f291b9aa8efca
Total 2.34GB 1/3 0/0 0/0 0B
- Download the file nrdb.fasta on genesub01 and genesub02.
genesub01> bt.sh nrdb.fasta genesub00 7878 538b7e44f5c53ef6806502589d0f291b9aa8efca &
genesub02> bt.sh nrdb.fasta genesub00 7878 538b7e44f5c53ef6806502589d0f291b9aa8efca &
genesub00> btstatus.sh
name size complete dling dled transferr
msdb.fasta 621MB 1 0 0 0B 26a9afd2c998e94bc25f8798394e4b8eb3845225
nraa.fasta 1.02GB 1 0 0 0B 3b89f9eb2b8ba2e464e7d2184b523f73896e5568
nrdb.fasta 725MB 1 2 0 0B 538b7e44f5c53ef6806502589d0f291b9aa8efca
Total 2.34GB 1/3 2/2 0/0 0B
- Wait for completion.
genesub01> ls -l nrdb.fasta
-rw-rw-r-- 1 nedwards nedwards 760915396 Dec 21 16:10 nrdb.fasta
genesub02> ls -l nrdb.fasta
-rw-rw-r-- 1 nedwards nedwards 760915396 Dec 21 16:10 nrdb.fasta
genesub00> btstatus.sh
name size complete dling dled transferr
msdb.fasta 621MB 1 0 0 0B 26a9afd2c998e94bc25f8798394e4b8eb3845225
nraa.fasta 1.02GB 1 0 0 0B 3b89f9eb2b8ba2e464e7d2184b523f73896e5568
nrdb.fasta 725MB 3 0 2 1.41GB 538b7e44f5c53ef6806502589d0f291b9aa8efca
Total 2.34GB 3/5 0/0 2/2 1.41GB
genesub00> btfini.sh -a
genesub00> btstatus.sh
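On the compute nodes, the 4-tuples printed by btinit.sh can drive the downloads directly. The sketch below is one possible job-wrapper fragment, not part of the distributed scripts: fetch_inputs and the BT_FETCH override are hypothetical names introduced here, and the 40-hex-digit filter is a precaution in case the status messages of btinit.sh share the same output stream as the tuples, which the text does not state.

```shell
# Hypothetical helper: read "filename server port info-hash" tuples from
# standard input and download each file with bt.sh.
# BT_FETCH is an assumption added so the loop can be dry-run (e.g. with echo).
fetch_inputs() {
    while read -r name server port hash; do
        # Keep only lines whose fourth field is a 40-hex-digit info-hash;
        # this skips blank lines and any status messages.
        printf '%s\n' "$hash" | grep -Eq '^[0-9a-f]{40}$' || continue
        # </dev/null stops the downloader from consuming the tuple list.
        if ! "${BT_FETCH:-bt.sh}" "$name" "$server" "$port" "$hash" </dev/null; then
            echo "download of $name failed" >&2
            return 1
        fi
    done
    return 0
}
```

A job script might save the output of btinit.sh to a file on the submit machine, ship that small file with the job, and run `fetch_inputs < tuples.txt || exit 1` before starting the real work, relying on the non-zero exit status of bt.sh to detect failed downloads.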
There are a few things going on behind the scenes that are important to know about.
The btinit.sh script creates torrents and the various other files the tracker server uses in /tmp. All files used by a particular tracker match a common filename template in which UUUU is the username of the user running the tracker and PPPP is the port number it is bound to.
Tracker Server Port
The tracker server needs to bind to some port to listen for requests from clients. The temporary files in /tmp are used to indicate whether or not a port is currently in use by some user. The btinit.sh script searches for an unused port before starting the tracker server. The port used by the tracker server can also be set explicitly through an environment variable.
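The port search can be pictured with the sketch below. It is an illustration of the idea only, not the code in btinit.sh: the function name, the port range arguments, and the "anything.PORT.anything" marker-file naming are all hypothetical stand-ins for the real file template.

```shell
# Sketch only: find the first port in [FIRST, LAST] that has no marker file
# in DIR. The "*.PORT.*" naming is a hypothetical stand-in for the real
# template used by btinit.sh.
find_free_port() {
    dir=$1
    port=$2
    last=$3
    while [ "$port" -le "$last" ]; do
        # A port counts as free when no marker file mentions it.
        if ! ls "$dir"/*."$port".* >/dev/null 2>&1; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1    # every port in the range is already claimed
}
```

Note that a marker-file check like this only coordinates trackers started through the same scripts; it says nothing about ports held by unrelated programs.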
Permitted IP Addresses
The tracker server and BitTorrent clients take a regular expression argument which constrains the IP addresses that will be permitted to contact the tracker server or client. Currently, the scripts use a regular expression that allows any valid IP address to connect to the tracker and to the downloading and seeding clients. The regular expression can be set explicitly in the bt.sh and btinit.sh scripts or by setting the environment variable BTIPRE. Note, however, that this security measure is only as good as the firewalls' ability to block IP spoofing attacks.
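As an illustration, BTIPRE could be set to restrict connections to a single private subnet. The 192.168.1.x subnet is a made-up example, and the explicit anchors are included on the assumption that the scripts may not anchor the match themselves. The ip_allowed helper is hypothetical and only sanity-checks the pattern with grep -E; this particular pattern means the same thing in grep's extended syntax and in Python's.

```shell
# Hypothetical subnet: allow only addresses of the form 192.168.1.x.
# The ^ and $ anchors are an assumption about how the pattern is applied.
BTIPRE='^192\.168\.1\.[0-9]{1,3}$'
export BTIPRE

# Hypothetical helper: check an address against BTIPRE before relying on it.
ip_allowed() {
    printf '%s\n' "$1" | grep -Eq "$BTIPRE"
}
```

For example, `ip_allowed 192.168.1.17` succeeds while `ip_allowed 10.0.0.5` fails. The variable has to be set in the environment of both btinit.sh on the submit machine and bt.sh on each node for the restriction to apply everywhere.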
The scripts assume the following typical unix/linux programs are available in the user's path: python, sh, ls, wc, whoami, uname, ps, cat, test (as '['), echo, sed, kill, sleep, rm, wget, mkdir, cksum, expr, awk, fgrep; and that they behave in relatively standard ways.
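A quick way to check this assumption on a new machine is sketched below. The missing_programs function is a hypothetical helper, not part of the distribution; it only tests that each program can be found, not that it behaves in a standard way.

```shell
# Hypothetical helper: print the name of every listed program that cannot be
# found on the current PATH (shell builtins such as '[' and echo also count).
missing_programs() {
    for prog in "$@"; do
        command -v "$prog" >/dev/null 2>&1 || echo "$prog"
    done
}

# The list of programs the scripts assume are available.
missing_programs python sh ls wc whoami uname ps cat '[' echo sed kill sleep \
    rm wget mkdir cksum expr awk fgrep
```

No output means every program was found.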