Automatic upload speed throttling

As I’ve discussed before, the broadband service we receive from Virgin is subject to caps on the traffic on the line during certain times of day (see this post I wrote). Virgin have a relatively complex system for throttling your speed based on the amount you download and upload within certain periods. I have a script which backs up data from my server at home to my work computer using rsync over SSH, which means I sometimes have a large amount of data to upload during the day. Virgin’s upload restrictions apply between 1500 and 2000, during which time you are permitted to upload a maximum of 1.5GB. This means that over those five hours, the average transfer rate must not exceed 300MB per hour, which equates to a transfer speed of 83.3 kB/s. I searched for a way to limit my server’s upload speed and came across tc, the userspace tool for the kernel’s traffic control subsystem (which must be enabled in your kernel .config). Slackware’s kernel config includes the necessary parts as modules, so when you use the tc command, they are loaded automatically.
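For the record, the arithmetic behind that 83.3 kB/s figure, as a quick sanity check at the shell:

# 1.5GB spread evenly over the five-hour window (1500 to 2000):
echo $(( 1500000000 / (5 * 3600) ))    # 83333 bytes/s, i.e. ~83.3 kB/s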

I found a good example which described exactly what I wanted to achieve (limiting my uploads to 83.3 kB/s) here. Using the examples on that page, I wrote a little script to let me start and stop the throttling easily:

#!/bin/bash

# Script to throttle uploads during the times when Virgin
# won't allow unlimited uploads.
#
# Between 1500 and 2000 no more than 1.5GB may be uploaded,
# so the upload speed needs to be capped at 83.3kB/s.
# N.B. in tc's units, "kbps" means kilobytes per second
# ("kbit" would be kilobits per second).

maxRate=83.3kbps
burstRate=100000kbps
interface=eth0

clearRule(){
        # First things first, clear the old rule (quietly, in
        # case there's nothing to delete).
        tc qdisc del dev $interface root 2> /dev/null
}

makeRule(){
        # Now add the throttling rule: an htb root qdisc whose
        # default class caps traffic at $maxRate.
        tc qdisc add dev $interface root handle 1:0 htb default 10
        tc class add dev $interface parent 1:0 classid 1:10 htb rate $maxRate ceil $burstRate prio 0
}

listRules(){
        tc -s qdisc ls dev $interface
}

case "$1" in
        'start')
                clearRule
                makeRule
        ;;
        'stop')
                clearRule
        ;;
        'status')
                listRules
        ;;
        *)
                echo "Usage: $0 {start|stop|status}"
        ;;
esac

You’ll note I’ve set a $burstRate value of 100MB/s. This is probably not necessary, but with $burstRate set to the same value as $maxRate, the responsiveness of the remote session suffered noticeably; I hope the high burst ceiling alleviates that slowdown.

I saved this script somewhere only root could access it and then added the following cron jobs to root’s crontab:

# Add networking throttling between 1500 and 2000
00 15 * * * /root/throttle_uploads.sh start
00 20 * * * /root/throttle_uploads.sh stop

So far, the process appears to be working, insofar as my daily uploads from home to work have slowed to approximately 80kB/s during the times when Virgin monitor uploads.
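If you want to confirm the cap is actually biting, the class counters are the place to look (assuming the eth0 interface used in the script):

# Show the htb class created by the script; while an upload is in
# progress, the reported rate should sit near the configured cap.
tc -s class show dev eth0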

Dynamic DNS services and ddclient

Since changing from O2 to Virgin broadband, I’ve had to hope my dynamic external IP address didn’t change too often. As it turns out, Virgin appear to have a very slow turnover of IPs for home customers, since I’ve had the same one since we changed. I wrote a little script to scrape my IP from http://checkip.dyndns.org every 10 minutes and save the output to Dropbox, but this seemed less than optimal.
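That script was nothing sophisticated; the gist of it was something along these lines (the script name and output file here are illustrative):

#!/bin/bash

# Sketch of the old approach: grab the external IP from
# checkip.dyndns.org and stash it in Dropbox for later reference.
# Run from cron every 10 minutes with something like:
#   */10 * * * * /root/check_ip.sh
wget -qO- http://checkip.dyndns.org | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' > ~/Dropbox/external_ip.txt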

I recently read about a nifty little client called ddclient. If your router doesn’t support updating a dynamic DNS service (such as www.dyndns.org) for you, then having a tool like ddclient do it for you is pretty handy.

It’s a little Perl utility which, on Slackware, requires some external dependencies (perl-IO-Socket-SSL, Net-SSLeay, libwww-perl, perl-html-parser and perl-html-tagset), all available from SlackBuilds.org. Setup is pretty straightforward, and the default configuration file (/etc/ddclient/ddclient.conf) is pretty comprehensive. I found it a bit overwhelming, however, and the documentation on the website was more useful. In the end, my configuration looked like this:

# Stripped down version of the config file.
# Easier to manage, I think.

daemon=600                      # update every 10 minutes
syslog=yes                      # write to syslog
mail=root                       # all messages go to root
mail-failure=root               # failures sent to root too
pid=/var/run/ddclient.pid       # runtime pid file
# host setup
ssl=yes                         # use ssl when updating hostname
protocol=dyndns2                # using dyndns.org
use=web, web=checkip.dyndns.org/, web-skip='IP Address'
#use=web                        # get ip from internet
login=myusername                # username
password=mypassword             # duh…
my.hostname.com                 # my hostname

I set the daemon to run every ten minutes and it uses www.dyndns.org to update the IP associated with my.hostname.com. I altered the line use=web to use=web, web=checkip.dyndns.org/, web-skip='IP Address' because ddclient was throwing errors when looking up the checkip.dyndns.org address. In theory, this forces ddclient to parse the output from checkip.dyndns.org correctly.
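It’s worth testing the configuration before leaving the daemon to it; ddclient can be run once in the foreground with full debugging output:

# Perform a single update in the foreground with verbose debugging,
# then exit; handy for checking the web-skip parsing behaves.
ddclient -daemon=0 -debug -verbose -noquiet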

In the end, ddclient does a very similar thing to my suboptimal approach, except that it updates my dyndns account for me, rather than me having to do so by hand from the text file my script saved to Dropbox.

Dropbox vs. SpiderOak

There was a hoohah on the internet recently regarding Dropbox, its Terms of Service and the manner in which it encrypts your data on its servers. As a result, I heard talk of alternative systems for syncing files across a few machines. The prerequisites were that it needed to be cross-platform (Linux, Windows and Mac), to offer encryption at least as good as Dropbox’s, and to be relatively easy to use.

Top of the list of alternatives was SpiderOak, a service principally aimed at online backup (more often known as storing your data in the cloud these days) but which also provides functionality to sync backed up folders across several machines.

The first problem came with the installation of SpiderOak on Slackware. Their website provides a package for Slackware 12.1; at the time of that release, Slackware was 32 bit only, so there’s no 64 bit package available. There are 64 bit packages for a number of other distros, including Debian, Fedora, CentOS, openSUSE and Ubuntu. I decided to avoid the Slackware package since I wasn’t sure it would play nicely with a more recent version of Slackware (13.37). Instead, I set about writing a SlackBuild script to repackage one of the other distros’ packages. In the end, I settled on the Ubuntu .debs.

I modified the autotools SlackBuilds.org template to unpack the .deb files instead of the more usual tarballs. Running the SlackBuild produced a package, but after I installed it I received the following error message:

Traceback (most recent call last):
  File "<string>", line 5, in <module>
zipimport.ZipImportError: not a Zip file: '/usr/lib/SpiderOak/library.zip'

I found the relevant file in /usr/lib and the file utility confirmed it was not a zip file. After much head scratching, it turned out that the strip lines in the standard SlackBuild significantly alter a number of files in the SpiderOak package. After removing those lines, the package built, installed and ran as expected. edit: If you’re looking for a copy of the SpiderOak SlackBuild, I’ve uploaded it here, with the doinst.sh, spideroak.info and slack-desc too.
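For anyone attempting the same by hand, the heart of the repackaging is just unpacking the .deb payload into the staging directory; the culprit was the stock strip block from the template. Both are sketched below (the .deb filename is illustrative, and the strip block is reproduced from the standard SlackBuilds.org template):

# Extract the .deb payload into the package staging directory $PKG;
# the control archive isn't needed.
ar p spideroak_*_amd64.deb data.tar.gz | tar xzvf - -C $PKG

# The stock strip block which had to go, since it altered files in
# the SpiderOak package (library.zip evidently among them):
#find $PKG | xargs file | grep -e "executable" -e "shared object" \
#  | grep ELF | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null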

The next step was migrating all my data from Dropbox to SpiderOak. SpiderOak allows you to define a number of folders to back up, unlike Dropbox, where only a single folder can be synced. I created a new directory (~/SpiderOak seemed fitting) and copied the contents of my Dropbox folder over to the new SpiderOak folder. I added this folder as a folder to back up in SpiderOak and let it upload the files.
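The copy itself is trivial; using cp -a means hidden files and timestamps come across too:

# Create the new sync root and copy everything, dotfiles included.
mkdir ~/SpiderOak
cp -a ~/Dropbox/. ~/SpiderOak/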

I changed all the relevant scripts which used to dump data into my Dropbox folder to do so into the new SpiderOak folder instead. After setting up similar folders on my desktop (Windows) and my Mac (laptop), I was able to create a sync whereby all three versions would be synced with one another, effectively emulating the Dropbox functionality.

After a few days of dumping data into the folder, all seemed well. A few days after that, however, I started to worry that my 2GB of free storage on the SpiderOak servers was filling up rapidly. After some investigation, it became apparent that the manner in which SpiderOak works is slightly, though in my case crucially, different, and it relates to the way overwritten or deleted files are handled.

Dropbox stores old versions of files for up to 30 days after they have been overwritten or deleted, which allowed my frequent writes not to fill up all my space. SpiderOak, by contrast, never discards overwritten or deleted files. This is a nice feature if you accidentally delete an important file; for my purposes, however, it merely served to swallow my free 2GB of storage in pretty short order.

It is possible to empty the deleted items for a given profile (i.e. each machine currently being backed up to the SpiderOak servers). Irritatingly, however, it is not possible to empty the deleted items for one machine from a different machine; you can view the deleted files and how much space they’re taking up, but you can only purge them from the machine which deleted them. Since most of my files are created and deleted on my headless server, using VNC to open the SpiderOak GUI and empty the files every couple of days quickly lost its appeal. I searched through the SpiderOak FAQs and documentation (such as I could find) and found only one option for doing this automatically from the command line. The command is SpiderOak --empty-garbage-bin, though it is accompanied by this dire warning:

Dangerous/Support Commands: 

Caution: Do not use these commands unless advised by SpiderOak support. They can damage your installation if used improperly.
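Had I been braver, automating the purge on the headless server would have been a single cron job (illustrative only, given the warning above; the path to the binary is an assumption):

# Purge SpiderOak's deleted-file store nightly at 0300. Use at your
# own risk: SpiderOak class this as a support-only command.
00 03 * * * /usr/bin/SpiderOak --empty-garbage-bin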

So, for my purposes, the Dropbox approach of removing anything you’ve deleted or modified after 30 days is much more usable. Since I don’t keep anything particularly sensitive or critical on there, the hoohah about privacy and encryption isn’t much of a concern to me. After a month of SpiderOak, I was fed up with having to purge the deleted files over VNC, and so I moved everything back to Dropbox. I suppose the lesson there is “if it ain’t broke, don’t fix it”.