A small script to help organize your torrent downloads

A couple of days ago I wrote a small housekeeping script in Python that lists stale torrent data. I use it to help clean up the directory in which my torrent application stores its downloads. The script lists the files that occur in the given download directory but are not part of any torrent loaded by the torrent application. I find this useful to keep track of which files I am still up- or downloading, and which I am not. The files I am no longer transferring I might decide to remove, or move to a different directory.
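The core idea described above can be sketched as a set difference. This is not the code of my actual script, just a minimal illustration; the function name list_stale_files and the assumption that the known paths are given relative to the download directory are made up for this example.

```python
import os


def list_stale_files(download_dir, known_paths):
    """Return full paths of files under download_dir that are not known.

    known_paths is assumed to hold paths relative to download_dir,
    for example as collected from the loaded torrents.
    """
    known = {os.path.normpath(p) for p in known_paths}
    stale = []
    for root, _dirs, files in os.walk(download_dir):
        for name in files:
            full = os.path.join(root, name)
            # Compare on the path relative to the download directory.
            rel = os.path.normpath(os.path.relpath(full, download_dir))
            if rel not in known:
                stale.append(full)
    return sorted(stale)
```

Every file that is found on disk but is missing from the known set ends up in the result, with its full path.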

Finding out which torrents are in use by the torrent application turned out not to be that difficult for Transmission and Deluge, as both keep a directory with the current torrents. They do this regardless of whether you used a magnet link or an actual torrent file. To find out which files belong to a torrent, one needs to read its torrent file. As is shown on the bittorrent.org website, torrent files use a specific encoding called bencoding. As it turns out, Fredrik Lundh published a decoder in August 2007, which was very useful to me.
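To give an idea of what reading a torrent file involves, here is a minimal sketch of a bencoding decoder. It is not Fredrik Lundh's decoder and not the code from my script; the names bdecode and torrent_file_paths are invented for this example. It assumes the usual metainfo layout: an info dictionary with a name, and for multi-file torrents a files list in which each entry has a path list.

```python
import os


def bdecode(data, pos=0):
    """Decode one bencoded value from bytes; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)


def torrent_file_paths(torrent_bytes):
    """Return the relative paths of the files described by a torrent."""
    meta, _ = bdecode(torrent_bytes)
    info = meta[b"info"]
    name = info[b"name"].decode("utf-8", "replace")
    if b"files" not in info:              # single-file torrent
        return [name]
    paths = []
    for entry in info[b"files"]:          # multi-file torrent
        parts = [p.decode("utf-8", "replace") for p in entry[b"path"]]
        paths.append(os.path.join(name, *parts))
    return paths
```

Keys and strings stay as bytes in the decoder, because bencoded strings are byte strings; they are only decoded to text when the file paths are built.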

My script will list each stale file with its full path. Example usage:

    python listStaleTorrentData.py /home/user/.config/deluge/state /home/user/downloads/
    python listStaleTorrentData.py /home/user/.config/transmission/torrents/ /home/user/downloads/

The script will work with any torrent application that uses a directory with torrent files to store its state. Note that only one torrent application should be using the download folder to store files, because only that application's known torrents will be checked.

If you wish to pass this list to a command, such as the rm command on your UNIX(-like) operating system, you may have to tweak the output a bit. If the resulting filenames don't contain spaces and only contain those characters in the basic Latin block that are allowed by your filesystem, you can probably pass the output to rm by doing something like | xargs rm, provided that your platform has a utility such as xargs. If your filenames do contain spaces, you will have to tweak the output such that quotes are added. Stack Exchange has you covered on that front.

You may notice on that page that the accepted answer uses a null character as a filename separator. Not every script or application accepts every character as a separator, but there is a good reason to use that separator if you can. If you are using rm and xargs, you can separate filenames by a null character. This will prevent one silly but dangerous vector of attack, which is filenames with a newline in them. xargs has a command-line option (-0 in GNU and BSD xargs) that you can use to indicate that its input is null-separated data; rm needs no such option, because it receives the filenames as arguments from xargs. The null character can never appear in filenames (at least not to my knowledge).
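For a Python script, producing null-separated output could look roughly like the sketch below. My script does not do this as described here; write_null_separated is a hypothetical helper written for this example.

```python
import os
import sys


def write_null_separated(paths, stream=None):
    """Write each path followed by a NUL byte to a binary stream."""
    if stream is None:
        # Use the raw byte stream so no newline translation takes place.
        stream = sys.stdout.buffer
    for path in paths:
        stream.write(os.fsencode(path) + b"\0")
    stream.flush()
```

Output written this way can be consumed with something like | xargs -0 rm --, where -- tells rm that everything that follows is a filename, even if it starts with a dash.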

Regarding filenames with a newline in them, take a look at the following to see why they might be problematic (use a fresh, empty directory!):

    $ touch 'a
    b'
    $ touch 'b'
    $ ls
    a?b
    b
    $ ls | xargs rm
    rm: cannot remove `a': No such file or directory
    rm: cannot remove `b': No such file or directory
    $ ls
    a?b

The rm command tried to remove the files a and b after encountering a[newline]b. Of course a never existed, but b did, and it got removed. The second error appears because b was passed to rm twice: once as the second half of a[newline]b and once as the real file b, which by then was already gone. If the characters after the newline form a valid path, that file might be deleted if the file permissions allow it.
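The same ambiguity can be shown without touching any files. The sketch below mimics a consumer that splits its input on a separator; split_names is a made-up helper, and the two names are the ones from the example above.

```python
def split_names(stream_text, separator):
    """Split a separator-terminated stream of names, as a consumer would."""
    return [name for name in stream_text.split(separator) if name]


names = ["a\nb", "b"]
as_lines = "".join(name + "\n" for name in names)
as_nul = "".join(name + "\0" for name in names)

print(split_names(as_lines, "\n"))   # ['a', 'b', 'b']
print(split_names(as_nul, "\0"))     # ['a\nb', 'b']
```

With newlines as the separator, the two original names come back as three, and none of them is a[newline]b. With the null character, the list survives intact.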