I've finally gotten around to adding PostgreSQL support to Adamanteus! It wasn't really difficult, just hard to find the time with everything that's been going on lately (quite busy at work and with the new house). The usage of Adamanteus hasn't changed at all; you can now specify 'postgres' as your backend in addition to 'mongodb' or 'mysql'.
There is one slight caveat, however: the pg_dump utility does not accept a password on the command line. For the time being, at least, that means that if you're using Adamanteus to back up a PostgreSQL database you can't specify a password (it will throw an exception if you do). The solution is either to 1) set up a read-only passwordless user for running backups, or 2) set up a .pgpass file on the machine from which you intend to run your backups (documentation here). I recommend the .pgpass option; it's quick and easy.
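For the .pgpass route, here is a minimal sketch of what the setup amounts to. It is not Adamanteus's own code: the function names and the 'backup' user are made up for illustration. What it relies on is the documented .pgpass format (one hostname:port:database:username:password entry per line, with the file readable only by its owner) and pg_dump's --no-password flag, which makes it fail instead of prompting.

```python
import os
import stat


def pgpass_line(host, port, database, user, password):
    """Format one .pgpass entry: hostname:port:database:username:password."""
    def escape(field):
        # Backslashes and colons inside a field must be backslash-escaped.
        return str(field).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(escape(f) for f in (host, port, database, user, password))


def write_pgpass(path, line):
    """Append an entry and restrict the file to its owner (mode 0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    with os.fdopen(fd, "a") as handle:
        handle.write(line + "\n")
    # Also tighten the mode in case the file already existed.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def pg_dump_command(database, outfile, host="localhost", port=5432,
                    user="backup"):  # 'backup' is a hypothetical user name
    """Build a pg_dump call that never prompts for a password."""
    return ["pg_dump", "--host", host, "--port", str(port),
            "--username", user, "--no-password",
            "--file", outfile, database]
```

Writing the entry to ~/.pgpass and then running the command list with subprocess would give you a non-interactive dump, since the PostgreSQL client library looks the password up in that file on its own.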
Adamanteus 0.5 is available from both Bitbucket and PyPI.
Backing up databases is one of those things that I've always felt could be done in a better way. Traditionally I've done it with a simple shell script that used mysqldump or pg_dump to dump my database to an SQL file named using a timestamp, compress it, and maybe scp it off to some remote server for redundancy. This approach works just fine, except that I recently took a look at my backup directory for a project using that setup only to discover that there were nearly 5000 backup files taking up 11 GB (and this is using bzip2 to compress them!). Obviously not an optimal situation, especially considering that really very little changes from backup to backup, and it's quite possible that nothing changes at all for some of them. It simply makes no sense to store an entire dump of your database every single time!
Fortunately, this is a very familiar situation that we've got advanced tools to handle: version control systems. So I decided to write a little program to replace my shell script that would use a modern, advanced version control system to provide a much more reasonable solution. What I came up with was Adamanteus, a command line program written in Python that allows you to back up your database into a Mercurial repository. It currently supports MongoDB and MySQL, and I plan on adding PostgreSQL support this weekend.
Using Mercurial immediately solves basically all the problems with my original approach. It stores diffs rather than full files, meaning you aren't wasting space with a lot of duplicate information. It also handles compression transparently, keeping the file sizes down even for the diffs. Plus, because Mercurial is a distributed version control system, it's very easy to provide redundancy by pushing and pulling to and from remote repositories. (Pushing/pulling to/from remote repositories isn't currently implemented, but that's also in my plans for this weekend.)
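To make the idea concrete, here is a rough sketch of the dump-then-commit step. This is an illustration of the approach, not how Adamanteus is actually implemented: commit_dump and its parameters are hypothetical, it shells out to the hg command line, and it takes the dump as a string for simplicity. The only hg behavior it relies on is that hg commit exits with status 1 when nothing has changed.

```python
import os
import subprocess


def commit_dump(repo_dir, filename, dump_text, message, run=subprocess.call):
    """Write a dump into a Mercurial repo and commit it.

    Returns True if a new changeset was created, False if the dump was
    identical to the previous one. `run` is injectable for testing.
    """
    if not os.path.isdir(os.path.join(repo_dir, ".hg")):
        run(["hg", "init", repo_dir])
    # Always overwrite the same file, so Mercurial records a diff against
    # the previous dump instead of a new timestamped file on every run.
    with open(os.path.join(repo_dir, filename), "w") as handle:
        handle.write(dump_text)
    # --addremove starts tracking the file the first time around.
    status = run(["hg", "commit", "--addremove", "-m", message], cwd=repo_dir)
    if status not in (0, 1):
        raise RuntimeError("hg commit failed with exit status %d" % status)
    return status == 0
```

The nice side effect is that a backup run where nothing changed creates no new changeset at all, which is exactly the case that was wasting space with the timestamped files.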
The project is far from complete, but I think it's sufficiently far developed to release as 0.1. Plans for the 1.0 release include:
- PostgreSQL support
- The ability to restore your database from a particular revision in the repository
- Automated cloning/pushing/pulling of the repository
- Integration with Django as a management command
I think this is actually pretty close, and it probably won't take too long for me to implement all of those, so hopefully I'll be able to push out a 1.0 release very soon. The one other issue holding up 1.0 is that I'd like to wait for MongoDB 1.5, which will bring mongoexport functionality in line with mongodump, which is what I'm currently using. The issue here is that mongodump produces binary data files, which don't play quite as nicely with version control and lose you the advantage of only storing diffs. mongoexport will export JSON or CSV files, which will allow it to take full advantage of Mercurial, but until 1.5 there's no easy way to use mongoexport to dump all the collections in a database, which is the default behavior for mongodump.
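In the meantime, the workaround would look something like running mongoexport once per collection. The sketch below only builds the command lines and assumes the caller already has the list of collection names; the function name and the one-file-per-collection layout are my own choices for illustration, not something Adamanteus does today.

```python
import os


def mongoexport_commands(database, collections, out_dir):
    """Return one mongoexport command per collection, each writing JSON."""
    commands = []
    # Sort the names so the output files are produced in a stable order.
    for name in sorted(collections):
        outfile = os.path.join(out_dir, name + ".json")
        commands.append(["mongoexport", "--db", database,
                         "--collection", name, "--out", outfile])
    return commands
```

Each command can then be handed to subprocess, and the resulting text files diff far better under Mercurial than mongodump's binary output.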
Anyway, I'm definitely looking forward to some feedback on this project, as I suspect it could be quite useful to many people. Contributions are always welcome as well!
My friend Sebastian Celis recently posted on his blog about a zsh prompt for Git users. Basically, it's a set of scripts for ZSH that allow it to display the current status of the Git repo you're currently in. Very cool stuff, but unfortunately I don't use Git (very often), and instead use Mercurial for most of my projects. So I decided to modify it to work with Mercurial.
Very little has changed from his Git version (in fact, in most files it was a simple s/git/hg/), so I'm not going to go over how it all works. If you want to know that, you should read his original blog post. Instead, I'm just going to link to my Bitbucket project for it: Mercurial for ZSH.
It is, at this point, a pretty half-assed port. There's still some work to do to fine-tune it for Mercurial, but it works. Another thing I'm interested in doing is seeing if I can get it to auto-detect what VCS is being used for the current directory and act accordingly, so that it doesn't have to be limited to either Mercurial or Git (which goes along nicely with another project that I'm working on and will hopefully be able to write about soon). But, half-assed or not, I think it may be useful to anyone out there using both ZSH and Mercurial (or any other VCS, if you want to fork the code again).
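The auto-detection idea is simple enough to sketch: walk up from the current directory until you find a .hg or .git entry. The version below is in Python rather than zsh, purely to show the logic; detect_vcs and the marker table are hypothetical and not part of the prompt scripts.

```python
import os

# Marker entry in a working copy's root -> name of the VCS.
VCS_MARKERS = ((".hg", "hg"), (".git", "git"))


def detect_vcs(path):
    """Return 'hg', 'git', or None for the working copy containing path."""
    current = os.path.abspath(path)
    while True:
        for marker, name in VCS_MARKERS:
            # exists() rather than isdir(): .git can also be a plain file.
            if os.path.exists(os.path.join(current, marker)):
                return name
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a marker.
            return None
        current = parent
```

Because the search starts at the innermost directory, a Mercurial repository nested inside a Git checkout (or the other way around) reports whichever one is closest.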