I've now finally gotten around to adding PostgreSQL support to Adamanteus! It wasn't really difficult, just hard to find time to do with everything that's been going on lately (quite busy at work and with the new house). The usage of Adamanteus hasn't changed at all; you can now simply specify 'postgres' as your backend in addition to 'mongodb' or 'mysql'.
There is one slight caveat, however: the pg_dump utility does not accept a password as a command-line option. For the time being, at least, that means that if you're using Adamanteus to back up a PostgreSQL database you can't specify a password (it will throw an exception if you do). The solution to this is to either 1) set up a read-only passwordless user for running backups, or 2) set up a .pgpass file on the machine from which you intend to run your backups (documentation here). I recommend the .pgpass option; it's quick and easy.
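For anyone who hasn't set up a .pgpass file before, here is a minimal sketch of what goes into one. Each line has the form hostname:port:database:username:password, and the file must be readable only by its owner or it is ignored. The helper names (pgpass_line, write_pgpass) and the example values are mine for illustration; they are not part of Adamanteus.

```python
import os
import stat


def pgpass_line(host, port, database, user, password):
    """Format one .pgpass entry: hostname:port:database:username:password."""
    def escape(value):
        # A literal backslash or colon inside a field must be backslash-escaped.
        return str(value).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(escape(v) for v in (host, port, database, user, password))


def write_pgpass(path, lines):
    """Write the entries and restrict the file to its owner (mode 0600)."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    # The file is ignored if group or others have any access to it.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling write_pgpass(os.path.expanduser("~/.pgpass"), [pgpass_line("localhost", 5432, "mydb", "backup", "secret")]) would produce a one-line file for a hypothetical 'backup' user.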
Adamanteus 0.5 is available from both Bitbucket and PyPI.
Backing up databases is one of those things that I've always felt could be done in a better way. Traditionally I've done it with a simple shell script that used mysqldump or pg_dump to dump my database to an SQL file named using a timestamp, compress it, and maybe scp it off to some remote server for redundancy. This approach works just fine, except that I recently took a look at my backup directory for a project using that setup only to discover that there were nearly 5000 backup files taking up 11 GB (and this is using bzip2 to compress them!). Obviously not an optimal situation, especially considering that really very little changes from backup to backup, and it's quite possible that nothing changes at all for some of them. It simply makes no sense to store an entire dump of your database every single time!
Fortunately, this is a very familiar situation that we've got advanced tools to handle: version control systems. So I decided to write a little program to replace my shell script that would use a modern, advanced version control system to provide a much more reasonable solution. What I came up with was Adamanteus, a command line program written in Python that allows you to back up your database into a Mercurial repository. It currently supports MongoDB and MySQL, and I plan on adding PostgreSQL support this weekend.
Using Mercurial immediately solves basically all the problems with my original approach. It stores diffs rather than full files, meaning you aren't wasting space with a lot of duplicate information. It also handles compression transparently, keeping the file sizes down even for the diffs. Plus, because Mercurial is a distributed version control system, it's very easy to provide redundancy by pushing and pulling to and from remote repositories. (Pushing/pulling to/from remote repositories isn't currently implemented, but that's also in my plans for this weekend.)
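To make the idea concrete, here is a rough sketch of the Mercurial side of a backup run. It is not the actual Adamanteus code: the function name backup_commands is made up, and it assumes the dump has already been written into the repository directory under the same file name every time, so that each commit only records what changed.

```python
import os


def backup_commands(repo_dir, message):
    """Return the hg command lines needed to record one backup run."""
    commands = []
    # Create the repository the first time a backup runs.
    if not os.path.isdir(os.path.join(repo_dir, ".hg")):
        commands.append(["hg", "init", repo_dir])
    # --addremove picks up a newly created dump file; later runs
    # simply commit the changes to the already tracked file.
    commands.append(
        ["hg", "--cwd", repo_dir, "commit", "--addremove", "-m", message]
    )
    return commands
```

Each command list can be handed to subprocess.call. Note that hg commit exits with status 1 when nothing has changed, which for a backup just means there was nothing new to store.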
The project is far from complete, but I think it's sufficiently far developed to release as 0.1. Plans for the 1.0 release include:
- PostgreSQL support
- The ability to restore your database from a particular revision in the repository
- Automated cloning/pushing/pulling of the repository
- Integration with Django as a management command
I think this is actually pretty close and it probably won't take too long for me to implement all of those, so hopefully I'll be able to push out a 1.0 release very soon. The one other issue holding up 1.0 is that I'd like to wait for MongoDB 1.5, which will bring mongoexport's functionality in line with mongodump, which is what I'm currently using. The issue here is that mongodump produces binary data files, which don't play quite as nicely with version control and lose you the advantage of only storing diffs. mongoexport exports JSON or CSV files, which will allow it to take full advantage of Mercurial, but until 1.5 there's no easy way to use mongoexport to dump all the collections in a database, which is the default behavior for mongodump.
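In the meantime, the workaround amounts to running mongoexport once per collection. A minimal sketch of that loop, assuming you already have the list of collection names from somewhere (the function name and the one-file-per-collection layout are my own choices, not something Adamanteus does today):

```python
import os


def mongoexport_commands(database, collections, out_dir):
    """Build one mongoexport command line per collection."""
    return [
        ["mongoexport", "--db", database, "--collection", name,
         "--out", os.path.join(out_dir, name + ".json")]
        # Sorted so the commands come out in a stable order on every run.
        for name in sorted(collections)
    ]
```

Writing each collection to its own text file keeps the diffs small and readable when only one collection has changed.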
Anyway, I'm definitely looking forward to some feedback on this project, as I suspect it could be quite useful to many people. Contributions are always welcome as well!
I've been nothing but impressed with the service I've got from WebFaction and the reliability I've gotten from their servers. I even had a very high traffic site run on a WebFaction shared account without a hitch.
Today, however, we got a firsthand look at the downside of a shared server. If you're a WebFaction customer you should be (no really, you should) subscribed to their status blog. It's a great way to be kept up to date on any issues that might affect your site. The most recent issue has to do with the MySQL server on web49: the server that we happen to be using to develop a very large project. As you can see from reading the entry, the problem appears to have been caused by a corrupted database table (not one of ours) which was causing some unreliability with the database server (our Django-based site was intermittently unable to connect to the database and, when it could, intermittently unable to authenticate). Though they thought they had it fixed, the problem returned, and while they're attempting to fix it for good they've rolled back the entire database server to a known good backup.
This is a good approach as it means that everyone should still have most of their data in the meantime. Unfortunately, we happen to be in the middle of populating the database with the data we need to go live in the near future. Rolling back the database even by a day means that we've lost a ton of work. We should get it all back once the problem is fixed, but of course that means that we have to put the work on hiatus until then to avoid versioning issues.
This right here is the perfect illustration of why a dedicated server is a good idea. Yes, a shared server might be able to support your site. But it also leaves you vulnerable to the actions of the people you're sharing the server with. If someone else does something stupid that corrupts a database table on a server that they share with you, suddenly you stand to lose a lot of work. If someone has a poorly written app that somehow manages to crash the server or even just eat up all the RAM, your site goes down. With a shared server you simply don't have the security of knowing that your site is stable and secure, even if you trust your hosting company and you trust your code.
That security is what you're paying for when you get a dedicated server over a shared one.
Tags: dedicated, dedicated server, django, downtime, hosting, mysql, security, server, shared, shared server, stability, uptime, webfaction
I've been using WebFaction for my hosting for a while now, and have been extremely pleased with them. In addition to the fantastic service I've received, I've been very impressed with the intelligent way they have their servers set up. They've clearly done a lot of work to make things as modular as possible, which makes it insanely easy for me to run multiple sites with very different requirements seamlessly on the same server.
Basically what they've done is segment out all of your different websites into 'applications'. Each application is represented as a directory in your ~/webapps/ directory, and is essentially a self-contained environment with its own Apache instance and, in the case of a Django app, its own $PYTHONPATH. The end result is that even though all the websites are being stored and run from within my home directory, they're entirely modular, can have different dependencies (or different versions of the same dependencies) installed, and can be shut down and restarted independently of one another. On top of all this is a fantastically simple custom web-based control panel that I'm pretty sure is built with Django.
I've been so impressed with how well this setup works that I've decided to duplicate it on my home server for development purposes. Currently I do pretty much all my development work on my Gentoo Linux powered ThinkPad. To that end I've installed Apache, MySQL, PostgreSQL, SQLite, Python, PHP, etc. to allow me to mimic the live sites as closely as possible and to allow me to continue working when I don't have internet access (such as when I'm flying or visiting Jessi's family out beyond the reach of broadband). This works very well, but as I'm just using a basic Apache install, without any VirtualHosts, it's not nearly as flexible, and it means I can really only work on a single site at a time, with some work necessary to switch back and forth between projects. Of course most of the time I just use Django's built-in development server when working with Django, but I do end up relying on Apache sometimes, and I'd like to set up my home server as a more complete development environment for both myself and some friends I can grant VPN access to. So I've been looking into WebFaction's setup with the idea of re-implementing it myself.
Turns out it's pretty simple. Simple enough that I almost feel like I should have thought of it myself. Basically, WebFaction's setup scripts create a new 'app' in your ~/webapps/ directory and populate it, most importantly with a copy, owned by your user, of the Apache executable, some scripts to start, stop, and restart that executable, and an httpd.conf file that sets the (in the case of a Python-based app) $PYTHONPATH variable to include a ~/webapps/yourappname/lib/python2.5/ directory, allowing each site to maintain its own dependencies independently (you can also put things in your ~/lib/python2.5/ for global dependencies if you want). Oh, each application also gets its own copies of the necessary Apache modules to the same effect. Each application's Apache instance(s) is set up to listen on a different (non-80) port. The end result of this is an extremely simple, extremely modular setup that works fantastically.
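Here is a small sketch of how I picture the per-app layout when I get around to re-implementing it. None of this is WebFaction's actual code: the function names are made up, the port is whatever you assign, and putting the app's own directory ahead of the global one on the path is my assumption about the precedence you'd want.

```python
import os


def app_layout(home, name, port, python_version="2.5"):
    """Describe where one app lives and which port its Apache listens on."""
    app_root = os.path.join(home, "webapps", name)
    return {
        "root": app_root,
        # Per-app dependencies live inside the app directory.
        "python_lib": os.path.join(app_root, "lib", "python" + python_version),
        # Dependencies shared by every app live directly under the home directory.
        "global_python_lib": os.path.join(home, "lib", "python" + python_version),
        "port": port,
    }


def python_path(layout):
    """App-specific directory first (assumed), so it wins over the global one."""
    return os.pathsep.join([layout["python_lib"], layout["global_python_lib"]])


def listen_directive(layout):
    """The httpd.conf line that binds this app's Apache to its own port."""
    return "Listen %d" % layout["port"]
```

With something like app_layout("/home/josh", "blog", 8001), each app gets its own root, its own library directory, and its own Listen line, which is really all the modularity comes down to.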
Obviously I've left out a step here. If each Apache instance is listening on a different, non-80 port, how does your traffic get to your actual site? This is the one part that I can't really just peek into the configuration files for, because it doesn't (as far as I can tell, which makes sense) live on the same server as my sites. I assume that what's happening is that a front-end server of WebFaction's is simply forwarding requests for (for example) joshourisman.com:80 to my.webfaction.server:portnumber; name servers alone couldn't do it, since DNS maps names to addresses and knows nothing about ports. Again, a simple yet elegant solution that allows for easy customization and expansion.
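One way to get this effect is a front-end proxy that looks at the Host header of each incoming request and hands it to whichever local port that site's Apache is listening on. The sketch below shows only the lookup step, and it is a guess at the mechanism rather than WebFaction's real configuration; the function name, the port numbers, and the choice to treat 'www.' as an alias are all hypothetical.

```python
def backend_for(host_header, backends):
    """Map a request's Host header to the (address, port) of its app.

    `backends` is a dict of site name -> local port (hypothetical values).
    Returns None when no app is registered for the host.
    """
    # Drop an explicit port such as ":80" and normalise the case.
    host = host_header.split(":")[0].lower()
    # Treat www.example.com and example.com as the same site (a design choice).
    if host.startswith("www."):
        host = host[len("www."):]
    port = backends.get(host)
    if port is None:
        return None
    return ("127.0.0.1", port)
```

Given backends = {"joshourisman.com": 8001}, a request for "joshourisman.com:80" would be forwarded to port 8001 on the local machine, and an unknown host would get no backend at all.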
I haven't yet tried to implement this setup myself (I first want to move my server from FreeBSD to Linux, which, now that I'm using it full-time again, I'm just much more familiar with), but there's nothing about it that's particularly tricky. Really, the routing is probably going to be the hardest part, but I'm planning on replacing our rather lackluster TrendNet wireless router with a Linux box, which will give me much greater control and (hopefully) better reliability.
Yes, I've managed to import the old posts from my blog! It was pretty easy to do, complicated only by the fact that I initially accidentally pulled the data from a different, older WordPress blog that I deleted some years ago but apparently still had the MySQL databases for.
At any rate, all my posts are now once again accessible, and any old links to them should still work. Comments have not yet been imported, but that's the next step. For now everything should be working as expected, but please let me know if you encounter any errors or problems.