15 Feb

Backups, Automated and Off Site

One of the biggest concerns in running a server [1] is making sure that, if everything disappears, you can be up and running again as quickly as possible. So how do I do it?

The simple answer: I use a cron job that runs every day, takes daily, weekly and monthly database and file system backups, and then pushes them to Amazon S3. I rolled my own bash script to perform the backups, and after a few months of testing and improving it, it’s ready to be shown off.

The script is extremely simple:

  1. Import config settings from a file
  2. Dump MySQL Databases, gzip and move the file to your backup folder
  3. Dump PostgreSQL Databases, gzip and move the file to your backup folder
  4. Dump MongoDB Databases, gzip and move the file to your backup folder
  5. Tar and gzip the local webroot and move the file to your backup folder
  6. Delete daily backup files older than 7 days from the backup folder
  7. If Monday
    1. Copy just created database and webroot backups to be weekly backups
    2. Delete weekly backup files older than 28 days from the backup folder
  8. If First of Month
    1. Copy just created database and webroot backups to be monthly backups
    2. Delete monthly backup files older than 365 days from the backup folder
  9. Use S3 Tools to essentially rsync the backup folder with an Amazon S3 Bucket
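The rotation part of the list (steps 6 to 8) can be sketched roughly as below. This is a minimal illustration, not the code from S3_Backup: the `daily/`, `weekly/` and `monthly/` subfolders, the function names, and the idea of matching "just created" files by a date stamp in the file name are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch of steps 6-8: prune old dailies, promote today's files to
# weekly/monthly tiers. Folder layout and names are hypothetical.

# Print the tiers a backup belongs to.
# $1 = ISO day of week (1 = Monday), $2 = day of month (1-31 or 01-31)
tiers_for() {
    local dow="$1" dom="$2" tiers="daily"
    if [ "$dow" -eq 1 ]; then tiers="$tiers weekly"; fi
    # 10# forces base 10 so "08" and "09" are not read as octal.
    if [ "$((10#$dom))" -eq 1 ]; then tiers="$tiers monthly"; fi
    echo "$tiers"
}

# Delete regular files under directory $1 last modified more than $2 days ago.
prune_older_than() {
    local dir="$1" days="$2"
    [ -d "$dir" ] || return 0
    find "$dir" -type f -mtime +"$days" -delete
}

# rotate BACKUP_DIR STAMP [DOW] [DOM]
# STAMP is whatever string today's backup file names contain.
# DOW and DOM default to today's values from date(1).
rotate() {
    local backup_dir="$1" stamp="$2"
    local dow="${3:-$(date +%u)}" dom="${4:-$(date +%d)}"
    local tier keep f

    mkdir -p "$backup_dir/daily" "$backup_dir/weekly" "$backup_dir/monthly"
    prune_older_than "$backup_dir/daily" 7

    for tier in $(tiers_for "$dow" "$dom"); do
        case "$tier" in
            weekly)  keep=28 ;;
            monthly) keep=365 ;;
            *)       continue ;;
        esac
        for f in "$backup_dir/daily/"*"$stamp"*; do
            [ -e "$f" ] || continue   # the glob matched nothing
            cp "$f" "$backup_dir/$tier/"
        done
        prune_older_than "$backup_dir/$tier" "$keep"
    done
}
```

Under these assumptions, the daily cron job would call something like `rotate /path/to/backups "$(date +%F)"` after the dumps and the webroot archive have been written to `daily/`.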

It’s clean, quick and above all has worked without fail for several months now. The slowest part of the process is uploading the files to S3, and even that has never taken very long. It also repeats the mantra from my earlier post: “tar it then sync”.
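As a rough illustration of “tar it then sync”, the sketch below bundles a directory into a single gzipped archive and then mirrors the backup folder to a bucket with `s3cmd sync` from S3 Tools. The function names, the paths, the bucket name and the `DRY_RUN` switch are assumptions made for this example, not part of the actual script.

```shell
#!/usr/bin/env bash
# "Tar it then sync": one archive per webroot, then push the backup folder.
# All names and paths here are placeholders.

# Create a gzipped tarball of directory $1 at path $2.
archive_dir() {
    local src="$1" dest="$2"
    # -C makes archive paths relative to the parent of $src.
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
}

# Mirror local folder $1 into bucket $2 using s3cmd.
# With DRY_RUN=1 the command is printed instead of executed.
sync_to_s3() {
    local dir="$1" bucket="$2"
    local cmd=(s3cmd sync "$dir/" "s3://$bucket/")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

Uploading one archive instead of thousands of small files is the point of the mantra: the sync step then only has a handful of objects to compare and transfer. Running with `DRY_RUN=1` first is a cheap way to check the source and destination before anything is uploaded.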

This method is simple and it seems to work great for most single server setups. I haven’t optimized the database dumps, mainly because that is highly dependent upon your particular use of each. If you have multiple servers or separate database and web servers, why are you taking sys admin advice from me?

It’s available on GitHub: S3_Backup


  1. I use a virtual host from Linode for this site and a few others; they are great.

15 Jul

Mark Story – My thoughts on the built-in php server

Earlier today I saw the announcement that PHP5.4 will have a built-in web server. I mentioned on twitter that I wasn’t too happy about the server being added. In the discussion that followed, I feel like I wasn’t able to properly convey my thoughts through tweets. I figured I might be able to better explain myself in a post.

I have mixed feelings about the built-in web server to be honest. Having a low effort web server is great for lowering the barrier to entry when building things with PHP. I can also appreciate the instantaneous feedback you get from a simple command line server, and not needing to fiddle with Apache or other more complex web servers. All of these things seem really great in isolation, and when you ignore some of the problems that it creates.

I can think of a few problems that the new command line server creates. First, while it’s intended for quick and dirty development, it will invariably end up being used as a production server somewhere. PHP already has a spotty track record with providing features meant to be helpful, but later become painful. I’m thinking of things like magic quotes and register globals. All of these features were at some level intended to make development easier. Instead they have become huge headaches, and are only now being removed.

via Mark Story – My thoughts on the built-in php server. I think he reached into my brain and said exactly what I was thinking.

24 Apr

O’Reilly Broadcast – The AWS Outage: The Cloud’s Shining Moment

So many cloud pundits are piling on to the misfortunes of Amazon Web Services this week as a response to the massive failures in the AWS Virginia region. If you think this week exposed weakness in the cloud, you don’t get it: it was the cloud’s shining moment, exposing the strength of cloud computing.

In short, if your systems failed in the Amazon cloud this week, it wasn’t Amazon’s fault. You either deemed an outage of this nature an acceptable risk or you failed to design for Amazon’s cloud computing model. The strength of cloud computing is that it puts control over application availability in the hands of the application developer and not in the hands of your IT staff, data center limitations, or a managed services provider.

The AWS outage highlighted the fact that, in the cloud, you control your SLA in the cloud—not AWS.

via O’Reilly Broadcast – The AWS Outage: The Cloud’s Shining Moment. The article even walks through what it would have taken to stay up through Amazon’s recent downtime.

09 Apr

Facebook Engineering Blog – Building Efficient Data Centers with the Open Compute Project

A small team of Facebook engineers spent the past two years tackling a big challenge: how to scale our computing infrastructure in the most efficient and economical way possible.

Working out of an electronics lab in the basement of our Palo Alto, California headquarters, the team designed our first data center from the ground up; a few months later we started building it in Prineville, Oregon. The project, which started out with three people, resulted in us building our own custom-designed servers, power supplies, server racks, and battery backup systems.

via Facebook Engineering Blog – Building Efficient Data Centers with the Open Compute Project. This is pretty neat, but I’m going to go with Marco. The point is to demonstrate Facebook is working at a high scale much like Google and they aren’t opening up anything that actually makes them their real money.

06 Mar

Brent Ozar – RAID 0 SATA with 2 Drives: It’s Web Scale!

Before I start with this sordid tale of low scalability, I want to thank the guys at Phusion for openly discussing the challenges they’re having with Union Station. They deserve applause and hugs for being transparent with their users.

Today, they wrote about their scaling issues. That article deserves a good read, but I’m going to cherry-pick a few sentences out for closer examination.

via Brent Ozar – RAID 0 SATA with 2 Drives: It’s Web Scale!. Good discussion on scaling your database servers.