Wordpress has just informed me that a new update is available. It is important to keep software up to date (I dont want anyone hacking my blog!) but experience has taught me that it is equally important to make regular backups of your databases before committing to an update.
A good backup strategy minimises both the potential for data loss and the amount of resources required to execute and maintain the backups. For my purposes, I wish to backup my database every 2 hours. The problem, however, is that after a few weeks, this will amount to a large number of files and will become expensive as far as storage goes. So, my strategy will be in 2 parts.
This strategy means that at any point in time, I will have backups at 2 hour intervals for the preceding 24 hours and daily backups back to the point where I started running the backup script. You can adapt this strategy to your own needs, but for this blog it should be ample.
In order to make this script as small and simple as possible, I am going to use php's exec command. This allows us to run terminal commands from inside php. For security reasons, exec is often disabled on shared web hosts so make sure you have it enabled before continuing.
NOTE: This backup script could be written entirely using shell scripting but, because I am a lover of all things PHP (and my shell scripting skills are very much below par) I have decided to use PHP as much as possible.
So, lets get started. First, I created a simple config file to store the database name, username and password as well as the hour in which to perform the 'daily' backup. PHP has a really useful function called parse_ini_file which will load a set of preferences into an array. The configuration file is called nash.ini (this is what I have decided to call this little backup script) and looks like this (with some AmazonS3 stuff to come later):
[nash]
database = 'dbname'
username = 'dbusername'
password = 'dbpassword'
dailyBackupHour = 22
Now on to the php script itself (nash.php).
I used PHP's date functions to determine the year, month, day and hour.
// Determine year, month, day and hour
$year = date('Y');
$month = date('m');
$day = date('d');
$hour = date('H');
$yesterday = date('d', time() - 486060);
Using parse_ini_file, we can load the config file into an array called $config:
// Load configs
$config = parse_ini_file('nash.ini');
Next, build the filenames. The hourly backups will be stored in Hourly/{day}/Hourly-{hour}.zip where {day} is a directory which is the number of the day in the month, and {hour} is the hour in the day. So an hourly backup made at 22:00 on the 15th would look like this:
Hourly/15/Hourly-22.sqlSimples!
// Build hourly file file name
$fnameHourly = 'Hourly-'.$hour.'.sql';
$fnameDaily = 'Daily-'.$year.'-'.$month.'-'.$day.'-'.$hour.'.sql';
$basedirHourly = dirname(FILE).'/hourly';
$basedirDaily = dirname(FILE).'/daily';
In order to make this script as simple as possible, I have automated the directory creation
// Make sure base directories are present and if not create them
if (!is_dir($basedirHourly)) {
mkdir($basedirHourly, 0744);
}
if (!is_dir($basedirDaily)) {
mkdir($basedirDaily, 0744);
}
// Make todays directory if it does not exist
if (!is_dir($basedirHourly.'/'.$day)) {
mkdir($basedirHourly.'/'.$day, 0744);
}
The above will
Our directory structure is almost complete, all we need to do now is remove the backups that are older than 24 hours. We could use the exec command to do this, but its more fun in PHP. We can use the rmdir function to delete a directory, but it will only delete empty directories, so we need to individually delete all the files in that directory first. We can do this by using the glob function to list the contents of the directory, followed by the unlink do delete each file individually. Then we can delete the actual directory itself.
// Remove yesterdays directory if it exists
if (is_dir($basedirHourly.'/'.$yesterday))
{
// Directory exists so delete everything in it
foreach (glob($basedirHourly.'/'.$yesterday.'/*') as $filename) {
unlink($filename);
}
// Now delete the directory
rmdir($basedirHourly.'/'.$yesterday);
}
Finally, we need to build and run the mysqldump command
// Create backup
$cmd = 'mysqldump '.$config['database'].' -u '.$config['username'].' -p'.$config['password'];
$path = $basedirHourly.'/'.$day.'/'.$fnameHourly;
exec($cmd.' > '.$path);
// Is this the daily backup
if ($hour == $config['dailyBackupHour']) {
$path = $basedirDaily.'/'.$fnameDaily;
exec($cmd.' > '.$path);
}
The $cmd contains the base command to call mysqldump with the correct parameters, and the $path command contains the path to the output file.
Once the hourly backup has been completed, we check to see if this is the correct time for the daily backup, and if so then we edit the $path variable and set the destination to the daily backup location.
At this stage, we have a working backup script, that will read in configuration details from a config file called nash.ini and create an hourly backup which is stored in Hourly/{day}/Hourly-{hour}.sql. Now we can set up a cron job to automate running the script.
Open up a terminal on your server and type
crontab -e
The crontab should open up in your favourite editor. Depending on your server setup, you might already have some cron jobs setup.
Add the following to run the backup script every 2 hours at 55 minutes past the hour
55 /2 php /path/to/nash/script/nash.php
Note: If you run the backup script every 2 hours, make sure that you set the config for the hour during which you make the daily backup is an even number (e.g. 22) or the daily backups will never be triggered.
Congratulations, you now have a working backup script which is run as a cron job on your server.
Note: Debugging a cron job can be very difficult and frustrating so I recommend you test the script by navigating to the containing folder and typing
php nash.php
to make sure that it is working correctly before pulling your hair out because the cron job is not working. This will run the script in CLI mode and any errors will be outputted to the terminal. It is probably a good time to mention that incorrect permissions are often the culprit for scripts not executing correctly, so make sure that the user running the script has write permission to the containing folder.
The GZip integration requires gzip to be installed on your server, but this usually the case. It will change the output of the backups to Backup-Name.sql.gz. I would recommend using gzip as the resulting backup files will (depending on your database) will be about 10% of the size of non compressed backup files.
For AmazonS3 integration you will obviously need an AmazonS3 account. The script will send only the daily backups to AmazonS3 and will store them as Daily-{date}.sql.gz in a bucket named {unique-string}.nash-backup.daily. You must set the {unique-string} in the config file. This is because AmazonS3 requires bucket names to be unique and therefore not setting it would cause clashes between users of this script.
I have added some config items to nash.ini
[nash]
database = 'dbname'
username = 'dbusername'
password = 'dbpassword'
dailyBackupHour = 22
use_gzip = 'true'
use_s3 = 'true'
bucket_prefix = 's3_bucket_prefix_must_be_unique'
aws_key = 'your_aws_api_key'
aws_secret_key = 'your_aws_secret_key'
These should be pretty self explanatory. Your AWS keys can be accessed from the 'Security Credentials' section on the 'Account' page on aws.amazon.com.
Communication between this script and AmazonS3 is achieved through the AWS PHP SDK. The SDK makes it very simple to add AmazonS3 integration into the script.
// Now send to AmazonS3 if enabled
if ($config['uses3'] == 'true') {
requireonce dirname(__FILE).'/aws-sdk-for-php/sdk.class.php';
$s3 = new AmazonS3($config['aws_key'], $config['aws_secret_key']);
$bucket = $config['bucket_prefix'].'.nash-backup.daily';
// Check to see if bucket already exists
$exists = $s3->if_bucket_exists($bucket);
if (!$exists) {
$create_bucket_response = $s3->create_bucket($bucket, AmazonS3::REGION_US_W1);
if ($create_bucket_response->isOK()){
$exists = $s3->if_bucket_exists($bucket);
while (!$exists)
{
// Not yet? Sleep for 1 second, then check again
sleep(1);
$exists = $s3->if_bucket_exists($bucket);
}
}
}
// Add the file to the bucket
$s3->create_object($bucket, $fnameDaily, array(
'fileUpload' => $path
));
}</pre>
And thats it!
I will be maintaining the script on GitHub at https://github.com/guyht/nash</a>.
Installation
Either download the file as a compressed zip archive from https://github.com/downloads/guyht/nash/nash.zip</a>.
Or, fork/clone my git repository from https://github.com/guyht/nash</a>.
NOTE: The git repository contains the AWS PHP SDK as a submodule. Once you have cloned the repository, you need to run the following commands to update the submodule
cd nash
git submodule init
git submodule update
This will download the AWS PHP SDK and add it to your cloned git repository. You don't need to do this if you don't want AmazonS3 integration. For more on submodules see http://book.git-scm.com/5_submodules.html</strong></a><strong>.</strong>