What is this?
Hi, my name is Tom Lazar and I'm a Plone and Zope developer based in Berlin, Germany and this is my personal and professional (no big difference, really...) website.
 

Backing up Zope

Because "If you ain't got at least one backup you ain't got nothin'!"

Soooo... backups. Yeah... touchy subject. But... my clients got it written in their contract ;-) And after polishing my Zope setup tonight I thought I might as well blog about it...

By default Zope uses its own database, the infamous ZODB, which in turn saves its data by default to one large binary file, usually named Data.fs. So, in theory, all you need to do is back up that file and you're set. However, to guarantee that the copy is consistent you would need to shut down Zope (or ZEO) while creating it, which is ugly, of course. To avoid that, there is already a tool for creating so-called "hot backups" of those pesky Data.fs files: repozo, which is part of every Zope installation.
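For the curious, a repozo run boils down to a handful of command-line flags. The sketch below only assembles such a command. The helper name `repozo_backup_command` and the example paths are made up for illustration, while `-B` (backup), `-F` (force a full backup), `-r` (repository directory) and `-f` (the Data.fs to read) are repozo's own options. It assumes `repozo.py` is on the PATH.

```python
def repozo_backup_command(data_fs, repository, full=False, repozo="repozo.py"):
    """Build the argument list for one repozo backup run (hypothetical helper)."""
    cmd = [repozo, "-B", "-v", "-r", repository, "-f", data_fs]
    if full:
        # -F forces a full backup; without it repozo writes an incremental
        # one whenever the repository already holds a usable full backup
        cmd.append("-F")
    return cmd


if __name__ == "__main__":
    # example paths only; adjust to the local layout
    cmd = repozo_backup_command("/opt/zope/instances/example/zeo/var/Data.fs",
                                "/opt/zope/backups/example")
    print(" ".join(cmd))
    # to actually run it: import subprocess; subprocess.check_call(cmd)
```

Building the command as a list keeps it easy to hand to `subprocess` later without any shell quoting worries.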

The path I've chosen then is to create local hot backups using repozo and then simply copy those backups onto an external machine using rsync. Since repozo also handles incremental backups I decided to use that feature as well, settling on daily incremental backups and weekly full backups. Since this should all work without manual intervention I also needed to install a cronjob on both the Zope host and the external backup machine. While I won't go into the details of setting up access rights on those machines, I will mention that I've created an extra user named zope-backup on both machines and installed an identical private SSH key on both, thus giving the zope-backup user on the external machine read access to the local backups using that key.

The script to sync the files that repozo created on the Zope server to the external machine was basically just a wrapper around a single rsync call and was thus handled with a simple one-line shell script:

/usr/local/bin/rsync -vaxp --delete -e 'ssh -i /home/zope-backup/.ssh/zope-backup' zope-backup@$ZOPEHOST:/opt/zope/backups/* ~/$ZOPEHOST/instances/

(where $ZOPEHOST is the FQDN of the Zope server.) That script is then called via a nightly cronjob.
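If you'd rather drive the sync from Python (say, to loop over several Zope hosts), the same call can be assembled as an argument list. This is only a sketch: `rsync_pull_command` is a made-up name, the host in the example is a placeholder, and the paths and flags are simply copied from the one-liner above.

```python
import os


def rsync_pull_command(zope_host,
                       key="/home/zope-backup/.ssh/zope-backup",
                       rsync="/usr/local/bin/rsync"):
    """Assemble the rsync call from the one-liner as an argument list."""
    # the trailing * is left for the remote side to expand
    source = "zope-backup@%s:/opt/zope/backups/*" % zope_host
    # equivalent of ~/$ZOPEHOST/instances/ in the shell version
    target = os.path.expanduser("~") + "/" + zope_host + "/instances/"
    return [rsync, "-vaxp", "--delete", "-e", "ssh -i %s" % key, source, target]


if __name__ == "__main__":
    # zope.example.org is a placeholder host name
    print(rsync_pull_command("zope.example.org"))
    # to actually run it: import subprocess; subprocess.check_call(...)
```

Passing the list to `subprocess` without a shell means the `ssh -i ...` part travels as one argument, just like the quoted string in the shell script.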

The non-trivial bit was to create the local full and incremental backups that the above script would copy. Naturally, I wrote it in Python ;-) This had the added advantage that I wouldn't have to call repozo externally but could rather simply import it and call its functions directly. I added a parameter to allow choosing between performing a full or incremental backup and then created the following cronjob on the Zope server:

SHELL=/bin/sh 
LOGFILE=/var/log/zope-backup
PYTHON=/usr/local/bin/python
PYTHONPATH=/opt/zope/current/lib/python/:/opt/zope/current/bin/

15 1 * * * $HOME/bin/backup-all-zopes-locally.py incremental >> $LOGFILE
@weekly $HOME/bin/backup-all-zopes-locally.py >> $LOGFILE

Finally, here's the script doing the actual work (well, to be honest, repozo is doing the actual work!)

00001: #!/usr/local/bin/python
00002:
00003: # Author: Tom Lazar <tom@tomster.org>
00004: # Created: 2005/10/08
00005: # Licence: BSD
00006:
00007:
00008: from repozo import *
00009: import datetime
00010: import os
00011: import shutil
00012: import sys  # used in main(); don't rely on the star import above to provide it
00013: BACKUP = 1
00014: RECOVER = 2
00015:
00016: COMMASPACE = ', '
00017: VERBOSE = True
00018:
00019: REPOBASE = "/opt/zope/instances/"
00020: BACKUPBASE = "/opt/zope/backups/"
00021:
00022:
00023: class Options:
00024:     mode = None # BACKUP or RECOVER
00025:     file = None # name of input Data.fs file
00026:     repository = None # name of directory holding backups
00027:     full = False # True forces full backup
00028:     date = None # -D argument, if any
00029:     output = None # where to write recovered data; None = stdout
00030:     quick = False # -Q flag state
00031:     gzip = False # -z flag state
00032:
00033:
00034: def default_options():
00035:     options = Options()
00036:     options.mode = BACKUP
00037:     return options
00038:
00039: def createDir(dir):
00040:     """ creates a directory, but only if it doesn't exist yet"""
00041:     try:
00042:         os.mkdir(dir)
00043:     except OSError:
00044:         pass
00045:
00046: def dirExists(dir):
00047:     return os.access(dir, os.F_OK)
00048:
00049: def do_specific_backup(which=None, full=True):
00050:     """
00051:     - When performing a full backup we rename any existing backup directory and create a new, empty one,
00052:       into which repozo will do its magic.
00053:     - When renaming the backup, we delete any existing old backup to prevent infinite space usage.
00054:     - Otherwise we just perform an incremental backup into the existing folder
00055:     - After performing a full backup we intentionally don't delete the old backup: if we did, we would lose
00056:       all incremental backups performed since the last full backup. I.e. when performing nightly incremental
00057:       backups and weekly full backups you will always have between seven and fourteen snapshots instead of
00058:       just between one and seven.
00059:
00060:     IMPORTANT: the `Data.fs` must(!) be located at `zeo/var/Data.fs` inside the instance directory
00061:     """
00062:
00063:     if which is None:
00064:         return None
00065:
00066:     outdir = BACKUPBASE + which
00067:     olddir = outdir + "-old"
00068:
00069:     # if the outdir doesn't exist yet, we can't perform an incremental backup, right?
00070:     if not dirExists(outdir):
00071:         full = True
00072:
00073:     if VERBOSE:
00074:         mode = "incremental"
00075:         now = datetime.datetime.now()
00076:         nowstring = now.strftime("%Y-%m-%d %H:%M")
00077:         if full:
00078:             mode = "full"
00079:         print("%s: Performing %s backup of %s locally." % (nowstring, mode, which))
00080:
00081:     # when performing a full backup, we move any existing old backup out of the way:
00082:     if full:
00083:         if dirExists(outdir):
00084:             if dirExists(olddir):
00085:                 shutil.rmtree(olddir)
00086:             os.rename(outdir, olddir)
00087:
00088:         createDir(outdir)
00089:
00090:     options = default_options()
00091:     options.full = full
00092:     options.repository = outdir
00093:     options.file = REPOBASE + which + "/zeo/var/Data.fs"
00094:     do_backup(options)
00095:
00096:
00097: def main():
00098:
00099:     args = sys.argv
00100:
00101:     # we back up all directories inside REPOBASE:
00102:     instancelist = os.listdir(REPOBASE)
00103:     # os.listdir returns bare names, so join them with REPOBASE before testing them with isdir
00104:     instancelist = [d for d in instancelist if os.path.isdir(os.path.join(REPOBASE, d))]
00105:
00106:     full = True
00107:     if len(args) == 2 and args[1] == 'incremental':
00108:         full = False
00109:
00110:     for instance in instancelist:
00111:         do_specific_backup(instance, full)
00112:
00113:
00114: if __name__ == '__main__':
00115:     main()
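The directory shuffle described in the docstring (the current backup becomes `-old`, the previous `-old` is dropped, a fresh empty directory is created) can be tried out in isolation. This is a minimal sketch on a throwaway temp directory; `rotate_backup_dir` is a name made up here, not a function in the script, and the empty `weekN` files merely stand in for repozo's output.

```python
import os
import shutil
import tempfile


def rotate_backup_dir(outdir):
    """Keep one previous generation: outdir -> outdir-old, then recreate outdir empty."""
    olddir = outdir + "-old"
    if os.path.isdir(outdir):
        if os.path.isdir(olddir):
            shutil.rmtree(olddir)      # drop the generation before last
        os.rename(outdir, olddir)      # last week's backups stay available
    os.mkdir(outdir)                   # fresh target for the new full backup
    return olddir


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    outdir = os.path.join(base, "client-a")
    for week in ("week1", "week2", "week3"):
        rotate_backup_dir(outdir)
        open(os.path.join(outdir, week), "w").close()  # stand-in for repozo output
    print(sorted(os.listdir(base)))      # ['client-a', 'client-a-old']
    print(os.listdir(outdir + "-old"))   # ['week2']
```

After three "weeks" only the two most recent generations remain, which is where the seven-to-fourteen snapshot window comes from.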

You can find a downloadable version of the script here.
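A note on filtering the instance list: `os.listdir` returns bare entry names, so `os.path.isdir` only gives the right answer once each name is joined with its parent directory (otherwise the names are resolved relative to the current working directory, which usually matches nothing). A small sketch with a temp directory, where `list_instance_dirs` is a hypothetical helper name:

```python
import os
import tempfile


def list_instance_dirs(base):
    """Return the names of the sub-directories directly inside base."""
    return sorted(name for name in os.listdir(base)
                  if os.path.isdir(os.path.join(base, name)))


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    os.mkdir(os.path.join(base, "client-a"))
    os.mkdir(os.path.join(base, "client-b"))
    open(os.path.join(base, "README.txt"), "w").close()  # a stray file, not an instance
    print(list_instance_dirs(base))  # ['client-a', 'client-b']
```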

Some corrections for line 93 ff

Posted by Anonymous User at Oct 12, 2005 10:55 AM

options.file = REPOBASE + which + "/zeo/var/Data.fs"
if os.path.exists(options.file):
    do_backup(options)

def main():

    args = sys.argv

    # we backup all directories inside the REPOPATH:
    dirlist = os.listdir(REPOBASE)
    #FIXED: no longer returns empty lists!
    instancelist = filter(os.path.isdir, dirlist)

for the more shell-o-phil person...

Posted by witsch at Nov 08, 2005 04:33 PM

...here's a slightly shorter, well, shell script version, which roughly does the same as the above python script:

#!/bin/bash

set -e
PYTHONPATH=/opt/zope/current/lib/python
PATH=/opt/zope/current/bin:$PATH

for zodb in $( find /opt/zope/ -type f -name Data.fs ); do
    inst_dir=$( dirname $zodb | sed s,/var$,, )
    inst_name=$( basename $inst_dir )
    backup=/opt/zope/backups/$inst_name
    test -d $backup || mkdir $backup
    repozo.py -BvzQ -r $backup -f $zodb $*
done

it can then be used in a cronjob like

  1. 1 * @weekly -F

of course, it's just another version and not in the least to say there's anything wrong with doing it in python, or that it's any better. just my 2 cents, since i just had to set up a zope backup solution as well...

huh?

Posted by witsch at Nov 08, 2005 04:34 PM

tom, maybe you could be so kind and disable structured text for that post. well, if it's of any interest that is... :)
