2General

Want to know more about us? Visit 2general.com »

Selective restore from database backups with Django

Scream by eflon, on Flickr

When things go wrong and you lose your database, backups will help save the day. If the whole production database is corrupted or lost, it’s simple to throw it away and restore it in its entirety from the latest backup.

If data loss caused by a user error has remained unnoticed for some time, valuable data may since have been stored, and restoring a complete backup is not an option. In such cases it’s useful to be able to do a partial restore of one or more tables while keeping the rest of the database.

For these complex cases, I’m going to describe a technique for restoring a subset of the data in one or more relational database tables using Django.

Django 1.2 or later is required since this technique depends on having multiple database connections (see Django’s multi-db support). This opened the possibility to copy data between databases using the Django ORM.

To cut some corners, I’ll use SQLite as the database engine in the example project. The shell examples assume an Ubuntu server.

The example project

The victim of simulated data loss is an example project which uses Photologue for managing photos and albums. If you’d like to follow along, you can create the project in its own virtualenv:

$ django-admin.py startproject pets
$ cd pets
$ virtualenv --distribute --system-site-packages venv
$ . venv/bin/activate
$ pip install django-photologue

Here’s a minimal settings.py file for the project. You can replace the generated file with it:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
DATABASES = {
    'default': {
        'ENGINE': 'sqlite3',
        'NAME': 'pets.sqlite'
    }
}

INSTALLED_APPS = ['django.contrib.auth',
                  'django.contrib.contenttypes',
                  'django.contrib.sessions',
                  'django.contrib.admin',
                  'photologue']

ROOT_URLCONF = 'pets.urls'

The urls.py file should look like this:

1
2
3
4
5
6
7
8
9
from django.conf.urls.defaults import patterns, include, url
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns(
    '',

    url(r'^admin/', include(admin.site.urls)),
)

Now initialize the database and run the development web server:

$ ./manage.py syncdb
$ ./manage.py runserver

Open http://localhost:8000/admin/ and log in with the admin credentials you entered for syncdb. Add several photos and create three galleries. Set the slug of one of the galleries to “pets”. This will be the gallery you’ll “accidentally” delete. Attach two photos into the “pets” gallery and some other photos into other galleries. Your list of galleries should look something like this:

../../../_images/galleries.png

and the “pets” gallery like this:

../../../_images/pets-gallery.png

When you’re done, stop the development server and make a backup of the database. With SQLite it’s as easy as making a copy of the database file:

$ cp pets.sqlite pets-backup.sqlite

Start the development server again, and delete the “pets” gallery.

Restoring a backup

Let’s connect your project to the temporary backup database. To avoid touching the production codebase, it makes sense to create a separate db_restore_settings.py file:

1
2
3
4
5
6
from settings import *

DATABASES['backup'] = {
    'ENGINE': 'django.db.backends.sqlite3',
    'NAME': 'pets-backup.sqlite'
}

Now you can fire up the shell and start poking at your production and backup databases. Remember to use the settings created above.

$ ./manage.py shell --settings=db_restore_settings

Photologue doesn’t remove Photo instances from the database nor image files from the disk when albums are deleted. So, we have one missing Gallery instance, and all its many-to-many relations to Photos have been deleted as well.

We can take a look at the Gallery model in different databases with the Queryset.using() method:

>>> Gallery.objects.using('backup').count()
3
>>> Gallery.objects.using('default').count()
2
>>> Gallery.objects.using('backup').filter(title_slug='pets').count()
1
>>> Gallery.objects.using('default').filter(title_slug='pets').count()
0
(When using the default database, the .using('default') method call can be omitted, but I have included it in the examples for clarity’s sake.)

Restoring a single model instance

At this point, we can simply retrieve the gallery from the backup and save it as a new instance to the production database.

In Django, a new copy of a model instance can be made by setting its primary key to None before saving. To specify that the copy should be saved into another database, the save() method accepts a using= keyword argument as stated in the multi-db documentation. Again, using= could be omitted here since the default database is specified, but I included it for clarity:

>>> backup_pets = Gallery.objects.using('backup').get(title_slug='pets')
>>> backup_pets.id = None
>>> backup_pets.save(using='default')

You can now verify in the Django admin interface that an empty “pets” gallery has indeed been resurrected.