Andrew's Forge

Upgrading Django (to 1.7)
Part II: Migrations in Django 1.6 and 1.7

Published by

Reviewed by Jacinda Shelly

Edited by Amy Bekkerman

The start of this article is meant to give beginners the chance to learn about migrations. The end of this article will then focus on migrations in Django 1.7 for developers of all levels.

Camelot and the Knights of the Round Table

In Part II of the Upgrading Django series, we will create a toy project based on Monty Python's 1975 film Monty Python and the Holy Grail. Our project will be titled camelot and will contain a single app named roundtable. Had he been able, King Arthur might have commissioned just such a website and open-sourced it for the people; this project would have been its beginning.

If you are not familiar with the specific differences between a Django project and a Django app, this article is likely too advanced for you at this point. In this case, I recommend that you first follow the official Django tutorial and then come back.

The goal of building the project will be to demonstrate changes in Django 1.7 by comparing it with Django 1.6. We will start by building two projects in parallel: one using Django 1.6 and one using Django 1.7. Both projects will run in Python 2 and Python 3. The only model in our roundtable app will be the Knight, and it will be identical in both Django 1.6 and Django 1.7.

Our central focus will be on migrations. We will thus start in Django 1.6 and create migrations with South, the de facto migration app. We will add example data to the database in keeping with the Monty Python film. At this point, Lancelot will run away with Guinevere, forcing us to modify our Knight model to account for his treason and demonstrating the basic use of schema migrations. We will then replicate this example in Django 1.7. This will not only highlight the differences in migrations but will also demonstrate the App Loader and the System Check framework (not to mention what a massive jerk Lancelot really is).

Note that South 1.0 is not currently Python 3 compatible, due to a regression after South 0.8.4. Andrew Godwin, the creator of South, committed a fix to the issue on August first, but there is no formal release at the time of writing. As such, our Django 1.6 project will be run in Python 2, even though our code is compatible with Python 2.7 and Python 3.2+. Special thanks to Martijn Pieters for documenting the issue on Stack Overflow.

Django Code and Requirements

All of the code presented in this article is available in my git repository.

To run the code, both Python 2.7 and Python 3.2+ are necessary. The updj16 branch contains the Django 1.6 (and therefore Python 2.7) code, and the updj17 branch contains the Django 1.7 code (built using Python 3.3). The updj16 branch further requires the South package.

To make switching between the two branches easier, I highly recommend the use of virtualenv and virtualenvwrapper.

Note that you do not need to download the code. We will see all of it in the following sections. Should you wish to build your own repository by following the instructions provided, you are able to.

Creating the camelot Project

The first notable change to creating a new Django project with Django 1.7 is that the command django-admin.py can now be accessed via the alias django-admin, removing the .py (though the older version works as well). In Django 1.6 we will start our project by invoking the following from the command line:

$ django-admin.py startproject camelot

Whereas in Django 1.7, we may choose to invoke the following:

$ django-admin startproject camelot

The two camelot projects have exactly the same layout.

Important! I will refer to the root of the camelot project as /.

  • /manage.py
  • /camelot/
    • __init__.py
    • settings.py
    • urls.py
    • wsgi.py

The files in each project are almost identical, but the code in Django 1.7 has two minor changes:

  1. /camelot/settings.py sees the addition of the SessionAuthenticationMiddleware. New to Django 1.7, this middleware will invalidate existing sessions should a user change their password, forcing the user to log back in, as documented here.
  2. In /camelot/urls.py, the call to admin.autodiscover() has been removed, as Django now calls it automatically thanks to the new app-loading framework, as we shall discover shortly.
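For reference, the middleware setting generated for the Django 1.7 project should look roughly like the fragment below. It is written from the default Django 1.7 project template rather than copied from this article's repository, so treat the exact ordering as an assumption; the entry of interest is SessionAuthenticationMiddleware.

```python
# /camelot/settings.py (Django 1.7), middleware section only.
# Assumed to match the default project template; your file may differ.
MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    # New in Django 1.7: invalidates sessions after a password change.
    'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
)
```

The Django 1.6 version of the file is the same minus the SessionAuthenticationMiddleware line.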

Preparing camelot for South in Django 1.6

Because we use South in our Django 1.6 project, it is recommended that we create our database before proceeding any further. We do so by adding 'south', to the INSTALLED_APPS list in /camelot/settings.py and then immediately invoking syncdb via manage.py. South will thus have a table in our database. Doing so also gives our apps a 'zero' state, in which they do not have any tables in the database. Essentially, we are starting with a clean database.

The call to syncdb will generate its regular output, with the following addition, courtesy of South:

Synced:
 > django.contrib.admin
 > django.contrib.auth
 > django.contrib.contenttypes
 > django.contrib.sessions
 > django.contrib.messages
 > django.contrib.staticfiles
 > south

Not synced (use migrations):
 -
(use ./manage.py migrate to migrate these)

Please observe that—unlike our Django 1.6 project—our Django 1.7 project does not require any extra steps to get started, but provides the same features as South, which we shall examine shortly. Furthermore, while still operational, the syncdb command is deprecated in Django 1.7.

Creating the roundtable App

In both our Django 1.6 and Django 1.7 camelot projects, we may create our roundtable app using the same command.

$ ./manage.py startapp roundtable

To tell the camelot project of the existence of roundtable, we add 'roundtable', to the list of INSTALLED_APPS in /camelot/settings.py for both projects.
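As a sketch, the resulting setting in the Django 1.6 project would look something like the following. The six django.contrib entries are the ones reported by syncdb earlier; their order here is assumed from the default project template.

```python
# /camelot/settings.py (Django 1.6 project)
INSTALLED_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'south',       # Django 1.6 only; leave this out of the Django 1.7 project
    'roundtable',  # our new app
)
```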

The anatomy of the roundtable app in Django 1.7 is almost identical to the roundtable app in Django 1.6, with a single exception. The folder structure in Django 1.7 is listed below:

  • /roundtable/
    • __init__.py
    • admin.py
    • migrations/
      • __init__.py
    • models.py
    • tests.py
    • views.py

All of the files in each app are completely identical, but creating the app in Django 1.7 causes the addition of a directory named migrations. The existence of the __init__.py file indicates that this directory is a Python package. South will create this package for us in Django 1.6, but only after we create our first schema migration.

Creating the Knight model

With a project and app ready, our next step is to create a model. Having a model will allow us to interact with migrations in both Django 1.6 and Django 1.7. In roundtable/models.py, we create a very basic model called Knight:

from django.db import models

class Knight(models.Model):
    name = models.CharField(max_length=63)

    def __str__(self):
        return self.name

The code above is simple and will work perfectly in Python 3.2 or above but will lead to problems should we wish to run the code in Python 2.7. We must take three steps to ensure backward compatibility.

The first step is to ensure that Python is clear about what the file encoding is. While Python 3 assumes that files are UTF-8 by default, Python 2 does not. In keeping with PEP 263, we thus add # -*- coding: utf-8 -*- to the top of our file (assuming you are saving your file in UTF-8, which you should be).

Our second step is to avoid surprises when switching between versions: we want string literals to behave the same way in Python 2.7 and Python 3. Native Python 2.7 string literals are byte strings (treated as ASCII by default), while Python 3 strings are Unicode text, which is far more desirable (Unicode allows for over one million code points, covering Chinese characters and accented Latin letters among many others, while ASCII is limited to 128 characters). To make Python 2.7 string literals behave like Unicode, we need only add the following import to the top of our file: from __future__ import unicode_literals.

Our last step is to enable proper representation of Knight objects. In Python 3, to provide a string representation to the model, we have declared and implemented the __str__ method. However, to represent an object as a unicode string in Python 2.7, we would instead need to implement the __unicode__ method (which does not exist in Python 3 because all strings are unicode). Django provides a decorator that will effectively use our implementation of __str__ to create a __unicode__ when running the project in Python 2.7. We thus import the python_2_unicode_compatible decorator and apply it to our class.

Our code, printed below, will now work in both Python 3.3 and Python 2.7.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import models
from django.utils.encoding import python_2_unicode_compatible

@python_2_unicode_compatible
class Knight(models.Model):
    name = models.CharField(max_length=63)

    def __str__(self):
        return self.name
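To make the decorator less mysterious, here is a minimal sketch of the idea behind it. This is a simplified stand-in written for this article, not Django's actual source: the name unicode_compatible_sketch and the plain Knight class (no Django model involved) are assumptions made for illustration.

```python
import sys


def unicode_compatible_sketch(cls):
    """Simplified stand-in for Django's python_2_unicode_compatible."""
    if sys.version_info[0] == 2:
        # Python 2: expose the text-returning method as __unicode__ and
        # make __str__ return UTF-8 encoded bytes instead.
        cls.__unicode__ = cls.__str__
        cls.__str__ = lambda self: self.__unicode__().encode('utf-8')
    # Python 3: __str__ already returns text, so the class is left as it is.
    return cls


@unicode_compatible_sketch
class Knight(object):
    # A plain class standing in for the Django model.
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


galahad = Knight('Galahad')
print(str(galahad))
```

The point of the design is that we write __str__ once, returning text, and let the decorator supply whatever the running interpreter expects.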

Understanding Migrations

With our baseline project in place, we now turn our attention to migrations. At this point you may be wondering what migrations are and why they are necessary. Those of you who are already familiar with migrations can skip to the next section.

This section is an excerpt from my book, Django Unleashed, currently in "Rough Cut" (draft) form. Please consider taking a look on Safari.

In Django, models and the database schema are reflections of each other. Any modification of one must result in the modification of the other. In a fully deployed team project, this can actually be quite tricky. If one developer makes a change to his or her local test database, he or she needs a way to share these changes with other developers and the various servers running the website(s) in a repeatable and version-controlled manner. If the change turns out to be in error, the developer should also have an easy way to undo his or her changes. Finally, if two developers are working on the same area of the code and make conflicting changes, it should be easy to recognize the issue and resolve the problem.

Migrations solve the problems above by providing a controlled, predictable system for altering a database. The typical workflow with Django is to

  1. Make a change to the model
  2. Generate a migration file
  3. Use the migration file to create/alter the database

The migration file contains the instructions to alter the database and to roll back (or revert) those alterations. The advantage of keeping these instructions in a file is that this file may be easily run (applied) and shared among developers. Migration files may be considered version control for the database, although the analogy is imperfect. The advantage of having these changes as files (and where the analogy breaks down) is the ability to store and share these files under a standard version-control system.
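The pairing of "apply" and "revert" instructions can be sketched without any migration framework at all. The example below uses Python's built-in sqlite3 module and two hand-written functions; the function names mirror the ones South uses later in this article, but nothing here is South or Django code.

```python
import sqlite3


def forwards(conn):
    # Apply the change: create a table for a Knight model.
    conn.execute(
        "CREATE TABLE roundtable_knight ("
        "id INTEGER PRIMARY KEY, "
        "name VARCHAR(63) NOT NULL)")


def backwards(conn):
    # Revert the change: drop the table again.
    conn.execute("DROP TABLE roundtable_knight")


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


conn = sqlite3.connect(':memory:')
forwards(conn)
applied = table_names(conn)       # table exists after applying
backwards(conn)
rolled_back = table_names(conn)   # table is gone after rolling back
print(applied, rolled_back)
```

A migration framework adds bookkeeping on top of this idea: it records which files have been applied and generates most of the instructions for us.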

Migrations are a key part of almost every non-trivial Django project, as they will help you avert major headaches and save you time.

Migrations in Django 1.6 with South

The differences between Django 1.6 and Django 1.7 have so far been minimal. With migrations, the differences between the two versions become more apparent. In the interest of simplicity, the article will now focus entirely on our Django 1.6 project and then return to our Django 1.7 project.

Recall that in the section Preparing camelot for South in Django 1.6, we created the database before even generating the roundtable app or coding our Knight model. Accordingly, we must now alter the database to create a table for our Knight model. As detailed in the last section, we will first generate a schema-migration file and then use the file to alter the database. Remember that the schema-migration file automatically generated by South is simply a set of instructions detailing how to change the database in order to create the table associated with our Knight model.

South creates the schema-migration file when we invoke schemamigration via manage.py at the command line. We make sure to pass the name of our app and inform South that this is our first schema migration with the flag --initial.

$ ./manage.py schemamigration roundtable --initial
Creating migrations directory at '/roundtable/migrations'...
Creating __init__.py in '/roundtable/migrations'...
 + Added model roundtable.Knight
Created 0001_initial.py.
You can now apply this migration with:
    ./manage.py migrate roundtable

Before doing anything else, take a quick look at the migration file:

# -*- coding: utf-8 -*-
from south.utils import datetime_utils as datetime
from south.db import db
from south.v2 import SchemaMigration
from django.db import models


class Migration(SchemaMigration):

    def forwards(self, orm):
        # Adding model 'Knight'
        db.create_table(
            'roundtable_knight',
            (('id',
              self.gf(
                  'django.db.models.fields.AutoField')(
                      primary_key=True)),
             ('name',
              self.gf(
                  'django.db.models.fields.CharField')(
                      max_length=63)),
             ))
        db.send_create_signal('roundtable', ['Knight'])

    def backwards(self, orm):
        # Deleting model 'Knight'
        db.delete_table('roundtable_knight')

    models = {
        'roundtable.knight': {
            'Meta': {'object_name': 'Knight'},
            'id': ('django.db.models.fields.AutoField',
                   [],
                   {'primary_key': 'True'}),
            'name': ('django.db.models.fields.CharField',
                     [],
                     {'max_length': '63'})}}

    complete_apps = ['roundtable']

The file above tells South how to change the database. Starting with a new database (like the one we currently have), South will run the forwards() method to create the Knight table according to the model we coded. We can tell South to apply these changes to the database via the migrate command, optionally passing in the name of an app, as we do below. South will then change the database to reflect the changes in the schema-migration file.

$ ./manage.py migrate roundtable
Running migrations for roundtable:
 - Migrating forwards to 0001_initial.
 > roundtable:0001_initial
 - Loading initial data for roundtable.
Installed 0 object(s) from 0 fixture(s)

Observe the last line of the output: migrate has not added data to the database but comes with the ability to do so. This ability is not actually from South but from Django. When syncdb is invoked in a project without South, Django will load initial data fixtures, as above.

Creating Initial Data

A database means nothing without data, so let's take the opportunity to add some knights.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.bulk_create([
...     Knight(name='Bedevere'),
...     Knight(name='Bors'),
...     Knight(name='Ector'),
...     Knight(name='Galahad'),
...     Knight(name='Gawain'),
...     Knight(name='Lancelot'),
...     Knight(name='Robin'),
... ])
[<Knight: Bedevere>,
 <Knight: Bors>,
 <Knight: Ector>,
 <Knight: Galahad>,
 <Knight: Gawain>,
 <Knight: Lancelot>,
 <Knight: Robin>]

Consider that King Arthur may have wished the list of his Knights to be available not only on the public website and via API but also in the open-source project provided to his people. We cannot put a database in source control (it's a Bad Idea™). We must find another way to easily store and share the data.

Django provides the fixture system, which can dump database data into JSON, XML, and YAML files and then load this data back into any database. In addition, Django provides the initial data system, which will find a file named initial_data and automatically load the file into the database using fixtures when the database is created by syncdb. At first glance, this system seems perfect for our purposes.

We can create an initial_data.json file in the app-specific fixtures directory using dumpdata.

$ mkdir roundtable/fixtures
$ ./manage.py dumpdata --indent=2 roundtable > \
>           roundtable/fixtures/initial_data.json

The contents of the file created above may be seen here. Now, developers setting up this project will automatically have all of the initial knights loaded into their database. Let's not take my word for it, though—let's see it in action.
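For readers without the repository at hand, the sketch below builds data in the shape Django's JSON fixtures use: a list of records, each with a model label, a primary key, and a dictionary of fields. The primary keys 1 through 7 are an assumption; dumpdata writes whatever keys the database actually holds.

```python
import json

knight_names = ['Bedevere', 'Bors', 'Ector', 'Galahad',
                'Gawain', 'Lancelot', 'Robin']

# One record per row: the model label, the primary key, and the field values.
# The pk values are assumed for illustration.
fixture = [
    {'model': 'roundtable.knight', 'pk': pk, 'fields': {'name': name}}
    for pk, name in enumerate(knight_names, start=1)
]

print(json.dumps(fixture, indent=2))
```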

Start by resetting our database. We delete the Knight table in our database using the migrate command.

$ ./manage.py migrate roundtable zero

The zero parameter refers to the starting state of the database. Because we created the database in advance of our roundtable app, this means that the roundtable_knight table created by our first migrate command will have been deleted. South achieves this simply by calling the backwards() method in the 0001_initial.py schema-migration file.

    def backwards(self, orm):
        # Deleting model 'Knight'
        db.delete_table('roundtable_knight')

The migrate command actually allows us to apply migrations selectively. The zero parameter is a keyword, but we can select which migrations to apply by passing in the number prepended to each migration file. Rather than calling $ ./manage.py migrate roundtable, we can append 0001 to the command to explicitly apply all migrations up to and including 0001_initial.py. This will call, as before, the forwards() method of the migration file.

$ ./manage.py migrate roundtable 0001
 - Soft matched migration 0001 to 0001_initial.
Running migrations for roundtable:
 - Migrating forwards to 0001_initial.
 > roundtable:0001_initial
 - Loading initial data for roundtable.
Installed 7 object(s) from 1 fixture(s)

Observe that in this instance, Django has taken the opportunity to automatically load the data in our initial_data.json file, as indicated by the last line of the output above.
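The way a target such as zero or 0001 selects which migrations run can be sketched in a few lines. This is an illustration of the idea only, not South's implementation; the second migration name is hypothetical and exists purely to make the backwards case visible.

```python
# Hypothetical history; only 0001_initial exists in our project so far.
MIGRATIONS = ['0001_initial', '0002_hypothetical_change']


def resolve_target(target, migrations):
    """Return how many migrations should be applied for the given target."""
    if target == 'zero':
        return 0
    # "Soft match": accept any unambiguous prefix of a migration name.
    matches = [i for i, name in enumerate(migrations)
               if name.startswith(target)]
    if len(matches) != 1:
        raise ValueError('Target %r matches %d migrations'
                         % (target, len(matches)))
    return matches[0] + 1


def plan(applied_count, target, migrations):
    """List the (direction, name) steps needed to reach the target."""
    wanted = resolve_target(target, migrations)
    if wanted >= applied_count:
        return [('forwards', name)
                for name in migrations[applied_count:wanted]]
    return [('backwards', name)
            for name in reversed(migrations[wanted:applied_count])]


print(plan(0, '0001', MIGRATIONS))
print(plan(2, 'zero', MIGRATIONS))
```

Note that rolling back walks the history in reverse order, undoing the most recent migration first.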

Foreshadowing: We did not actually need to roll back the database to zero. South will insert initial data when migrate is invoked, even if there are no migrations. We shall see how this gets us into trouble in the next section.

Lancelot's Betrayal

As luck would have it, our model and database need modification as data changes. In honor of Lancelot, we need to add the following field to our Knight model.

# roundtable/models.py
class Knight(models.Model):
    ...
    traitor = models.BooleanField()

I am intentionally not following best practice in this code. Starting in Django 1.6, the BooleanField has a default of None rather than False. It is therefore recommended that developers provide an explicit default value (True or False), which I have deliberately omitted.

Without South (and with a larger, production database), adding a new field to our model would put us in a pickle. Do we write custom SQL to add the column? What if we are running a different database locally than on production? With South, the way is clear. We simply create another schema-migration file, which will generate the commands to add (or remove, if rolling back) our new traitor field, and then use migrate to apply these to our database.

Our current situation also allows for a more fundamental understanding of migration files. Consider that South needs to figure out what changes we've made to our model. Specifically, it needs to know the original state of our models.py file in order to compare it with our new state. South cannot rely on the database; there is no telling what state it is in, or even if it exists. So how does South know what we've changed? It stores the entire model state in the migration file. The majority of 0001_initial.py is dedicated to this task:

    models = {
        'roundtable.knight': {
            'Meta': {'object_name': 'Knight'},
            'id': ('django.db.models.fields.AutoField',
                   [],
                   {'primary_key': 'True'}),
            'name': ('django.db.models.fields.CharField',
                     [],
                     {'max_length': '63'})}}

This state is referred to colloquially as the frozen model. When we use schemamigration on our roundtable app, South will compare /roundtable/models.py to the models dictionary (the frozen model) in 0001_initial.py to determine what changes need to be made, allowing it to create a second schema-migration file. However, that is not all it will do. It will also note that the new field we have added cannot be null, but we did not put a default value in the field itself. South needs a default value in case there is existing data in the database. Ignoring Lancelot for the moment, we will set the value to False, with the intention of dealing with Lancelot momentarily.

$ ./manage.py schemamigration roundtable --auto
 ? The field 'Knight.traitor' does not have a default specified, yet is NOT NULL.
 ? Since you are adding this field, you MUST specify a default
 ? value to use for existing rows. Would you like to:
 ?  1. Quit now, and add a default to the field in models.py
 ?  2. Specify a one-off value to use for existing columns now
 ? Please select a choice: 2
 ? Please enter Python code for your one-off default value.
 ? The datetime module is available, so you can do e.g. datetime.date.today()
 >>> False
 + Added field traitor on roundtable.Knight
Created 0002_auto__add_field_knight_traitor.py. You can now apply this migration with: ./manage.py migrate roundtable
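The comparison between the frozen model and the current model can be pictured as a dictionary diff. The sketch below is a deliberately simplified illustration, not South's algorithm: the helper diff_fields is made up for this example, and the triple recorded for the traitor field is an assumption about what South would freeze.

```python
# Field names mapped to their frozen definitions, as in 0001_initial.py.
frozen = {
    'id': ('django.db.models.fields.AutoField', [], {'primary_key': 'True'}),
    'name': ('django.db.models.fields.CharField', [], {'max_length': '63'}),
}

# Hypothetical frozen form of the edited model; the exact triple for the
# traitor field is assumed for illustration.
current = dict(frozen)
current['traitor'] = ('django.db.models.fields.BooleanField', [], {})


def diff_fields(old, new):
    """Report which field names were added, removed, or redefined."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(name for name in set(old) & set(new)
                     if old[name] != new[name])
    return {'added': added, 'removed': removed, 'changed': changed}


changes = diff_fields(frozen, current)
print(changes)
```

Because the comparison only needs the previous migration file and models.py, it works even when no database exists.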

With a proper schema-migration file in place, we can apply the new change to the database, which will create the traitor column in roundtable_knight and fill the existing rows with the equivalent of Python's False.

$ ./manage.py migrate roundtable
Running migrations for roundtable:
 - Migrating forwards to 0002_auto__add_field_knight_traitor.
 > roundtable:0002_auto__add_field_knight_traitor
 - Loading initial data for roundtable.
[Error Output Omitted]
Problem installing fixture [...]
[...] NOT NULL constraint failed: roundtable_knight.traitor

Unfortunately, attempting to run migrate fails. The reason is simple: South is trying to apply the data in /roundtable/fixtures/initial_data.json a second time. This does not work because the fixture does not provide data for our mandatory new traitor field.
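The failure can be reproduced in isolation with Python's built-in sqlite3 module. The sketch assumes the table ends up with a NOT NULL traitor column and that the fixture row supplies no value for it; the column types are approximations rather than the exact SQL South emits.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# The table as it stands after the traitor column has been added.
conn.execute(
    "CREATE TABLE roundtable_knight ("
    "id INTEGER PRIMARY KEY, "
    "name VARCHAR(63) NOT NULL, "
    "traitor BOOL NOT NULL)")

error_message = None
try:
    # The fixture only knows about id and name, so traitor ends up NULL.
    conn.execute(
        "INSERT INTO roundtable_knight (id, name, traitor) "
        "VALUES (1, 'Bedevere', NULL)")
except sqlite3.IntegrityError as exc:
    error_message = str(exc)

print(error_message)
```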

Initial Data Fixtures and Data Migrations

The initial data system does not play well with South, and initial data fixtures are thus typically frowned upon. Consider that for initial data fixtures to work, we would need to keep the fixture up to date for every new migration. However, doing so would invalidate the use of these fixtures for all intermediary database schemas. Initial data fixtures are thus undesirable when using schema migrations.

However, we still want to provide a project that automatically loads data. This goal is still achievable, but we must take a few steps back to get there. Our first task is to entirely remove the initial data dependency we have created.

We first roll back our database to a completely fresh start by invoking:

$ ./manage.py migrate roundtable zero

We may then delete the 0002_auto__add_field_knight_traitor.py migration file.

To reset completely, we remove the traitor field from the Knight model. We should also rename the fixture file we have to prevent Django from loading it automatically when either syncdb or migrate is called.

$ mv roundtable/fixtures/initial_data.json \
   roundtable/fixtures/0002_add_knight_data.json

The name I have given the renamed fixture file may seem cryptic at the moment, but it will make sense before the end of the section.

While it is possible to rename the fixture and remove the traitor field before rolling back the database, it is imperative that the 0002_auto__add_field_knight_traitor.py migration file exist for the rollback. The order of operations for the rollback and migration file deletion is important.

We can then bring our database back to an empty state, with the roundtable_knight table created for our Knight model but with no data loaded.

$ ./manage.py migrate roundtable 0001
 - Soft matched migration 0001 to 0001_initial.
Running migrations for roundtable:
 - Migrating forwards to 0001_initial.
 > roundtable:0001_initial
 - Loading initial data for roundtable.
Installed 0 object(s) from 0 fixture(s)

Our migration no longer loads data, as desired.

What we want now is not a schema-migration file but a data-migration file. Whereas schema migrations alter the structure of the database, data migrations change the data in the database.

We can create a data migration file for South by invoking the datamigration command, passing in the app for which we wish to create the migration, followed by the name of the migration.

$ ./manage.py datamigration roundtable add_knight_data
Created 0002_add_knight_data.py.

If you examine the file generated by South, you'll see that it doesn't actually do anything. The forwards() and backwards() methods are empty. However, note that the frozen model is still included.

# roundtable/migrations/0002_add_knight_data.py
# -*- coding: utf-8 -*-
from south.utils import datetime_utils as datetime
from south.db import db
from south.v2 import DataMigration
from django.db import models

class Migration(DataMigration):

    def forwards(self, orm):
        "Write your forwards methods here."

    def backwards(self, orm):
        "Write your backwards methods here."

    models = {
        u'roundtable.knight': {
            'Meta': {'object_name': 'Knight'},
            u'id': (
                'django.db.models.fields.AutoField',
                [],
                {'primary_key': 'True'}),
            'name': (
                'django.db.models.fields.CharField',
                [],
                {'max_length': '63'})
        }
    }

    complete_apps = ['roundtable']
    symmetrical = True

The developer must now implement the methods in the skeleton data migration.

We could choose to implement forwards() such that it uses our fixture file.

# roundtable/migrations/0002_add_knight_data.py
from django.core.management import call_command
    ...
    def forwards(self, orm):
        call_command('loaddata', '0002_add_knight_data.json')

Hopefully, the name of our fixture now makes sense, as it shares the name of the data migration that uses it.

Should we roll back our database and apply all our migrations, we will now see the data being loaded by our data migration, as the fixture system reports the objects it installs. Note, however, that the initial data loader at the very end loads no data.

$ ./manage.py migrate roundtable zero
$ ./manage.py migrate roundtable
Running migrations for roundtable:
 - Migrating forwards to 0002_add_knight_data.
 > roundtable:0001_initial
 > roundtable:0002_add_knight_data
 - Migration 'roundtable:0002_add_knight_data' is marked for no-dry-run.
Installed 7 object(s) from 1 fixture(s)
 - Loading initial data for roundtable.
Installed 0 object(s) from 0 fixture(s)

Our current method has the advantage of separating function and data, but, given the very nature of a data migration, could be deemed unnecessary. We could also generate the data directly in the migration file, thanks to the orm variable provided by South.

# roundtable/migrations/0002_add_knight_data.py
    def forwards(self, orm):
        orm.Knight.objects.bulk_create([
            orm.Knight(name='Bedevere'),
            orm.Knight(name='Bors'),
            orm.Knight(name='Ector'),
            orm.Knight(name='Galahad'),
            orm.Knight(name='Gawain'),
            orm.Knight(name='Lancelot'),
            orm.Knight(name='Robin'),
        ])

The use of the orm variable to access the Knight model is crucial. Consider that when the migration is applied, the Knight model may look completely different in roundtable/models.py or may not exist at all. When applying schema or data migrations, we are interested in the historical version of the model: the way the model was coded at the time the migration was generated. South builds this historical model using the frozen model and then provides it via the orm variable.

In the event that we roll back the database and apply our migrations again, note that the output does not tell us that data is being loaded.

$ ./manage.py migrate roundtable zero
$ ./manage.py migrate roundtable
Running migrations for roundtable:
 - Migrating forwards to 0002_add_knight_data.
 > roundtable:0001_initial
 > roundtable:0002_add_knight_data
 - Migration 'roundtable:0002_add_knight_data' is marked for no-dry-run.
 - Loading initial data for roundtable.
Installed 0 object(s) from 0 fixture(s)

Either of the two methods used above is a valid choice, but I happen to prefer the latter, as it centralizes the migration in a single location, removing the possibility that another developer might have misunderstood the purpose of the fixture (and accidentally deleted or modified it).

With our forwards() correctly coded, we need only implement the backwards() method.

# roundtable/migrations/0002_add_knight_data.py
    def backwards(self, orm):
        orm.Knight.objects.all().delete()

These methods allow us to move backwards and forwards between the first two migrations. Observe that in all of the previous examples, we returned to the zero state. This was a necessity. Without a working backwards() method, migrating from the 0002 state to the 0001 state would have left data in the database. Furthermore, every re-application of the 0002 migration from the 0001 state would have added duplicate data to the database. However, now that we have programmed backwards(), our migration will remove data during a rollback. This is easily verifiable.

$ ./manage.py migrate roundtable 0001
 - Soft matched migration 0001 to 0001_initial.
Running migrations for roundtable:
 - Migrating backwards to just after 0001_initial.
 < roundtable:0002_add_knight_data
 - Migration 'roundtable:0002_add_knight_data' is marked for no-dry-run.
$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.all()
[]

Coding backwards() should not be considered optional.
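The duplication problem described above is easy to demonstrate outside of South. The sketch below uses Python's built-in sqlite3 module with hand-written forwards() and backwards() functions that mimic our data migration; it is an illustration under those assumptions, not code from the project.

```python
import sqlite3

KNIGHTS = ['Bedevere', 'Bors', 'Ector', 'Galahad',
           'Gawain', 'Lancelot', 'Robin']


def forwards(conn):
    # Load the knights, as the data migration does.
    conn.executemany(
        "INSERT INTO roundtable_knight (name) VALUES (?)",
        [(name,) for name in KNIGHTS])


def backwards(conn):
    # Remove the knights again on rollback.
    conn.execute("DELETE FROM roundtable_knight")


def count(conn):
    return conn.execute(
        "SELECT COUNT(*) FROM roundtable_knight").fetchone()[0]


conn = sqlite3.connect(':memory:')
conn.execute(
    "CREATE TABLE roundtable_knight ("
    "id INTEGER PRIMARY KEY, name VARCHAR(63) NOT NULL)")

# Re-applying the data migration without a rollback duplicates every knight.
forwards(conn)
forwards(conn)
without_rollback = count(conn)

# With backwards() run before re-applying, the data stays consistent.
backwards(conn)
forwards(conn)
with_rollback = count(conn)

print(without_rollback, with_rollback)
```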

In conclusion, data migrations should be used instead of the initial data system.

Dealing with Lancelot

With a project that correctly loads our Knight data, we can return our attention to Lancelot. The steps we take here are identical to the ones before.

We first add a traitor field, still without the default field option.

# roundtable/models.py
class Knight(models.Model):
    ...
    traitor = models.BooleanField()

We then generate a schema-migration file with South and instruct South to set the traitor field to False for existing rows.

$ ./manage.py schemamigration roundtable --auto
 ? The field 'Knight.traitor' does not have a default specified, yet is NOT NULL.
 ? Since you are adding this field, you MUST specify a default
 ? value to use for existing rows. Would you like to:
 ?  1. Quit now, and add a default to the field in models.py
 ?  2. Specify a one-off value to use for existing columns now
 ? Please select a choice: 2
 ? Please enter Python code for your one-off default value.
 ? The datetime module is available, so you can do e.g. datetime.date.today()
 >>> False
 + Added field traitor on roundtable.Knight
Created 0003_auto__add_field_knight_traitor.py. You can now apply this migration with: ./manage.py migrate roundtable

Recall that South will be using the frozen model from the previous migration file to automatically determine what has changed.

We can then apply the new migration.

$ ./manage.py migrate roundtable
Running migrations for roundtable:
 - Migrating forwards to 0003_auto__add_field_knight_traitor.
 > roundtable:0002_add_knight_data
 - Migration 'roundtable:0002_add_knight_data' is marked for no-dry-run.
 > roundtable:0003_auto__add_field_knight_traitor
 - Loading initial data for roundtable.
Installed 0 object(s) from 0 fixture(s)

We can now take the extra step of changing the data as we desire, labeling Lancelot the traitor he is. We first create another data migration file.

$ ./manage.py datamigration roundtable label_lancelot_traitor
Created 0004_label_lancelot_traitor.py.

We then implement the forwards() and backwards() methods such that Lancelot's traitor status is switched, as appropriate.

# roundtable/migrations/0004_label_lancelot_traitor.py
class Migration(DataMigration):

    def forwards(self, orm):
        lancelot = orm.Knight.objects.get(
            name__iexact="Lancelot")
        lancelot.traitor = True
        lancelot.save()

    def backwards(self, orm):
        lancelot = orm.Knight.objects.get(
            name__iexact="Lancelot")
        lancelot.traitor = False
        lancelot.save()

Prior to running our migration, we can verify that Lancelot is not considered a traitor.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.get(name__iexact="Lancelot").traitor
False

We then apply our new data migration.

$ ./manage.py migrate roundtable
Running migrations for roundtable:
 - Migrating forwards to 0004_label_lancelot_traitor.
 > roundtable:0004_label_lancelot_traitor
 - Migration 'roundtable:0004_label_lancelot_traitor' is marked for no-dry-run.
 - Loading initial data for roundtable.
Installed 0 object(s) from 0 fixture(s)

This allows us to verify our work.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.get(name__iexact="Lancelot").traitor
True

Migrations with South are an incredibly powerful tool and should be part of any solid Django website toolkit prior to Django 1.7, protecting your database schema and potentially important site data.

Migrations in Django 1.7

Given that Andrew Godwin, the creator of South, is behind the migration system of Django 1.7, it's a safe bet that the system will be just as good, if not better. Helpfully, while the commands are different, the workflow for migrations is the same.

Recall that at this point, we have created a camelot project with a roundtable app on our Django 1.7 site. To review, this involved the following commands:

$ django-admin startproject camelot
$ cd camelot/
$ ./manage.py startapp roundtable

We have also added 'roundtable', to the list of installed applications in the site settings (but not South!), and coded the Knight model. Notably, however, we have not created a database as we did with Django 1.6. Instead, we can skip directly to generating migrations.

$ ./manage.py makemigrations
Migrations for 'roundtable':
  0001_initial.py:
    - Create model Knight

Just as with South, Django will create a single migration file. However, the file, printed below, is very different.

# roundtable/migrations/0001_initial.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations


class Migration(migrations.Migration):

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Knight',
            fields=[
                ('id',
                 models.AutoField(
                     primary_key=True,
                     verbose_name='ID',
                     serialize=False,
                     auto_created=True)),
                ('name',
                 models.CharField(
                     max_length=63)),
            ],
            options={},
            bases=(models.Model,),
        ),
    ]

South migration files have three central parts: forwards(), backwards(), and the frozen models. Our initial schema migration in Django 1.7 consists only of the operations list, which contains a single operation and effectively acts as the equivalent of forwards(). There is no hand-written equivalent to backwards() for schema operations: native migrations are clever enough to work out the reverse of each operation automatically.
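To make the idea of automatic reversal concrete, here is a toy sketch in plain Python. It is not Django's implementation: the AddField class, the dictionary-based schema, and the apply_all()/unapply_all() helpers are all stand-ins invented for illustration. The point is only that an operation that knows what it adds also knows how to take it away (Django's real operation classes define both directions, as database_forwards() and database_backwards()).

```python
class AddField:
    """Toy stand-in for a schema operation (not Django's real class)."""

    def __init__(self, model, name):
        self.model = model
        self.name = name

    def forwards(self, schema):
        # Add the column to the in-memory schema.
        schema[self.model].append(self.name)

    def backwards(self, schema):
        # The inverse follows from the same arguments: drop the column.
        schema[self.model].remove(self.name)


def apply_all(operations, schema):
    for operation in operations:
        operation.forwards(schema)


def unapply_all(operations, schema):
    # Reversing a migration undoes its operations in the opposite order.
    for operation in reversed(operations):
        operation.backwards(schema)


schema = {"knight": ["id", "name"]}
operations = [AddField("knight", "traitor")]

apply_all(operations, schema)
after_apply = list(schema["knight"])

unapply_all(operations, schema)
after_unapply = list(schema["knight"])
```

Because the developer only ever writes the forward description, there is nothing to keep in sync by hand, which is the main practical difference from South's paired methods.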

With our new migration file, we can now generate a database and apply the migration file not only for our app but also for all of the contributed library apps currently included in the project settings.

$ ./manage.py migrate
Operations to perform:
  Apply all migrations: auth, admin, contenttypes, roundtable, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying roundtable.0001_initial... OK
  Applying sessions.0001_initial... OK

Data Migrations in Django 1.7

Now that we have the basic structure of our database, we will want to fill it with data. We could start to do so manually, as before, and then output the result in a fixture.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.bulk_create([
...     Knight(name='Bedevere'),
...     Knight(name='Bors'),
...     Knight(name='Ector'),
...     Knight(name='Galahad'),
...     Knight(name='Gawain'),
...     Knight(name='Lancelot'),
...     Knight(name='Robin'),
... ])
$ ./manage.py dumpdata roundtable --indent=2 > \
      roundtable/fixtures/initial_data.json

Given our difficulties with initial data fixtures, it should come as no surprise that the system has been deprecated to make interaction with migrations more straightforward. Fixtures will still work fine, but there is no longer a mechanism in place to automatically load fixtures when creating or altering a database.

For the moment, let's keep our fixture file but rename it as a reminder that initial data fixtures no longer work in Django 1.7.

$ mv roundtable/fixtures/initial_data.json \
     roundtable/fixtures/knight_data.json

Instead of the initial data system, Django now anticipates our use of data migrations. Given their utility in the previous section, this should not be a surprise either. Django 1.7 makes less of a distinction between schema and data migrations. Instead of generating a data migration, we simply ask for an empty migration file for our roundtable app.

$ ./manage.py makemigrations --empty roundtable
Migrations for 'roundtable':
  0002_auto_20140903_1600.py:

Django also does not let us choose the name of the migration file, naming it for us according to the date and time. I opt to rename the file for clarity. I am unaware of whether this is considered good or bad practice, but I find it helpful. (EDIT: Andrew Godwin thinks this is good practice.)

$ mv roundtable/migrations/0002_auto_20140903_1600.py \
     roundtable/migrations/0002_add_knight_data.py

The empty file is printed below.

# roundtable/migrations/0002_add_knight_data.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations


class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0001_initial'),
    ]

    operations = [
    ]

Observe that the dependencies list contains a tuple, which informs Django of the parent migration for each app. A migration file is for a single app, and so the list may seem unnecessary at first. However, in the event of a foreign key or a many-to-many relationship, the dependencies list will include tuples pointing to migrations of other apps, ensuring that the models and the database remain in their correct states.

The dependencies list is further useful for teams of developers, as the system can help with conflicting migration files.
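As a rough illustration of why the dependencies list matters, the sketch below orders a handful of migrations so that each one runs after everything it depends on. This is a plain depth-first ordering written for this article, not Django's own graph code, and the quests app is purely hypothetical (imagine a model there with a foreign key to Knight).

```python
def migration_order(dependencies):
    """Return migrations so that each one follows everything it depends on.

    `dependencies` maps (app_label, migration_name) to a list of parent keys.
    """
    ordered = []
    visiting = set()

    def visit(key):
        if key in ordered:
            return
        if key in visiting:
            raise ValueError("circular dependency at %s.%s" % key)
        visiting.add(key)
        for parent in dependencies.get(key, []):
            visit(parent)
        visiting.discard(key)
        ordered.append(key)

    # Sorting only makes the result deterministic for the example.
    for key in sorted(dependencies):
        visit(key)
    return ordered


graph = {
    ("roundtable", "0001_initial"): [],
    ("roundtable", "0002_add_knight_data"): [("roundtable", "0001_initial")],
    # Hypothetical second app whose first migration needs the Knight table.
    ("quests", "0001_initial"): [("roundtable", "0001_initial")],
}

order = migration_order(graph)
```

Even though quests sorts before roundtable alphabetically, its migration lands after roundtable's initial migration, because the dependency says it must.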

In South, our job was to implement forwards() and backwards(). In Django 1.7, our job is to add operations to the migration file. Django provides quite a few, but none are specifically targeted at data migrations. Instead, the official recommendation is to use the RunPython operation. The operation expects a callable and expects that callable to act like South's forwards(). It also accepts a reverse_code parameter, a callable that acts like backwards(). In both cases, the callables must accept two arguments: an instance of an app registry and an instance of the SchemaEditor.

We will take an in-depth look at the new app registry in Part III of this article series. However, we will not cover the SchemaEditor.

Our new-found knowledge allows us to fill in our new migration file with a few skeleton functions and a call to the RunPython operation.

# roundtable/migrations/0002_add_knight_data.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import models, migrations


def add_knight_data(apps, schema_editor):
    pass


def remove_knight_data(apps, schema_editor):
    pass


class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0001_initial'),
    ]

    operations = [
        migrations.RunPython(
            add_knight_data,
            reverse_code=remove_knight_data),
    ]

The app registry (the apps argument to our functions) comes from Django 1.7's new app-loading mechanism. However, the app registry object being passed is not Django's current app registry. The migration system builds a registry that reflects the point in time when the migration file was created. The interplay between migrations and the app loader is quite revealing: Django uses the two to calculate, on the fly and in memory, the historical model, that is, what the models looked like at the time the migration file was created. This is also why the app registry is passed to our callables. Should the developer opt to query Django directly (for instance, by importing the model), Django will return the most recent version of the models in the app rather than the version that the migration system has built and expects.

While similar to South on the surface, this approach is actually quite different. South used the frozen model to provide the historical model via the orm variable. Native migrations, however, use the new app registry and previous migrations to provide the historical model via the apps argument to our function. The net effect is that new migration files are simpler and easier to edit.
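The following toy sketch shows the general idea of rebuilding a historical model by replaying earlier migrations. It is a simplification written for this article: the tuple-based operations and the historical_fields() helper are invented, and Django's real state tracking is considerably richer. The 0003_knight_traitor entry stands for any later migration that adds a field.

```python
def historical_fields(migrations, up_to):
    """Replay operations from the first migration through `up_to`."""
    state = {}
    for name, operations in migrations:
        for kind, model, payload in operations:
            if kind == "create_model":
                state[model] = list(payload)
            elif kind == "add_field":
                state[model].append(payload)
        if name == up_to:
            break
    return state


# Invented, simplified description of a migration history.
MIGRATIONS = [
    ("0001_initial", [("create_model", "knight", ["id", "name"])]),
    ("0002_add_knight_data", []),
    ("0003_knight_traitor", [("add_field", "knight", "traitor")]),
]

at_0002 = historical_fields(MIGRATIONS, "0002_add_knight_data")
at_0003 = historical_fields(MIGRATIONS, "0003_knight_traitor")
```

A data migration running at 0002 sees a Knight without the later field, no matter what models.py currently says. Nothing needs to be frozen into the file, since the history itself is the record.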

While not discussed here, the SchemaEditor object (the second parameter passed to our callable) is the second half of Django's migration system. Essentially, it takes the operations listed in migration files and transforms them into the SQL commands the database expects.

We could implement our add_knight_data() function using fixtures, as we did in South:

# roundtable/migrations/0002_add_knight_data.py
from django.core.management import call_command


def add_knight_data(apps, schema_editor):
    call_command('loaddata', 'knight_data.json')

Resetting our migrations and applying them will reveal their use, as before.

$ ./manage.py migrate roundtable zero
Operations to perform:
  Unapply all migrations: roundtable
Running migrations:
  No migrations to apply.
$ ./manage.py migrate roundtable
Operations to perform:
  Apply all migrations: roundtable
Running migrations:
  Applying roundtable.0001_initial... OK
  Applying roundtable.0002_add_knight_data...
  Installed 7 object(s) from 1 fixture(s)
 OK

However, as before, I would prefer to centralize the data migration to a single file, generating the data directly.

# roundtable/migrations/0002_add_knight_data.py
def add_knight_data(apps, schema_editor):
    Knight = apps.get_model('roundtable', 'Knight')
    Knight.objects.bulk_create([
        Knight(name='Bedevere'),
        Knight(name='Bors'),
        Knight(name='Ector'),
        Knight(name='Galahad'),
        Knight(name='Gawain'),
        Knight(name='Lancelot'),
        Knight(name='Robin'),
    ])

We can, as before, roll back all the migrations and then reapply them.

$ ./manage.py migrate roundtable zero
Operations to perform:
  Unapply all migrations: roundtable
Running migrations:
  Unapplying roundtable.0002_add_knight_data... OK
  Unapplying roundtable.0001_initial... OK
$ ./manage.py migrate roundtable
Operations to perform:
  Apply all migrations: roundtable
Running migrations:
  Applying roundtable.0001_initial... OK
  Applying roundtable.0002_add_knight_data... OK

We can verify that our migration is being applied correctly by using the shell, as before.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.all()
[<Knight: Bedevere>,
 <Knight: Bors>,
 <Knight: Ector>,
 <Knight: Galahad>,
 <Knight: Gawain>,
 <Knight: Lancelot>,
 <Knight: Robin>]

This leaves us only to program our remove_knight_data function, which acts as the equivalent to backwards().

# roundtable/migrations/0002_add_knight_data.py
def remove_knight_data(apps, schema_editor):
    Knight = apps.get_model('roundtable', 'Knight')
    Knight.objects.all().delete()

With that done, we have a very effective data migration file in Django 1.7.
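One caveat worth considering: Knight.objects.all().delete() removes every knight, including any added after this migration was first applied. A more conservative variant, sketched below, shares one list of names between the two functions and deletes only those rows. The KNIGHT_NAMES constant is my own addition, and note that the match is by name only, so a knight added later with one of these names would be removed as well.

```python
# roundtable/migrations/0002_add_knight_data.py (alternative sketch)

# Hypothetical module-level constant shared by both functions.
KNIGHT_NAMES = [
    'Bedevere', 'Bors', 'Ector', 'Galahad',
    'Gawain', 'Lancelot', 'Robin',
]


def add_knight_data(apps, schema_editor):
    # Use the historical model supplied by the migration system.
    Knight = apps.get_model('roundtable', 'Knight')
    Knight.objects.bulk_create(
        [Knight(name=name) for name in KNIGHT_NAMES])


def remove_knight_data(apps, schema_editor):
    Knight = apps.get_model('roundtable', 'Knight')
    # Delete only the rows this migration is responsible for.
    Knight.objects.filter(name__in=KNIGHT_NAMES).delete()
```

Whether this is preferable depends on what "unapplying" should mean for your data; for our toy project either version behaves the same.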

Dealing with Lancelot (fool me twice...)

As in our South example, we now find ourselves betrayed by Lancelot and must add the traitor field to our Knight model.

# roundtable/models.py
class Knight(models.Model):
    ...
    traitor = models.BooleanField()

Note that again, I am intentionally not providing a default value to the field. Whereas Django 1.6 didn't seem to notice, Django 1.7 does. The new check system, if invoked, will warn us of the problem.

$ ./manage.py check
System check identified some issues:

WARNINGS:
roundtable.Knight.traitor: (1_6.W002) BooleanField does not have a default value.
    HINT: Django 1.6 changed the default value of BooleanField from False to None. See https://docs.djangoproject.com/en/1.6/ref/models/fields/#booleanfield for more information.

System check identified 1 issue (0 silenced).

Should you forget to run the check system, you need not worry. Django 1.7 will warn you about the problem when you attempt to generate the migration file for the change.

$ ./manage.py makemigrations
System check identified some issues:

WARNINGS:
roundtable.Knight.traitor: (1_6.W002) BooleanField does not have a default value.
    HINT: Django 1.6 changed the default value of BooleanField from False to None. See https://docs.djangoproject.com/en/1.6/ref/models/fields/#booleanfield for more information.
You are trying to add a non-nullable field 'traitor' to knight without a default;
we can't do that (the database needs something to populate existing rows).
Please select a fix:
 1) Provide a one-off default now (will be set on all existing rows)
 2) Quit, and let me add a default in models.py
Select an option:

At this point, you should pick option two, fix the model, and then come back. Instead, to demonstrate why it's a bad idea, we will pick option one, set a one-time default, and continue as if nothing were amiss.

Select an option: 1
Please enter the default value now, as valid Python
The datetime module is available, so you can do e.g. datetime.date.today()
>>> False
Migrations for 'roundtable':
  0003_knight_traitor.py:
    - Add field traitor to knight

Django's new check system, however, is quite persistent, as it should be. When we apply the migration, it will continue to warn us about the problem.

$ ./manage.py migrate
System check identified some issues:

WARNINGS:
roundtable.Knight.traitor: (1_6.W002) BooleanField does not have a default value.
    HINT: Django 1.6 changed the default value of BooleanField from False to None. See https://docs.djangoproject.com/en/1.6/ref/models/fields/#booleanfield for more information.
Operations to perform:
  Apply all migrations: auth, sessions, admin, roundtable, contenttypes
Running migrations:
  Applying roundtable.0003_knight_traitor... OK

While the migration has successfully applied, it is in our best interest to fix the issue directly in our models. The Knight model file declarations should thus read:

# roundtable/models.py
@python_2_unicode_compatible
class Knight(models.Model):
    name = models.CharField(max_length=63)
    traitor = models.BooleanField(default=False)

Django 1.7 will detect the change and create a new migration when we invoke makemigrations.

$ ./manage.py makemigrations
Migrations for 'roundtable':
  0004_auto_20140903_2035.py:
    - Alter field traitor on knight

A comparison of the third and fourth migrations is in order. The third migration adds the traitor field to the Knight model.

# roundtable/migrations/0003_knight_traitor.py
class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0002_add_knight_data'),
    ]

    operations = [
        migrations.AddField(
            model_name='knight',
            name='traitor',
            field=models.BooleanField(default=False),
            preserve_default=False,
        ),
    ]

The subtlety here is the use of preserve_default. It informs the migration system of the nature of the default parameter. If preserve_default is True, then the default parameter of the field being added is part of the field. If preserve_default is False, as is the case, then it signifies the default is a one-off setting, provided by the developer after being prompted, as we were.
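A toy model of the distinction may help. The add_field() function below is invented for this article and works on plain dictionaries rather than a database; it only illustrates the idea that a one-off default fills existing rows without becoming part of the field's definition.

```python
NOT_PROVIDED = object()  # sentinel meaning "the field declares no default"


def add_field(rows, field_state, name, default, preserve_default=True):
    # Existing rows always receive the supplied default.
    for row in rows:
        row[name] = default
    # Only a preserved default remains part of the field afterwards.
    field_state[name] = default if preserve_default else NOT_PROVIDED


# One-off default, as when answering the makemigrations prompt.
rows_one_off = [{"name": "Lancelot"}]
state_one_off = {}
add_field(rows_one_off, state_one_off, "traitor", False,
          preserve_default=False)

# Default declared on the model itself.
rows_kept = [{"name": "Lancelot"}]
state_kept = {}
add_field(rows_kept, state_kept, "traitor", False,
          preserve_default=True)
```

In both cases Lancelot's row ends up with traitor set to False; the difference is only in what the field itself remembers afterwards.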

The fourth migration alters the traitor field.

# roundtable/migrations/0004_auto_20140903_2035.py
class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0003_knight_traitor'),
    ]

    operations = [
        migrations.AlterField(
            model_name='knight',
            name='traitor',
            field=models.BooleanField(default=False),
        ),
    ]

The AlterField operation shown here carries no preserve_default argument. The default field option is thus exactly what it seems, a permanent part of the field, and it is the key modification.

For the curious, if you were to remove the default=False field option such that the traitor field now read (again):

# roundtable/models.py
class Knight(models.Model):
    ...
    traitor = models.BooleanField()

Generating a fifth migration would yield the following, extremely similar file:

$ cat roundtable/migrations/0005_auto_20140903_2040.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations


class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0004_auto_20140903_2035'),
    ]

    operations = [
        migrations.AlterField(
            model_name='knight',
            name='traitor',
            field=models.BooleanField(),
        ),
    ]

Observe that the BooleanField() has no field options. Note that I have not included these changes in the supplied Git repository, and that the article proceeds without the changes in this aside.

Developers should thus pay close attention to the output of commands, as the check system may help them avoid simple mistakes. However, should you end up in this position, the solution is simple (assuming you have not already committed the changes to a public shared repository). Recall that one of the key design choices of the migration system is that migration files are meant to be editable. Developers may opt to edit the migration files or regenerate them completely.

If you have already committed these changes to a public repository, you may find the squashmigrations command to be helpful.

Both courses of action are quite simple. Assuming we have already added a corrected traitor field (as we have done), we could either:

  • Regenerate the file:
    1. Delete the third and fourth migration files
    2. Execute $ ./manage.py makemigrations
  • Edit the migration files:
    1. Delete the fourth migration
    2. Set preserve_default to True in the third migration file.

I find the first option most appealing, as it is less prone to error. I first remove the migrations:

$ rm roundtable/migrations/0003_knight_traitor.py
$ rm roundtable/migrations/0004_auto_20140903_2035.py

And then generate a new one.

$ ./manage.py makemigrations roundtable
Migrations for 'roundtable':
  0003_knight_traitor.py:
    - Add field traitor to knight

The file is printed below. Observe that preserve_default is set to True.

# roundtable/migrations/0003_knight_traitor.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations


class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0002_add_knight_data'),
    ]

    operations = [
        migrations.AddField(
            model_name='knight',
            name='traitor',
            field=models.BooleanField(default=False),
            preserve_default=True,
        ),
    ]

South veterans may note that we have not made use of the equivalent of the --update flag provided by South's schemamigration command. This flag would have allowed us to simply replace the original third migration with this latest one. Django's new migration system does not supply an equivalent. The expectation is that developers will be more proactive with their migration files, either editing them directly or replacing them, as we did above.

We now have a correct database schema. As before, we may now use a data migration to set Lancelot's traitor status to True. We start by generating an empty migration file.

$ ./manage.py makemigrations --empty roundtable
Migrations for 'roundtable':
  0004_auto_20140903_2045.py:

I will opt to change the file name.

$ mv roundtable/migrations/0004_auto_20140903_2045.py \
     roundtable/migrations/0004_label_lancelot_traitor.py

Given the empty migration file, we first add the RunPython operation, and anticipate the creation of two functions. The first function will act as an equivalent to South's forwards(), while the other, passed as a parameter to reverse_code, will act as an equivalent to backwards().

# roundtable/migrations/0004_label_lancelot_traitor.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import models, migrations

...

class Migration(migrations.Migration):

    dependencies = [
        ('roundtable', '0003_knight_traitor'),
    ]

    operations = [
        migrations.RunPython(
            set_lancelot_traitor,
            reverse_code=unset_lancelot_traitor),
    ]

We can thus begin by programming a function called set_lancelot_traitor. Recall that RunPython will pass in two arguments. The first will be an app registry object, while the second will be a schema editor. Recall that the migration system builds the app registry in memory so that it reflects the state of the apps and models at the time the migration file was generated. We can thus use the app registry object being passed to obtain the historical Knight model. Using the model manager on the historical model, we can get the data related to Lancelot and set his traitor status to True in the database, remembering to save once we're done.

# roundtable/migrations/0004_label_lancelot_traitor.py
def set_lancelot_traitor(apps, schema_editor):
    Knight = apps.get_model('roundtable', 'Knight')
    lancelot = Knight.objects.get(
        name__iexact='Lancelot')
    lancelot.traitor = True
    lancelot.save()

Our unset_lancelot_traitor, acting as the backwards() function, does the same thing as the function above, except that we are setting Lancelot's traitor status to False. The code is thus nearly identical.

# roundtable/migrations/0004_label_lancelot_traitor.py
def unset_lancelot_traitor(apps, schema_editor):
    Knight = apps.get_model('roundtable', 'Knight')
    lancelot = Knight.objects.get(
        name__iexact='Lancelot')
    lancelot.traitor = False
    lancelot.save()
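Since the two functions differ only in the value they assign, one option is to move the shared logic into a helper. The sketch below is one possible arrangement; the _set_traitor_status() helper is my own name and not part of Django. Both public functions stay at module level with the (apps, schema_editor) signature that RunPython expects.

```python
# roundtable/migrations/0004_label_lancelot_traitor.py (alternative sketch)

def _set_traitor_status(apps, is_traitor):
    # Hypothetical helper: look up Lancelot on the historical model
    # and record the requested status.
    Knight = apps.get_model('roundtable', 'Knight')
    lancelot = Knight.objects.get(
        name__iexact='Lancelot')
    lancelot.traitor = is_traitor
    lancelot.save()


def set_lancelot_traitor(apps, schema_editor):
    _set_traitor_status(apps, True)


def unset_lancelot_traitor(apps, schema_editor):
    _set_traitor_status(apps, False)
```

For two short functions the duplication is arguably harmless, so treat this as a matter of taste rather than a recommendation.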

Before we apply the migration, we can see that the data in our database reflects any Knight's expected loyalty.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.get(name__iexact="Lancelot").traitor
False

We then apply the new migration using migrate.

$ ./manage.py migrate roundtable
Operations to perform:
  Apply all migrations: roundtable
Running migrations:
  Applying roundtable.0004_label_lancelot_traitor... OK

In the shell, Lancelot's loyalty is now correctly reflected.

$ ./manage.py shell
>>> from roundtable.models import Knight
>>> Knight.objects.get(name__iexact="Lancelot").traitor
True

You may now email King Arthur informing him that he has a fully upgraded website.

Lessons from Camelot

While the commands and details in Django's new migration system are quite different, the overall workflow is identical, requiring three steps for each and every change to model structure:

  1. Make a change to the model
  2. Generate a migration file
  3. Use the migration file to create/alter the database

With our data migrations, we found ourselves adding three more steps:

  1. Create a data migration
  2. Edit the data migration for our purposes
  3. Apply the data migration

These last three steps were an effective improvement over using the initial data system, which is now deprecated.

In Django 1.7, the two commands necessary for working with the system are makemigrations and migrate. For creating a data migration, the --empty flag is necessary. In our example, we required nothing more.

The key to working with the new migration system is to remember that we only describe the forward direction. For each schema operation in the operations list, the migration system automatically determines the reverse, allowing migrations to be unapplied. RunPython is the exception: it can only be reversed if we supply reverse_code.

Finally, the new migration system makes direct editing of migration files less daunting and less prone to problems. Additionally, the new migration system does not contain a frozen model like South migrations do, instead calculating in memory what the model looked like at the time the migration was created. To do this, the migration system uses the new app registry to store the result.

With Camelot, our focus on creating a basic new project provided us with a solid introduction to native migrations as well as a peek at the new app loading system and the systems check framework. In Part III of this article series, to be published October 8, 2014, we will fully examine all three of these new features, as well as a few other important tools.

I will announce the release of Part III on both Twitter and Django's User Mailing List.