Writing a custom field in Django

Reusable code is one of the most talked-about silver bullets in our toolkit. Django has a huge ecosystem of reusable apps, but not nearly so many reusable fields. Apps and fields generally solve different problems, though: while an app can be retemplated or modified to fit your use case, a field is often so specialized in its purpose that you either need it or you don’t. In this entry I’ll discuss a great use case for a field we had at work while building out live March Madness coverage, and a popular field that is also an app.

Getting started

When writing a custom field, a few questions need to be answered:

  • Can a pre-existing field type be extended?
  • What data type do I want to deal with in Python?
  • How does this get stored in the database?
  • How will the serialization framework handle the data?
  • What kind of form field / widgets will be used to interact with this data?

Basketball

I’m a Jayhawk fan, and happily enough, I work on a news site dedicated solely to Jayhawk sports coverage. When modelling sports data, one thing you have to take into account is game time. This seemed like a great use case for a field that mapped game times to Python’s datetime.timedelta. After reading through ticket 2443 I came to the conclusion that the approach discussed there was going to be overkill for what we needed, which was just minutes and seconds, so I rolled my own and called it SimpleDurationField.

Extending IntegerField to store seconds

Determining the underlying storage was pretty straightforward – every database knows how to store integers, and a timedelta can be expressed as seconds. What needs to be figured out is how to make sure our models give us timedeltas straight from the database, so there’s no conversion step necessary, and additionally, to convert the timedelta to an integer for storage. Note that I’m specifying SubfieldBase as the metaclass for the field – this ensures that Django will call my field’s to_python() method (discussed below).

syntax:python
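import datetime

from django.core.exceptions import ValidationError
from django.db import models
from django.db.models.fields import IntegerField

class SimpleDurationField(IntegerField):
    # SubfieldBase makes Django call to_python() on load and on assignment.
    __metaclass__ = models.SubfieldBase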
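
To make the role of SubfieldBase a little more concrete, here is a small Django-free sketch of the general idea: a descriptor that passes every assigned value through a conversion hook. The names ConvertingAttribute, DurationConverter and Game are made up for this illustration, and this only approximates what the metaclass arranges; it is not Django's actual implementation.

```python
import datetime


class ConvertingAttribute(object):
    """Descriptor that runs field.to_python() on every assignment (hypothetical)."""

    def __init__(self, field, name):
        self.field = field
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        # Convert on the way in, so reads always see the friendly type.
        obj.__dict__[self.name] = self.field.to_python(value)


class DurationConverter(object):
    """Stand-in for a field: turns integer seconds into a timedelta."""

    def to_python(self, value):
        if isinstance(value, int):
            return datetime.timedelta(seconds=value)
        return value


class Game(object):
    # Hypothetical "model" with one converted attribute.
    time_left = ConvertingAttribute(DurationConverter(), "time_left")


game = Game()
game.time_left = 90  # e.g. a raw integer loaded from the database
print(game.time_left)  # a datetime.timedelta, not an int
```

Because the conversion happens in the descriptor, code that reads game.time_left never has to care whether the value arrived as an integer or as a timedelta.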

The next step is to make sure we’re getting a timedelta whenever data is pulled from the database or assigned to the field. This is where the to_python() hook comes in. It makes it possible to transform the data from the database (or during assignment) into a friendlier Python object. Here is what it looks like:

syntax:python
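def to_python(self, value):
    # Treat only None and the empty string as "no value"; 0 is a valid duration.
    if value is None or value == '':
        return None
    if isinstance(value, datetime.timedelta):
        return value
    if isinstance(value, (int, long)):
        # Integers come from the database and represent seconds.
        return datetime.timedelta(seconds=value)
    if isinstance(value, basestring):
        # Strings come from forms and fixtures in MM:SS format.
        minutes, seconds = map(int, value.split(':'))
        return datetime.timedelta(seconds=(seconds + minutes*60))
    raise ValidationError('Unable to convert %s to timedelta.' % value)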
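
As a quick check of the conversion rules, here is a standalone Python 3 sketch of the same logic outside Django. The function name to_timedelta is made up for this example, ValueError stands in for Django's ValidationError, and str and int replace the Python 2 basestring and long.

```python
import datetime


def to_timedelta(value):
    """Standalone approximation of the to_python() logic (hypothetical helper)."""
    if value is None or value == "":
        return None
    if isinstance(value, datetime.timedelta):
        return value
    if isinstance(value, int):
        # Integers are treated as a number of seconds.
        return datetime.timedelta(seconds=value)
    if isinstance(value, str):
        # Strings are expected in MM:SS form.
        minutes, seconds = map(int, value.split(":"))
        return datetime.timedelta(seconds=seconds + minutes * 60)
    raise ValueError("Unable to convert %r to timedelta." % (value,))


print(to_timedelta(754))      # integer seconds, as read from the database
print(to_timedelta("12:34"))  # string, as typed into a form
```

Both calls describe the same duration (12 minutes and 34 seconds), so they produce equal timedelta objects.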

Now that the field is correctly returning timedelta objects from the database, the reverse needs to be implemented: making sure timedeltas are converted to integers for storage. This is done in the get_db_prep_value() method of the field:

syntax:python
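def get_db_prep_value(self, value):
    # NULL stays NULL; otherwise flatten days and seconds into whole seconds.
    if value is None:
        return None
    return value.seconds + (86400 * value.days)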

datetime.timedelta stores duration in days, seconds and microseconds, and the above method simply flattens out the days, converting them to seconds and returning an integer.
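
To make the arithmetic concrete, consider a duration of 1 day, 2 minutes and 5 seconds. This is just a worked example using the standard library; the helper name flatten is made up here.

```python
import datetime


def flatten(duration):
    """Collapse days and seconds into whole seconds (microseconds are dropped)."""
    return duration.seconds + 86400 * duration.days


duration = datetime.timedelta(days=1, minutes=2, seconds=5)

# timedelta normalizes its arguments: minutes are folded into .seconds.
print(duration.days, duration.seconds)  # 1 125
print(flatten(duration))                # 86525

# For whole-second durations this agrees with total_seconds(), which returns a float.
assert flatten(duration) == int(duration.total_seconds())
```

Dropping microseconds is fine for game clocks measured in minutes and seconds, but it would lose precision for anything finer.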

The serialization framework utilizes another method on the field, value_to_string(), which, as you would guess, converts the value of your field to a string. If you’re familiar with Django’s fixtures, you’ve seen these before. Converting the timedelta into minutes and seconds:

syntax:python
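def value_to_string(self, instance):
    duration = getattr(instance, self.name)
    if duration is None:
        return None
    # Include days so that long durations are not truncated.
    total_seconds = duration.seconds + (86400 * duration.days)
    minutes, seconds = divmod(total_seconds, 60)
    return "%02d:%02d" % (minutes, seconds)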
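
Here is a Django-free sketch of the formatting step, together with a round trip back through a parser. The names format_mmss and parse_mmss are hypothetical. The sketch works from the total number of seconds, so a duration longer than an hour simply shows more than 59 minutes.

```python
import datetime


def format_mmss(duration):
    """Render a timedelta as MM:SS, counting days and hours as minutes."""
    total = duration.days * 86400 + duration.seconds
    minutes, seconds = divmod(total, 60)
    return "%02d:%02d" % (minutes, seconds)


def parse_mmss(text):
    """Inverse of format_mmss for well-formed input."""
    minutes, seconds = map(int, text.split(":"))
    return datetime.timedelta(minutes=minutes, seconds=seconds)


half = datetime.timedelta(minutes=20)
print(format_mmss(half))                            # 20:00
print(format_mmss(datetime.timedelta(seconds=65)))  # 01:05

# Formatting and then parsing should give back the original duration.
assert parse_mmss(format_mmss(half)) == half
```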

The final step is to create a form field for the timedelta that will do some simple validation. This involves adding a formfield() method to the field class:

syntax:python
def formfield(self, form_class=SimpleDurationFormField, **kwargs):
    defaults = {"help_text": "Enter duration in the format: MM:SS"}
    defaults.update(kwargs)
    return form_class(**defaults)

Note that form_class is specified as SimpleDurationFormField. That hasn’t been defined yet, so here it is:

syntax:python
from django.forms.fields import CharField
from django.forms.util import ValidationError as FormValidationError

class SimpleDurationFormField(CharField):
    def __init__(self, *args, **kwargs):
        # Pass max_length to CharField so it is actually enforced.
        kwargs.setdefault('max_length', 10)
        super(SimpleDurationFormField, self).__init__(*args, **kwargs)

    def clean(self, value):
        # Run CharField's own validation (required, max_length) first.
        value = super(SimpleDurationFormField, self).clean(value)
        if value and len(value.split(':')) != 2:
            raise FormValidationError('Data entered must be in format MM:SS')
        return value
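
The check in clean() only counts colons, so input like "ab:cd" would pass the form and then fail later when to_python() tries to convert it to integers. If you want the form to catch that, a stricter check might look something like this standalone sketch. The function name validate_mmss is made up, ValueError stands in for the form ValidationError, and the rule that seconds must be below 60 is my own assumption rather than something the original field enforces.

```python
def validate_mmss(value):
    """Return the cleaned MM:SS string, or raise ValueError if it is malformed."""
    parts = value.strip().split(":")
    # Exactly two parts, each made of decimal digits only.
    if len(parts) != 2 or not all(part.isdecimal() for part in parts):
        raise ValueError("Data entered must be in format MM:SS")
    # Assumed rule: the seconds part behaves like a clock and stays below 60.
    if int(parts[1]) >= 60:
        raise ValueError("Seconds must be between 00 and 59")
    return ":".join(parts)


print(validate_mmss(" 12:34 "))  # surrounding whitespace is stripped

for bad in ("ab:cd", "1:2:3", "12:75", ":30"):
    try:
        validate_mmss(bad)
    except ValueError as exc:
        print("%r rejected: %s" % (bad, exc))
```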

It seems like a lot, but you may be able to get away with less – it all depends on how much you want your field to do for you. As usual, the docs are stellar, and provide a more in-depth discussion of some of the hooks available when building your own custom fields.

A more sophisticated example: tags

Like it or not, django-tagging delivers probably the best feature set for a reusable Django tagging app (keep an eye on django-taggit though). It is actually a hybrid of a reusable app with its own models and a reusable field that provides a gateway to the app’s functionality. Looking at the source code for the field, it appears that there’s a lot going on, but the “magic” behavior is implemented in just a handful of methods.

contribute_to_class()

syntax:python
def contribute_to_class(self, cls, name):
    super(TagField, self).contribute_to_class(cls, name)

    # Make this object the descriptor for field access.
    setattr(cls, self.name, self)

    # Save tags back to the database post-save
    signals.post_save.connect(self._save, cls, True)

    # Update tags from Tag objects post-init
    signals.post_init.connect(self._update, cls, True)

When the field is added to a model, and that model is loaded by Django, the field sets up two signals. The post_init signal is used to prepopulate the field, which acts like a CharField of comma-separated tags. There’s a bit of misdirection going on, but on line 103 of the source we see the tags actually being loaded from the database for the model instance. The post_save signal is the counterpart to post_init, and handles retrieving the attribute-cached tags and updating the database whenever a model instance is saved (see line 76 of the source).
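
The load-on-init, write-on-save pattern is easier to see in miniature. What follows is a toy, Django-free sketch: Signal, ToyTagField, Article and the TAG_TABLE dictionary are all invented for this illustration and are not django-tagging's or Django's real classes. It only mimics the flow described above.

```python
class Signal(object):
    """Tiny stand-in for a Django signal: a list of (receiver, sender) pairs."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver, sender):
        self.receivers.append((receiver, sender))

    def send(self, sender, instance):
        for receiver, wanted in self.receivers:
            if wanted is sender:
                receiver(instance=instance)


post_init = Signal()
post_save = Signal()
TAG_TABLE = {}  # hypothetical "database": primary key -> list of tag names


class ToyTagField(object):
    def contribute_to_class(self, cls, name):
        self.name = name
        post_init.connect(self._update, cls)
        post_save.connect(self._save, cls)

    def _update(self, instance):
        # Prepopulate the attribute with comma-separated tags from the "database".
        setattr(instance, self.name, ", ".join(TAG_TABLE.get(instance.pk, [])))

    def _save(self, instance):
        # Write the (possibly edited) string back as individual tags.
        raw = getattr(instance, self.name)
        TAG_TABLE[instance.pk] = [t.strip() for t in raw.split(",") if t.strip()]


class Article(object):
    def __init__(self, pk):
        self.pk = pk
        post_init.send(Article, self)

    def save(self):
        post_save.send(Article, self)


ToyTagField().contribute_to_class(Article, "tags")

article = Article(pk=1)
article.tags = "django, fields"
article.save()
print(Article(pk=1).tags)  # django, fields
```

The model itself never mentions tags; the field wires everything up when it is attached to the class, which is what makes it reusable.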

Everything else the tagging app does should be reasonably familiar (except all the crazy SQL, of course). django-taggit also acts like a field, but instead of hiding the magic behind a CharField, the tags are accessed like related objects. I recommend checking out the source; there are some cool ideas at work there.

Conclusion

I hope you found this post interesting and maybe useful. Any suggestions or ideas for improvements are appreciated.

Read full article at “charlesleifer.com: Entries tagged with "django"”
