Migrating from Django-Tagging to Taggit

When Bucketlist launched a year ago and I needed a good app to let users create a taxonomy for their life goals, django-tagging was the main contender, and that’s what we went with.

Django-tagging worked pretty well overall, but had one critical bug: because it only had a tag “name” field and no slug field, users could enter tags with slashes in them. Accessing lists of those tags would then generate a 500 error – a bad user experience, unclean, and I was getting tired of seeing the error reports. Unfortunately, django-tagging hasn’t been updated in quite a while, and it’s starting to look like abandonware.

At Djangocon 2010, buzz was that Alex Gaynor’s django-taggit was picking up the slack and becoming the go-to tagging library for Django. Unfortunately, Taggit provides no migration strategy to move your existing tag base over. I held off on migration hoping one would appear, then finally decided this week to try it myself. Thought I’d document the process for others in the same boat.

There are two possible routes you can take – you can either:

1) Temporarily keep both apps installed, then write a script or work in the ORM to loop through tags and tagged_items in the old system, copying their values to instances in the new one, or:

2) Modify the table structures of the tables created by django-tagging to exactly match those created by Taggit, then rename the tables. With this approach, no data is actually migrated – the old table structures are modified around the existing data.

Since the table structures between the two apps are fairly similar, I went for the second method.
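
For comparison, here is a rough sketch of what the first route (copying rows instead of reshaping tables) could look like. It is not what I ran: it uses Python's built-in sqlite3 with an in-memory database purely so it can run standalone, the sample rows are made up, and `copy_tags` and the placeholder `make_slug` callable are hypothetical names. The table and column layouts are assumed to match the ones discussed in this post, and a real migration would go through the Django ORM against your actual database.

```python
import sqlite3

def copy_tags(conn, make_slug):
    """Copy django-tagging rows into taggit-shaped tables, keeping the ids."""
    rows = conn.execute("SELECT id, name FROM tagging_tag").fetchall()
    for tag_id, name in rows:
        conn.execute(
            "INSERT INTO taggit_tag (id, name, slug) VALUES (?, ?, ?)",
            (tag_id, name, make_slug(name)),
        )
    # Tagged items carry over unchanged; columns are selected by name,
    # so the differing column order between the two tables does not matter.
    conn.execute(
        "INSERT INTO taggit_taggeditem (id, tag_id, object_id, content_type_id) "
        "SELECT id, tag_id, object_id, content_type_id FROM tagging_taggeditem"
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tagging_tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tagging_taggeditem (
        id INTEGER PRIMARY KEY, tag_id INTEGER,
        content_type_id INTEGER, object_id INTEGER);
    CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
    CREATE TABLE taggit_taggeditem (
        id INTEGER PRIMARY KEY, tag_id INTEGER,
        object_id INTEGER, content_type_id INTEGER);
    INSERT INTO tagging_tag VALUES (1, 'Sky Diving'), (2, 'Travel');
    INSERT INTO tagging_taggeditem VALUES (1, 1, 7, 100), (2, 2, 7, 101);
""")

# Placeholder slug function for the sketch only; real code would use Django's slugify.
copy_tags(conn, lambda name: name.lower().replace(" ", "-"))

migrated_tags = conn.execute(
    "SELECT id, name, slug FROM taggit_tag ORDER BY id").fetchall()
migrated_items = conn.execute(
    "SELECT tag_id, object_id, content_type_id FROM taggit_taggeditem ORDER BY id"
).fetchall()
```

The trade-off is that every row gets rewritten, which is slower on a big tag set but leaves the original django-tagging tables untouched if something goes wrong.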

We’ll assume your old tagged model uses the default field name “tags” and it’s going to stay that way across the migration.

Start by installing Taggit and the add-on Taggit-TemplateTags:

pip install django-taggit
pip install django-taggit-templatetags

Add taggit to INSTALLED_APPS and sync your db to get the new table structures.

Now remove django-tagging from INSTALLED_APPS and from all of your imports (in models, forms, admin.py), completely disabling it. For now, KEEP the django-tagging tables in your database.

Convert your model field to the following (with TaggableManager imported from taggit.managers):

tags = TaggableManager()

Since tags should always be optional, you’ll probably want to throw a “blank=True” on there. Use the docs to get django-taggit working everywhere it needs to (form input, cloud, tag listing on model instance pages, etc.). One obstacle I hit was the fact that Taggit’s cloud generator doesn’t let you specify a minimum number of occurrences before a tag is displayed. Since the tag set on Bucketlist is very large, I really needed to limit this. Unfortunately, the only way I was able to solve it was to use something like this in the template:

{% get_tagcloud as tags for 'bucket.item' %}
{% for tag in tags %}
  {% if tag.num_times > 5 %}
    {{ tag.name }}
  {% endif %}
{% endfor %}

I really don’t like that solution, since it means querying for the entire tag set and evaluating the occurrence number of each during the iteration – very inefficient. For now, I’ve papered over the problem by implementing Django caching on the tag cloud page.
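
One way around the template-side filtering would be to push the threshold into the query itself. The sketch below is not part of taggit-templatetags; it uses Python's built-in sqlite3 with an in-memory database, tables shaped like Taggit's (an assumption), made-up sample rows, and a hypothetical `popular_tags` function, just to show the GROUP BY / HAVING shape such a query might take.

```python
import sqlite3

def popular_tags(conn, min_count):
    """Return (name, num_times) for tags used more than min_count times."""
    query = """
        SELECT t.name, COUNT(ti.id) AS num_times
        FROM taggit_tag AS t
        JOIN taggit_taggeditem AS ti ON ti.tag_id = t.id
        GROUP BY t.id, t.name
        HAVING COUNT(ti.id) > ?
        ORDER BY num_times DESC, t.name
    """
    return conn.execute(query, (min_count,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
    CREATE TABLE taggit_taggeditem (
        id INTEGER PRIMARY KEY, tag_id INTEGER,
        object_id INTEGER, content_type_id INTEGER);
    INSERT INTO taggit_tag VALUES (1, 'travel', 'travel'), (2, 'music', 'music');
""")

# Hypothetical usage data: 'travel' is used 6 times, 'music' only twice.
usage = [(1, n, 7) for n in range(6)] + [(2, n, 7) for n in range(2)]
conn.executemany(
    "INSERT INTO taggit_taggeditem (tag_id, object_id, content_type_id) "
    "VALUES (?, ?, ?)",
    usage,
)

cloud = popular_tags(conn, 5)
```

With the filter in the HAVING clause, only the tags that clear the threshold ever leave the database, so the template just loops over a short list.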

The other significant difference here is that while django-tagging only provided a “name” field, Taggit offers both a name field and a slug field, so we need to key our URLs off the slug. This is both better and worse than the old way – better because we can potentially have tag names in mixed case, but worse because there’s no earthly reason why you’d want to have both “Music” and “music” tags on your site – these should be in one bucket, not two. Django-tagging has a “force-lowercase” setting option that handles this nicely – Taggit does not (I’ve opened a ticket on that).
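
Until Taggit grows an equivalent option, one workaround is to normalize tag names yourself before they are saved. This is a minimal sketch, assuming you have a hook (a form's clean method, say) where the raw names pass through; `normalize_tag_names` is a hypothetical helper, not something Taggit provides.

```python
def normalize_tag_names(raw_names):
    """Lowercase and strip tag names, dropping blanks and case-only duplicates."""
    seen = set()
    normalized = []
    for name in raw_names:
        cleaned = name.strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            normalized.append(cleaned)
    return normalized

tags = normalize_tag_names(["Music", "music ", "Sky Diving", ""])
```

Here "Music" and "music " end up in a single "music" bucket, and the order in which the user typed the remaining tags is preserved.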

Once you’ve got your forms wired up, you should be able to enter dummy tag data and confirm that everything’s working. Now the trick is to use your favorite database admin tool to make all the field names and properties on the old tables identical to the new ones. This worked for me:

ALTER TABLE `your_db_name`.`tagging_tag` ADD COLUMN `slug` VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT ''  AFTER `name`;

ALTER TABLE `your_db_name`.`tagging_tag` CHANGE COLUMN `name` `name` VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT ''  COMMENT ''  AFTER `id`;

ALTER TABLE `your_db_name`.`tagging_taggeditem` CHANGE COLUMN `object_id` `object_id` INT(11) UNSIGNED NOT NULL DEFAULT 0  COMMENT ''  AFTER `content_type_id`;

ALTER TABLE `your_db_name`.`tagging_taggeditem` CHANGE COLUMN `object_id` `object_id` INT(11) UNSIGNED NOT NULL DEFAULT 0  COMMENT ''  AFTER `tag_id`;

There’s one final critical bit – make sure the index on the taggit_tag table is set to “slug” and not “name”, as it will be if you’ve used this table renaming process. If you fail to do this, your site will hang (not crash) with a race condition when people try to enter tags that match an existing tag but are rendered in another case (e.g. Music and music). Use whatever db administration tool you use to edit the index and set it to name “slug” and column “slug”.

Now you can give Taggit’s tables a temporary name and rename the old django-tagging tables to the names that Taggit wants to use:

RENAME TABLE taggit_tag TO taggit_tag_new;
RENAME TABLE taggit_taggeditem TO taggit_taggeditem_new;
RENAME TABLE tagging_tag TO taggit_tag;
RENAME TABLE tagging_taggeditem TO taggit_taggeditem;

Because Taggit has both “name” and “slug” fields, you’ll need to use the ORM to back-fill values in the slug column. The easiest way is to import the slugify function that Django’s template filters use and run it over every tag from manage.py shell:

>>> from django.template.defaultfilters import slugify
>>> from taggit.models import Tag
>>> for tag in Tag.objects.all():
...     tag.slug = slugify(tag.name)
...     tag.save()
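
Before running the back-fill, it may be worth checking whether any existing tag names collapse to the same slug (“Music” and “music” will), since duplicate slugs are likely to be rejected if the slug column carries a unique index. The sketch below runs without Django: `approx_slugify` is my approximation of what Django's slugify does, not the real import, and `find_slug_collisions` is a hypothetical helper.

```python
import re
import unicodedata
from collections import defaultdict

def approx_slugify(value):
    """Approximation of Django's slugify filter (an assumption, not the real import)."""
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Drop anything that is not a word character, whitespace, or hyphen.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)

def find_slug_collisions(names):
    """Group tag names by slug and return only the slugs shared by several names."""
    by_slug = defaultdict(list)
    for name in names:
        by_slug[approx_slugify(name)].append(name)
    return {slug: group for slug, group in by_slug.items() if len(group) > 1}

collisions = find_slug_collisions(["Music", "music", "rock/pop", "Sky Diving"])
```

Any names that show up in the result would need to be merged or renamed by hand before the loop above can save them cleanly. Note that the slash in "rock/pop" is stripped out entirely, which is what makes slugs safe to use in URLs.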

You should be good to go!

There’s one other small modification I made – I needed the ability to search through Bucketlist’s 6000+ tags easily, but neither django-tagging nor Taggit provides search in the admin, and I didn’t want to modify their source code. The trick is to unregister the Tag class in your admin.py, write a dinky little replacement admin class, and associate Taggit’s model with your own custom class. So my app’s admin.py now uses this:

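from django.contrib import admin
from taggit.models import Tag, TaggedItem

class MyTagAdmin(admin.ModelAdmin):
    list_display = ["name", "slug"]
    search_fields = ("name", "slug")

# Override Taggit's admin: drop its registration, then register our own class
admin.site.unregister(Tag)
admin.site.register(Tag, MyTagAdmin)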

Bingo – you can now search through tags in the admin, without having to hack Taggit.
