Models

Models are the core of any Django project. But any sufficiently mature Django project is very likely to be suffering from model layer bloat. This is a direct result of the unspoken default answer to the question of where to put business logic in a Django project: you're supposed to put it in the model layer, usually as methods on model classes. This, I'll argue, is a huge mistake.

Monster models

Any real-world application tends to have core models and ancillary models. Core models are those directly related to the domain: in a bookstore application, core models might include Book, Author, Order, Customer. Ancillary models are those that are tangentially related to the domain, or those that are related to the business of being a web application: ReviewRating, Promotion, LoginAttempt, PermissionGroup.

In most codebases, a small subset of the core models ends up containing a large proportion of the application's business logic. This makes sense if you consider that any software application is usually about something: sports management software might be about players and teams, project management software might be about projects and tasks, and so on. It's understandable that the majority of the business logic in such applications would be associated with a small set of these core concepts.

When business logic is put in model methods, models (particularly core models) tend to grow uncontrollably over time. As your application evolves, these models accumulate more and more functionality and become massive files that are difficult to comprehend and navigate. Developers struggle to understand a model's full behaviour or to make changes confidently. These "monster models" become a significant maintenance burden, as working with them requires loading a substantial mental model of all their interconnected behaviours.

Append-only

Once a model class reaches a threshold of size and complexity, it tends to become append-only: new functionality is continually added but rarely removed. There's so much code in the model that a developer trying to add functionality to the application has no way of knowing (without reading thousands of lines of code) whether something like what they're trying to achieve already exists, so they tend towards the safe option of just adding new code that does exactly what they need.

The codebase ends up with multiple implementations of essentially the same functionality, often with slightly different behaviour, and usually with inconsistent names. This redundancy not only bloats the codebase but also creates confusion when other parts of the application use subtly different versions of the same logic.

Too many responsibilities

Django's model layer provides a comprehensive abstraction over relational database concepts. This is a complicated thing to do! By subclassing Model and adding your own business logic, you're making an already complicated thing even more complicated, in an orthogonal direction to its existing complexity.

Methods? Properties? Reads? Writes?

In real-world projects, it's very common to see confusion about what language features should be used to expose the business logic in a model.

If a method implements some derived state and returns a value, should it be decorated with @property to make it "look like" a model field? If a method changes the model's state, should it call self.save() or leave that to the caller? How would a caller know whether a method implements an unsafe state-changing operation, or a safe data-reading operation? When should custom table-level or query-level behaviour live in a queryset and when should it live in a manager? Where should logic that manipulates multiple types of objects live?
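
To make the ambiguity concrete, here's a deliberately framework-free sketch. Order, its attributes and the save() call counter are hypothetical stand-ins for a Django model, invented purely for illustration. The point is that the three members below look equally innocent at the call site, yet one is a computed read, one mutates and persists, and one mutates and leaves persistence to the caller.

```python
class Order:
    """Hypothetical stand-in for a model; save() only counts calls."""

    def __init__(self, lines):
        self.lines = lines  # list of (unit_price, quantity) pairs
        self.status = "open"
        self.save_calls = 0

    def save(self):
        # A real model would write to the database here.
        self.save_calls += 1

    @property
    def total(self):
        # Reads like a stored field, but is recomputed on every access.
        return sum(price * quantity for price, quantity in self.lines)

    def cancel(self):
        # Changes state AND persists it.
        self.status = "cancelled"
        self.save()

    def mark_shipped(self):
        # Changes state but expects the caller to save.
        self.status = "shipped"
```

Nothing in the names total, cancel() or mark_shipped() tells a caller which of these conventions applies; they have to read the implementation to find out.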

Different developers will have different answers to these questions, and even with the best intentions, inconsistencies will appear.

Security

When business logic resides in model methods, the entire API surface of the model is available to every part of the application that imports the model. There's no way to restrict access based on the context in which a model is being used. There's nothing preventing a developer from calling missile.launch() in a part of the codebase accessible to low-privilege users. This lack of encapsulation and access control makes it difficult to enforce security boundaries within your application.

Performance

Perhaps most importantly, putting behaviour in model methods is a gateway to terrible performance.

Much of the functionality in any piece of software requires navigating relationships between models. If your business logic is in the models themselves, it's tempting to navigate these relationships inside the model method. missile.time_to_launch() might perform a complex query against the LaunchSchedule model. If you're dealing with a single missile this might be fine, but once you're rendering a table of missiles, you've created an N+1 query problem.
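
To see the shape of the problem without any Django machinery, here's a minimal sketch using the standard library's sqlite3 module. The table layout, the run helper, the executed list and the sample rows are all assumptions made up for illustration; only the names Missile, LaunchSchedule and time_to_launch come from the example above. It builds the same table twice: once with a per-row lookup, and once with a single joined query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE missile (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE launch_schedule (missile_id INTEGER, launch_at TEXT);
    INSERT INTO missile VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO launch_schedule VALUES
        (1, '2030-01-01'), (2, '2030-02-01'), (3, '2030-03-01');
    """
)

executed = []  # every statement sent through run(), so queries can be counted


def run(sql, params=()):
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def time_to_launch(missile_id):
    # The "model method": one query per missile it is called for.
    rows = run(
        "SELECT MIN(launch_at) FROM launch_schedule WHERE missile_id = ?",
        (missile_id,),
    )
    return rows[0][0]


def table_with_per_row_queries():
    # One query for the list, then one more for each row: N+1 in total.
    missiles = run("SELECT id, name FROM missile ORDER BY id")
    return [(name, time_to_launch(missile_id)) for missile_id, name in missiles]


def table_with_one_query():
    # The same table built by a single joined, grouped query.
    return run(
        """
        SELECT m.name, MIN(s.launch_at)
        FROM missile m
        LEFT JOIN launch_schedule s ON s.missile_id = m.id
        GROUP BY m.id
        ORDER BY m.id
        """
    )
```

With three missiles, the per-row version records four statements in executed and the joined version records one; the gap grows linearly with the number of rows rendered.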

In the real world, complex business requirements often require navigating relationships across multiple models, interleaved with business logic written in Python. Once these patterns have been established in a codebase, they can be extremely difficult to optimise away with the usual select_related and prefetch_related.

What to do instead

Keep models minimal. Think of model classes as Python definitions of database tables.

A model should consist of a set of field declarations, a Meta class for configuration if necessary, and a __str__ method to provide a friendly string representation in the shell (the implementation of __str__ must never traverse a relationship to another model). Field choices should usually be defined as enum classes inline in the model class body, accessed via models.SomeModel.SomeChoices.SOME_VALUE (unless they need to be shared between models).

Inheritance

Models should never inherit from anything other than Django's Model. Reading the field definitions of a model should completely describe the structure of the table it represents: you should not need to go and read some other mixin definition in another file to understand where a bunch of extra fields or behaviour came from. Reject any third-party dependency that requires you to change the base class of your models.

Common fields and relationships

If you have common fields that you want to exist on every model in your application (id as a UUIDField, created_at, updated_at, etc.), it's better to declare them explicitly on every model than to inherit them from a shared base class, and to implement a Django system check to ensure that they aren't forgotten. While you're at it, add a check to ensure that every ManyToManyField has an explicit through= argument that points to a model that exists in your codebase, not an autogenerated relationship model as is the default. This means you can add your common fields to the relationship model too.

Structure

As discussed in the page on project structure, models should all go inside a single app called data inside your project package, with one file per model. Import all the models into project/data/models/__init__.py. Elsewhere in your codebase, the data layer should always be imported as from project.data import models.

One tedious downside of this single-app approach is that all of your database tables will end up being named data_somemodel and data_someothermodel. This is because Django, by default, prefixes each table name with the label of the app the model belongs to. This isn't a dealbreaker, and if you can live with it, no problem. If you can't, there's a somewhat-magical-but-straightforward solution. Drop the code below into project/data/__init__.py and never think about it again:

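import re

from django.db.models.signals import class_prepared
from django.dispatch import receiver


@receiver(class_prepared)
def remove_table_prefix(sender, **kwargs):
    # class_prepared fires once for each model class as Django finishes
    # building it, so this strips a leading "data_" from its table name.
    sender._meta.db_table = re.sub(r"^data_", "", sender._meta.db_table)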
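
To see what the substitution in that receiver does, here it is applied on its own to a few made-up table names. The helper name strip_data_prefix is hypothetical. Because the pattern is anchored with ^, only a leading data_ is removed, and only once.

```python
import re


def strip_data_prefix(table_name):
    # Same pattern as the receiver: remove "data_" only at the start of the name.
    return re.sub(r"^data_", "", table_name)


examples = ["data_book", "metadata_entry", "data_data_point"]
stripped = [strip_data_prefix(name) for name in examples]
```

A name such as metadata_entry is left alone because data_ does not appear at the start, while data_data_point loses only its first prefix.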

Business logic

All of the above leaves one obvious question: if models should be little more than table gateways, where should custom business logic go? We'll get to that soon.

Summary

The commonly accepted convention of putting the majority of an application's business logic into the model layer (usually in model methods) can lead to a host of problems including poor code organisation, security issues, and catastrophic performance. Instead, model classes should be kept minimal and seen as lightweight declarative definitions of their underlying database tables.

Sources

How to structure Django projects by James Beith.