Moving parts
"Moving parts" is a term borrowed from mechanical engineering, describing the components of a machine that rotate, reciprocate, expand, hinge, or otherwise move against each other. Machines with fewer moving parts are generally more durable and longer-lasting because of reduced wear and tear, and also tend to be easier to make and easier to repair.
In a software setting, I’m using the term "moving parts" to mean the individual running pieces of the overall system architecture. What are the processes that need to be executing on one or more computers somewhere for the system to operate?
Almost all Django applications have a web server process, of which there may be one instance or many thousands. It serves HTTP traffic coming from the Internet, usually returning either JSON or HTML responses.
Almost all Django applications will be backed by some sort of relational database, whether it’s PostgreSQL, MySQL, or SQLite.
Finally, many applications will have one or more background processes: code that runs outside the context of an HTTP request/response cycle. This could be in the form of asynchronous background jobs, usually run on a queue of some sort, or scheduled tasks that run at particular times throughout the day, week, or month.
It's probably not helpful for me to be prescriptive about how you deploy your application. There are so many possibilities in terms of hosting platforms and execution environments that providing specific examples will almost certainly not be relevant to most readers. So whether you use Heroku or Fly.io or one of AWS's many offerings or run your own cluster, the advice here still applies.
The purpose of this chapter is simply to encourage you, as in a mechanical system, to keep your software system running with as few moving parts as possible. This is another argument for simplicity, but simplicity is particularly important here: in an unscientific sense, adding components to your system doesn't increase its complexity linearly with the number of components; it tends to increase it exponentially (in that hard-to-define cognitive-load way).
Microservices
The first topic to address is that of microservices vs monoliths. It'll probably come as no surprise that I would argue in the strongest terms possible to favour a monolithic architecture over one composed of multiple distributed components that communicate over a network. I won't go into depth here: much has been written on the topic, and there are many angles to consider: complexity, data consistency, organisational responsibility, and more. All I'll say, having worked on both monolithic and distributed Django applications, is that in my experience, monoliths are far cheaper to build, far easier to maintain, and far easier to scale.
Caching
From reading books, blog posts and documentation about scaling Django, you might be fooled into thinking that it is absolutely essential to have some sort of caching system in a Django application. I'm here to tell you that it isn’t.
Yes: if you're running a Django application that serves HTML content to non-logged-in users (in other words, a public website) then caching tends to be a no-brainer. It's usually straightforward to implement a policy where HTML is cached on the way out of the application and the cache is simply wiped whenever any change is made. This often makes a massive difference to low-to-medium-traffic websites: assembling a webpage from a content management system is query-intensive and complex, and without a cache that work is repeated identically for every user on every page load.
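The cache-everything, wipe-on-any-change policy is simple enough to sketch in a few lines. This is a toy illustration in plain Python, not Django's cache API (a real application would use `django.core.cache`, e.g. `cache.set` and `cache.clear`); the class and function names are made up:

```python
# Toy illustration of "cache HTML on the way out, wipe on any change".

class WipeOnWriteCache:
    """Cache every rendered page; invalidate everything on any content change."""

    def __init__(self):
        self._pages = {}

    def get_page(self, path, render):
        # Serve the cached copy if we have one; otherwise render and store it.
        if path not in self._pages:
            self._pages[path] = render(path)
        return self._pages[path]

    def content_changed(self):
        # Wipe the whole cache on any edit. Crude, but trivially correct:
        # no stale page can ever be served.
        self._pages.clear()


renders = []

def render(path):
    renders.append(path)
    return f"<html>{path}</html>"

cache = WipeOnWriteCache()
cache.get_page("/about/", render)
cache.get_page("/about/", render)   # second request served from cache
assert renders == ["/about/"]       # only one render so far

cache.content_changed()             # an editor saves a change
cache.get_page("/about/", render)   # re-rendered on the next request
assert renders == ["/about/", "/about/"]
```

The appeal of this policy is that invalidation, the famously hard part, is reduced to "wipe everything", which is affordable precisely because public pages are identical for every visitor.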
In almost any other setting, particularly feature-rich multi-user web applications, the time engineers spend figuring out when and how to invalidate the cache would be better spent just making the database queries faster. Web applications with caching usually suffer from stale data, which confuses users more than almost anything else. Making an edit to some data and not seeing that edit reflected immediately is an inexcusably awful user experience.
Of course, in the case of extremely high-scale websites and applications, the above likely doesn't apply. But those sorts of applications are so vanishingly rare that any given engineer reading this book is highly unlikely to be working on one. Following advice intended for applications serving millions of requests per second on your project that receives tens of requests per second (or even hundreds, or even thousands) is a recipe for misery.
Even in an application that would benefit significantly from a cache, it's not necessarily the case that adding a cache means adding a moving part. Although the obvious choice would be to use a separate data store (often Redis) as your cache server, this creates another point of failure in your system, another piece of infrastructure that needs to be deployed, maintained and monitored, and another source of configuration that needs to be managed. Django comes with a production-ready database cache backend that stores your cache in a separate table in your existing database server. Databases tend to be extremely fast at key lookups these days, so for medium-to-high traffic sites, this approach is likely to be just fine.
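Configuring Django's database cache backend is a couple of lines of settings. The table name is arbitrary; you create it once with the `createcachetable` management command:

```python
# settings.py -- cache in the existing database rather than a separate server.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "django_cache",  # arbitrary table name, created below
    }
}

# Then, once per environment:
#   python manage.py createcachetable
```

With this in place, the rest of your code uses the standard `django.core.cache` API unchanged, so you can swap in a dedicated cache server later without touching application code.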
Queues
A good rule of thumb is that a request to your application should never result in a synchronous request to another application that you don't control the latency of (in other words, a third-party system). If you do this, network or performance issues in the third-party system are guaranteed to bring your system to its knees, sooner or later. Instead, requests to external systems (and any other sort of indeterminate-duration work) should be delegated to a background process, via a queue.
Inserting a queue between critical infrastructure components creates an asynchronous boundary that provides temporal decoupling and latency isolation. The queue absorbs spikes and variability in the external service, keeps your request path short and predictable, and gives you a controlled place to apply concurrency limits, retries with backoff, and load shedding if needed. When the third party slows down, the backlog grows in the queue rather than in your web workers, so the rest of your system stays responsive.
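The shape of that boundary can be sketched in a few lines of plain Python, with an in-memory queue standing in for a real broker. The handler, job and third-party names here are all hypothetical:

```python
# Minimal sketch of the asynchronous boundary described above: the request
# path only records a job; the slow third-party call happens in a worker.
from collections import deque

queue = deque()

def handle_signup_request(email):
    # The web request path does no slow work: it just enqueues the job,
    # so its latency is independent of the third party's.
    queue.append(("send_welcome_email", email))
    return "202 Accepted"

def slow_third_party_email_api(email):
    # Imagine this sometimes takes seconds, or fails outright.
    return f"sent to {email}"

def run_worker_once():
    # The worker drains the backlog at its own pace; retries, backoff and
    # concurrency limits would all live here, not in the request path.
    sent = []
    while queue:
        _job_name, email = queue.popleft()
        sent.append(slow_third_party_email_api(email))
    return sent

assert handle_signup_request("a@example.com") == "202 Accepted"
assert handle_signup_request("b@example.com") == "202 Accepted"
assert len(queue) == 2   # the backlog lives in the queue, not in web workers
assert run_worker_once() == ["sent to a@example.com", "sent to b@example.com"]
```

If the third party stalls, `handle_signup_request` is unaffected: the queue simply grows until the worker catches up.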
While adding a task queue to an application is generally a complexity price worth paying, we still have to be mindful of our moving parts. Most queueing systems naturally add at least two moving parts: the queue worker process itself, and the broker (the server that holds the state of the queue). A separate queue worker process is necessary here, but a separate broker often isn't: again, your existing database is probably just fine. Consider something like django-db-queue (disclaimer: I'm the author of that package) or django-tasks using the DatabaseBackend, which will likely soon be merged into Django itself.
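As a sketch of what the broker-free approach looks like with django-tasks: the backend is a couple of lines of settings, and the existing database acts as the broker. The API shown in the comments reflects the package's documentation at the time of writing and may well change, particularly if the package is merged into Django, so check the project's README before relying on it:

```python
# settings.py -- database-backed task backend: no separate broker server.
TASKS = {
    "default": {
        "BACKEND": "django_tasks.backends.database.DatabaseBackend",
    }
}

# Defining and enqueueing a task then looks roughly like this
# (the task name is illustrative):
#
#   from django_tasks import task
#
#   @task()
#   def send_receipt(order_id):
#       ...
#
#   send_receipt.enqueue(order_id=123)
#
# A separate worker process, run via a management command, picks jobs
# out of the database table and executes them.
```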
Scheduling
Some applications have processes that need to run at particular times of the day. How this is achieved will depend largely on how you're hosting the application. If it's on a bare-metal server, the obvious approach would be to use cron. On Heroku, you might want to use Heroku Scheduler, or if you need total platform independence you might choose to run your own clock process using something like APScheduler.
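On a bare-metal server, the cron route usually amounts to one crontab line per scheduled job, each invoking a Django management command. The paths and the command name (`send_daily_digest`) below are made up; adjust them for your virtualenv and project layout:

```shell
# crontab entry: run a management command at 06:00 every day.
# Fields: minute hour day-of-month month day-of-week command
0 6 * * * /srv/app/venv/bin/python /srv/app/manage.py send_daily_digest >> /var/log/app/cron.log 2>&1
```

Redirecting output to a log file (or your logging infrastructure) matters more than it looks: cron failures are otherwise silent.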
Whichever way you go, it's usually sensible to break up heavyweight bulk tasks (like "email all users") into multiple smaller tasks ("email one individual user") and add them to your task queue at low priority, allowing the queue worker to chew through them in its own time, and supporting easier parallelisation if needed.
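The fan-out pattern can be sketched in plain Python, with a deque standing in for the real low-priority queue. The names are hypothetical; a real app would enqueue via its task queue rather than an in-memory structure:

```python
# Toy sketch of fanning out a bulk job ("email all users") into one small,
# independently retryable job per user.
from collections import deque

low_priority_queue = deque()

def email_all_users(user_ids):
    # The bulk task does no real work itself: it just enqueues one small
    # task per user, so a single failure affects a single user.
    for user_id in user_ids:
        low_priority_queue.append(("email_one_user", user_id))

def email_one_user(user_id):
    return f"emailed user {user_id}"

email_all_users([1, 2, 3])
assert len(low_priority_queue) == 3

# The worker chews through the small jobs in its own time; several workers
# could drain the same queue in parallel.
results = [email_one_user(uid) for _name, uid in low_priority_queue]
assert results == ["emailed user 1", "emailed user 2", "emailed user 3"]
```

Besides parallelisation, the per-user granularity means a crash halfway through doesn't force you to re-run (or worse, double-send) the whole batch.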
Summary
Taking on additional system components (processes and/or stateful backing services) tends to increase the overall complexity of your architecture exponentially, not linearly. Add them only with great care and only when absolutely necessary, and lean on your existing relational database as much as possible.