Fixing Slow Django APIs in Production: A Real-World Performance Audit Playbook
When a Django API is slow in production, the cause is almost never “Django is slow.”
In real SaaS systems, latency comes from compounding issues: ORM misuse, inefficient serialization, missing database indexes, and infrastructure misalignment. Tutorials that focus on micro-optimizations miss the actual problem.
This article walks through the exact performance audit process we use at Particel Agency when production Django systems are slow under real traffic.
This is not a toy example. This is how we diagnose and fix systems that are already live, already generating revenue, and already under pressure.
The Production Context
The system (details anonymized):
-
B2B SaaS platform
-
Django + Django REST Framework
-
PostgreSQL
-
~300k monthly users
-
p95 API latency: ~2.3 seconds
-
Intermittent timeouts during traffic spikes
-
CPU usage looked “fine” (misleading symptom)
The team assumed they needed Redis, more servers, or a rewrite.
They didn’t.
Step 1: Measure Reality Before Touching Code
Most performance failures begin with premature optimization.
We do not start by refactoring code or adding caching.
We start by answering one question:
Where is the time actually going?
Tooling We Use
-
Django Debug Toolbar (staging only)
-
django-silk or OpenTelemetry
-
PostgreSQL pg_stat_statements
-
Nginx access logs
-
Controlled load testing (Locust or k6)
Metrics That Matter
-
p50 / p95 / p99 latency
-
Query count per request
-
Slowest SQL queries
-
Database time vs Python/serialization time
If you cannot explain latency with numbers, you are guessing.
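As a minimal sketch of turning raw request timings into the percentiles listed above, the snippet below uses only the standard library. The function name `latency_percentiles` and the sample values are hypothetical; in practice the samples would come from access logs or tracing data.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95 and p99 for a list of request latencies in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Hypothetical samples: mostly fast requests with a slow tail.
samples = [120] * 90 + [900] * 8 + [2300] * 2
report = latency_percentiles(samples)
mean_ms = statistics.mean(samples)

print(report, mean_ms)
```

With these made-up samples the mean is 226 ms while p99 is 2300 ms, which is why the audit tracks tail percentiles rather than averages.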
Step 2: Detect ORM Abuse (The Most Common Root Cause)
The slowest endpoint looked harmless:
But under profiling:
-
One request triggered 147 SQL queries
-
Nested serializers caused implicit queries
-
SerializerMethodField accessed relations per object
The Fix
Serializer cleanup:
-
Removed DB-backed SerializerMethodField
-
Replaced with annotated fields or preloaded relations
Result
-
Queries per request: 147 → 6
-
p95 latency: 2.3s → ~480ms
This step alone accounted for most of the total improvement: roughly a 79% reduction in p95 latency.
Step 3: PostgreSQL Indexes Based on Reality, Not Assumptions
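Django creates indexes for primary keys, unique fields, and ForeignKey columns by default, but migrations know nothing about your query patterns. Composite indexes and indexes that support ordering are up to you.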
We inspect real query plans, not model definitions.
What We Look For
Red flags:
-
Sequential scans on large tables
-
Filters on unindexed columns, including foreign keys declared with db_index=False
-
ORDER BY on non-indexed columns
Real Fix Example
Indexes must reflect actual query patterns, not abstract design.
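To illustrate what "inspecting the query plan" means, the sketch below compares the plan for one query before and after adding a composite index. It is an assumption-laden stand-in: the `orders` table, its columns, and the index name are hypothetical, and SQLite's `EXPLAIN QUERY PLAN` is used so the script runs anywhere. On PostgreSQL the equivalent check is `EXPLAIN (ANALYZE, BUFFERS)`, and in Django the index would typically be declared with `models.Index(fields=[...])` in the model's `Meta.indexes`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, created_at TEXT)"
)
# 1000 rows spread across 50 hypothetical customers.
conn.executemany(
    "INSERT INTO orders (customer_id, created_at) VALUES (?, ?)",
    [(i % 50, f"2024-01-{(i % 28) + 1:02d}") for i in range(1000)],
)

QUERY = "SELECT id FROM orders WHERE customer_id = ? ORDER BY created_at DESC LIMIT 20"


def plan(connection):
    # Each row of EXPLAIN QUERY PLAN ends with a human-readable detail string.
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (7,)).fetchall()
    return " | ".join(row[-1] for row in rows)


before = plan(conn)
# The index matches the query shape: filter column first, then sort column.
conn.execute("CREATE INDEX orders_customer_created ON orders (customer_id, created_at)")
after = plan(conn)

print("before:", before)
print("after: ", after)
```

The first plan reports a full scan of the table; the second reports a search through the new index. The column order in the index follows the filter and then the sort of the actual query, which is the point of indexing from observed query patterns.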
Step 4: Serialization Cost Is Not Free
In production APIs, Django REST Framework serialization often rivals database time.
We measure:
-
DB time vs Python execution time
-
JSON rendering cost
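One way to separate these costs is a small timing helper wrapped around each phase of a request. The sketch below is a hypothetical illustration: `phase`, `timings`, and `fetch_rows` are made-up names, and `fetch_rows` is a placeholder for the real ORM call.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # seconds spent per named phase


@contextmanager
def phase(name):
    """Accumulate wall-clock time spent inside the block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start


def fetch_rows():
    # Placeholder for the ORM call; a real view would evaluate a queryset here.
    return [{"id": i, "total": i * 10} for i in range(1000)]


with phase("db"):
    rows = fetch_rows()
with phase("serialize"):
    payload = [{"id": r["id"], "total": r["total"]} for r in rows]
with phase("render"):
    body = json.dumps(payload)

breakdown_ms = {name: round(seconds * 1000, 3) for name, seconds in timings.items()}
print(breakdown_ms)
```

Because Django querysets are lazy, the "db" phase has to force evaluation (for example with `list(queryset)`); otherwise the query runs during serialization and its cost is attributed to the wrong phase.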
Optimization Patterns
-
Use .only() or .values() for read-heavy endpoints
-
Replace DRF serializers on hot paths with:
-
Lightweight custom serializers
-
Manual dict construction
-
orjson for JSON rendering
This frequently reduces latency by 20–40% on high-traffic endpoints.
Step 5: Caching Comes Last (Not First)
Caching broken queries hides problems instead of solving them.
We cache only after the fundamentals are correct.
What We Cache
-
Read-heavy, low-volatility endpoints
-
Computed aggregates
-
Permission-safe responses
Example:
What We Never Cache
-
Inefficient queries
-
Highly personalized data
-
Writes disguised as reads
Caching is a multiplier, not a fix.
Step 6: Infrastructure Sanity Check
Even well-optimized code fails with misconfigured infrastructure.
Common issues we find:
-
Too many Gunicorn workers exhausting DB connections
-
No connection pooling
-
Missing query timeouts
-
No slow-query alerts
Baseline Setup
-
Gunicorn workers: (2 × CPU) + 1
-
PgBouncer for connection pooling
-
PostgreSQL statement_timeout
-
Proper health checks and monitoring
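The worker formula and its effect on database connections can be sketched as plain arithmetic. The helper names, the 30-second timeout, and the fleet size below are assumptions for illustration; `workers` and `timeout` are the setting names a `gunicorn.conf.py` file would define.

```python
import multiprocessing


def gunicorn_workers(cpu_count):
    """Baseline from the list above: (2 x CPU) + 1."""
    return 2 * cpu_count + 1


def worst_case_db_connections(app_instances, workers_per_instance, threads_per_worker=1):
    # With a single database alias, each worker thread can hold one
    # Django database connection at a time.
    return app_instances * workers_per_instance * threads_per_worker


# Values a gunicorn.conf.py would define; the timeout is an assumed example.
workers = gunicorn_workers(multiprocessing.cpu_count())
timeout = 30

# Hypothetical fleet: 6 app instances with 4 CPUs each.
needed = worst_case_db_connections(
    app_instances=6, workers_per_instance=gunicorn_workers(4)
)
print(workers, needed)
```

In this hypothetical fleet the application can open up to 54 connections, already more than half of PostgreSQL's default `max_connections` of 100. Scaling out a little further exhausts the limit, which is the case PgBouncer is there to absorb.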
Django performance does not stop at application code.
Final Results
After applying the audit:
-
p95 latency: 2.3s → ~410ms
-
Database CPU usage: ↓ 35%
-
Timeouts under load: eliminated
-
Infrastructure cost: unchanged
No rewrite. No new servers. No premature caching.
Just disciplined engineering.
When This Process Is Not Enough
If performance is still unacceptable after this:
-
The domain model may be wrong
-
Async tasks may be misused
-
Or the architecture no longer fits the scale
That’s when deeper refactors or architectural changes are justified.
But most systems never reach that point because they fail at the basics.
How We Use This in Practice
This exact process is how we run production performance audits for Django-based SaaS platforms.
We apply it as a fixed-scope engagement focused on:
-
Identifying real bottlenecks
-
Implementing high-impact fixes
-
Providing a clear performance roadmap
If your Django API feels “mysteriously slow,” this process usually surfaces the cause within days.
About Particel Agency
Particel Agency specializes in performance, scalability, and security for Django-based SaaS platforms.
We work on systems that are already live, already growing, and already under real load.