Every API eventually needs to return more data than fits in a single response. How you handle that pagination affects performance, reliability, and developer experience.
Let’s look at the common patterns, their tradeoffs, and when to use each.
The Three Main Approaches
1. Offset Pagination
The classic approach: skip N records, return M records.
Implementation:
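Assuming a hypothetical `items` table sorted by creation time, an offset query looks something like:

```sql
-- Page 3 with 20 items per page: skip 40 rows, return the next 20.
SELECT id, title, created_at
FROM items
ORDER BY created_at DESC
LIMIT 20 OFFSET 40;
```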
Response:
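A typical response body (field names here are illustrative, not a standard):

```json
{
  "data": [ ],
  "page": 3,
  "per_page": 20,
  "total": 1042,
  "total_pages": 53
}
```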
Pros:
- Simple to understand and implement
- Clients can jump to any page directly
- Easy to show “Page X of Y” UI
Cons:
- Performance degrades with large offsets (database must scan and skip rows)
- Inconsistent results if data changes between requests
- `OFFSET 100000` is painfully slow
When offset fails:
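The same query with a deep offset forces the database to produce and throw away 100,000 rows before returning anything — the cost grows linearly with the offset:

```sql
-- Reads and discards 100,000 rows before the first row is returned,
-- even when the ORDER BY column is indexed.
SELECT id, title, created_at
FROM items
ORDER BY created_at DESC
LIMIT 20 OFFSET 100000;
```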
2. Cursor Pagination
Return an opaque cursor pointing to the last item. Next request starts from there.
Implementation:
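A sketch, again against the hypothetical `items` table. The server decodes the cursor into the sort value of the last item it returned and seeks past it (tie-breaking for duplicate timestamps is covered under Multi-Column Sorting below):

```sql
-- The opaque cursor decodes to the created_at of the last item returned.
SELECT id, title, created_at
FROM items
WHERE created_at < '2024-01-15T10:00:00Z'
ORDER BY created_at DESC
LIMIT 20;
```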
Response:
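The response shape might look like this (the cursor token and field names are illustrative):

```json
{
  "data": [ ],
  "next_cursor": "eyJ2IjoxLCJpZCI6MTIzNDV9",
  "has_more": true
}
```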
Pros:
- Consistent performance regardless of position in dataset
- Stable results even if data changes (no missed/duplicate items)
- Works well with infinite scroll
Cons:
- Can’t jump to arbitrary pages
- More complex to implement
- Cursor must encode enough state to resume
3. Keyset Pagination (Seek Method)
Like cursor pagination, but with explicit values instead of opaque tokens.
Implementation:
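A sketch that seeks on the primary key directly; the parameter name `after_id` is illustrative:

```sql
-- GET /items?after_id=12345&limit=20
-- The client passes the last id it saw; the database seeks straight to it.
SELECT id, title, created_at
FROM items
WHERE id < 12345
ORDER BY id DESC
LIMIT 20;
```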
Pros:
- Same performance benefits as cursor pagination
- Transparent to clients (no opaque tokens)
- Bookmarkable URLs
Cons:
- Exposes internal IDs
- Complex for multi-column sorting
- Clients must understand the keyset structure
Multi-Column Sorting
The tricky case: paginating when sorting by non-unique columns.
Problem: if multiple items have the same created_at, you can’t use it alone as a cursor.
Solution: Compound cursor
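A sketch using a row-value comparison, supported by PostgreSQL and MySQL; databases without it need the expanded `OR` form shown in the comment:

```sql
-- Tie-break on the unique id so rows with equal created_at
-- are still returned in a stable, gap-free order.
SELECT id, title, created_at
FROM items
WHERE (created_at, id) < ('2024-01-15T10:00:00Z', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Expanded form for databases without row-value comparison:
-- WHERE created_at < '2024-01-15T10:00:00Z'
--    OR (created_at = '2024-01-15T10:00:00Z' AND id < 12345)
```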
The cursor encodes both values. This ensures stable pagination even with duplicate sort values.
Handling Deletions and Insertions
The Offset Problem
With offset pagination, data changes between requests cause issues. If a row on an earlier page is deleted while a client is paging, every subsequent row shifts up by one, and the first item of the next page is silently skipped.
Or with insertions: a new row pushes everything down, so the last item of the previous page shows up again as a duplicate at the top of the next one.
The Cursor Solution
Cursor pagination handles this gracefully: the cursor marks a position in the data itself (“everything after item X”) rather than a count of rows to skip, so insertions and deletions elsewhere in the list cannot shift the window. The client always resumes exactly after the last item it saw.
Performance Optimization
Index Your Sort Columns
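For the seek queries above, that means a composite index matching the `ORDER BY` (names illustrative):

```sql
-- Matches ORDER BY created_at DESC, id DESC, so the seek
-- becomes an index range scan instead of a sort.
CREATE INDEX idx_items_created_at_id
ON items (created_at DESC, id DESC);
```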
Without this index, every pagination query requires a full table scan and sort.
Include Columns in the Index
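Going further, a covering index can answer the whole query from the index alone, skipping the table lookup entirely (PostgreSQL 11+ `INCLUDE` syntax shown; other databases vary):

```sql
-- title is stored in the index leaf pages but not used for ordering,
-- so SELECT id, title, created_at never touches the table.
CREATE INDEX idx_items_pagination
ON items (created_at DESC, id DESC)
INCLUDE (title);
```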
Set Reasonable Limits
Don’t let clients request 10,000 items at once.
Consider Total Count Carefully
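The naive approach scans the table (or a large index) on every page load; PostgreSQL, for instance, can return a cheap estimate from planner statistics instead:

```sql
-- Exact, but expensive on every request:
SELECT COUNT(*) FROM items;

-- PostgreSQL: a cheap estimate from the planner's statistics.
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'items';
```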
Options:
- Cache the count, update periodically
- Return estimated count
- Don’t return total (just `has_more`)
- Use `COUNT(*)` only on first page
API Design Recommendations
Response Structure
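One common shape, with pagination metadata in its own object (field names illustrative):

```json
{
  "data": [ ],
  "pagination": {
    "cursor": "eyJ2IjoxLCJpZCI6MTIzNDV9",
    "has_more": true,
    "limit": 20
  }
}
```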
Or with links (HATEOAS style):
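A links-based variant, where the client just follows `next` without constructing URLs itself:

```json
{
  "data": [ ],
  "links": {
    "self": "/items?limit=20",
    "next": "/items?limit=20&cursor=eyJ2IjoxLCJpZCI6MTIzNDV9"
  }
}
```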
Cursor Design
Make cursors:
- Opaque: Don’t encourage clients to parse or construct them
- Stable: Same cursor always returns same position
- Secure: Sign or encrypt if they contain sensitive data
- Versionable: Include version info if format might change
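A minimal sketch of a versioned, base64-encoded JSON cursor in Python; for sensitive data you would additionally sign or encrypt the payload rather than just encode it:

```python
import base64
import json

def encode_cursor(payload: dict) -> str:
    # A version field lets the format evolve without breaking old clients.
    body = json.dumps({"v": 1, **payload}, sort_keys=True)
    return base64.urlsafe_b64encode(body.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    if data.pop("v", None) != 1:
        raise ValueError("unsupported cursor version")
    return data
```

Base64 encoding is for opacity and URL-safety, not secrecy — clients can still decode it, which is why signing matters if the contents are sensitive.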
Error Handling
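A reasonable error response for a bad cursor might look like this (shape illustrative) — a 400 with a stable error code the client can branch on:

```json
{
  "error": {
    "code": "invalid_cursor",
    "message": "The provided cursor is invalid or has expired. Restart pagination from the beginning.",
    "status": 400
  }
}
```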
Cursors can become invalid if:
- Data was deleted
- Cursor format changed
- Cursor expired (if you enforce TTL)
Always handle this gracefully.
Choosing the Right Pattern
| Pattern | Best For | Avoid When |
|---|---|---|
| Offset | Admin UIs, small datasets, jump-to-page needed | Large datasets, real-time data |
| Cursor | Infinite scroll, large datasets, mobile apps | Need to jump to arbitrary pages |
| Keyset | Public APIs, bookmarkable URLs | Complex sorting, sensitive IDs |
Hybrid approach: Support both for flexibility:
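For example, with illustrative parameter names:

```
GET /items?page=3&per_page=20                        offset style: admin screens, jump-to-page
GET /items?cursor=eyJ2IjoxLCJpZCI6MTIzNDV9&limit=20  cursor style: feeds, infinite scroll
```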
Document the tradeoffs and let clients choose.
Common Mistakes
- Using offset for large datasets: Performance cliff at scale
- Not indexing sort columns: Every query becomes a full scan
- Expensive COUNT(*) on every request: Cache or skip it
- Exposing internal IDs unnecessarily: Use opaque cursors
- No max page size: Clients requesting millions of rows
- Ignoring consistency: Offset pagination + real-time data = bugs
Pagination seems simple until you have a million rows and users complaining about slow pages. Choose cursor or keyset pagination by default, use offset only when you genuinely need random access, and always index your sort columns.