API Pagination Patterns: Getting Large Result Sets Right

Every API eventually needs to return more data than fits in a single response. How you handle that pagination affects performance, reliability, and developer experience.

Let’s look at the common patterns, their tradeoffs, and when to use each.

The Three Main Approaches

1. Offset Pagination

The classic approach: skip N records, return M records.

Implementation:

SELECT * FROM items 
ORDER BY created_at DESC 
LIMIT 20 OFFSET 40;

Response:

{
  "items": [...],
  "pagination": {
    "offset": 40,
    "limit": 20,
    "total": 1543
  }
}
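
To tie the query and the response together, here is a minimal Python sketch using the standard-library sqlite3 module. The items schema (id, title, created_at), the offset_page helper, and the demo data are assumptions made for illustration; the id DESC tiebreaker is an addition so that rows with equal created_at come back in a deterministic order.

```python
import sqlite3

def offset_page(conn: sqlite3.Connection, limit: int = 20, offset: int = 0) -> dict:
    """Return one page plus the offset/limit/total metadata."""
    rows = conn.execute(
        "SELECT id, title FROM items ORDER BY created_at DESC, id DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    # The total is a second query; on large tables this count can be costly.
    total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return {
        "items": [{"id": r[0], "title": r[1]} for r in rows],
        "pagination": {"offset": offset, "limit": limit, "total": total},
    }

# Hypothetical demo data: 50 items, where a higher id means a newer item.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, created_at INTEGER)")
conn.executemany(
    "INSERT INTO items (id, title, created_at) VALUES (?, ?, ?)",
    [(i, f"item {i}", i) for i in range(1, 51)],
)
page = offset_page(conn, limit=20, offset=40)  # third page: the 10 oldest items
```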

Pros:

Simple to understand and implement
Clients can jump to any page directly
Easy to show “Page X of Y” UI

Cons:

Performance degrades with large offsets: the database must scan and discard every skipped row, so OFFSET 100000 is painfully slow
Inconsistent results if data changes between requests (items can be skipped or repeated)

When offset fails:

-- This gets slower as offset increases
SELECT * FROM items ORDER BY id LIMIT 20 OFFSET 1000000;
-- Database must scan 1,000,020 rows to return 20

2. Cursor Pagination

Return an opaque cursor that identifies the last item on the page. The next request resumes just after it.

Implementation:

# Decode cursor (base64-encoded JSON)
cursor_data = decode_cursor(cursor)  # {"id": 123}

# Query with WHERE instead of OFFSET
items = db.query("""
    SELECT * FROM items 
    WHERE id < :last_id 
    ORDER BY id DESC 
    LIMIT 20
""", last_id=cursor_data["id"])

# Encode next cursor from the last row; an empty page means there is nothing left
next_cursor = encode_cursor({"id": items[-1].id}) if items else None

Response:

{
  "items": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6MTQzfQ==",
    "has_more": true
  }
}
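
Putting the pieces together, the following is a self-contained sketch of the whole round trip, again on sqlite3. It assumes the cursor is unsigned base64-encoded JSON, and it fetches limit + 1 rows so that has_more can be computed without a separate count; the cursor_page name and the one-column table are hypothetical.

```python
import base64
import json
import sqlite3

def encode_cursor(data: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(data).encode()).decode()

def decode_cursor(cursor: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def cursor_page(conn, cursor=None, limit=20):
    """Return one page of ids, newest first, plus next_cursor/has_more."""
    if cursor is None:
        sql, params = "SELECT id FROM items ORDER BY id DESC LIMIT ?", (limit + 1,)
    else:
        last_id = decode_cursor(cursor)["id"]
        sql = "SELECT id FROM items WHERE id < ? ORDER BY id DESC LIMIT ?"
        params = (last_id, limit + 1)
    ids = [row[0] for row in conn.execute(sql, params)]
    has_more = len(ids) > limit      # the extra row only signals that more exists
    ids = ids[:limit]
    next_cursor = encode_cursor({"id": ids[-1]}) if has_more else None
    return {"items": ids, "pagination": {"next_cursor": next_cursor, "has_more": has_more}}

# Hypothetical demo data: ids 1..45.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 46)])

# A client loop: keep following next_cursor until has_more is false.
seen, cursor = [], None
while True:
    page = cursor_page(conn, cursor)
    seen.extend(page["items"])
    if not page["pagination"]["has_more"]:
        break
    cursor = page["pagination"]["next_cursor"]
```

Fetching one extra row is a design choice, not the only option: it avoids an empty final page at the cost of reading a single row that is thrown away.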

Pros:

Consistent performance regardless of position in dataset
Stable results even if data changes (no missed/duplicate items)
Works well with infinite scroll

Cons:

Can’t jump to arbitrary pages
More complex to implement
Cursor must encode enough state to resume

3. Keyset Pagination (Seek Method)

Like cursor pagination, but with explicit values instead of opaque tokens.

Implementation:

SELECT * FROM items 
WHERE id < 123 
ORDER BY id DESC 
LIMIT 20;
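
In an API, the keyset value usually travels as a plain query parameter. The sketch below assumes a hypothetical before_id parameter and a /api/items path; neither name comes from a specific framework.

```python
import sqlite3

def keyset_page(conn, before_id=None, limit=20):
    """before_id is the last id the client saw; None requests the first page."""
    if before_id is None:
        rows = conn.execute("SELECT id FROM items ORDER BY id DESC LIMIT ?", (limit,))
    else:
        rows = conn.execute(
            "SELECT id FROM items WHERE id < ? ORDER BY id DESC LIMIT ?",
            (before_id, limit),
        )
    ids = [r[0] for r in rows]
    # A full page suggests more rows may follow; the last page may turn out empty.
    next_url = f"/api/items?before_id={ids[-1]}&limit={limit}" if len(ids) == limit else None
    return {"items": ids, "next": next_url}

# Hypothetical demo data: ids 1..200.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 201)])
page = keyset_page(conn, before_id=123)
```

Because the URL contains the real id, anyone can bookmark or hand-edit it, which is exactly the transparency tradeoff this pattern makes.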

Pros:

Same performance benefits as cursor pagination
Transparent to clients (no opaque tokens)
Bookmarkable URLs

Cons:

Exposes internal IDs
Complex for multi-column sorting
Clients must understand the keyset structure

Multi-Column Sorting

The tricky case: paginating when sorting by non-unique columns.

Problem: if multiple items have the same created_at, you can’t use it alone as a cursor.

Solution: Compound cursor

-- Cursor: (created_at, id)
-- Row-value comparison works in PostgreSQL, MySQL, and SQLite 3.15+
SELECT * FROM items 
WHERE (created_at, id) < ('2024-01-15 10:30:00', 456)
ORDER BY created_at DESC, id DESC
LIMIT 20;

{
  "pagination": {
    "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNC0wMS0xNVQxMDozMDowMFoiLCJpZCI6NDU2fQ=="
  }
}

The cursor encodes both values. This ensures stable pagination even with duplicate sort values.
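
To see the compound cursor at work, the sketch below pages through rows that share timestamps. It uses the expanded form of the row comparison (created_at < ? OR (created_at = ? AND id < ?)), which is equivalent and also runs on engines without row-value syntax. The table contents and the page_after helper are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, created_at TEXT)")
# Six rows sharing only two timestamps, so created_at alone is ambiguous.
conn.executemany(
    "INSERT INTO items (id, created_at) VALUES (?, ?)",
    [(i, "2024-01-15 10:30:00" if i % 2 else "2024-01-15 10:31:00") for i in range(1, 7)],
)

def page_after(cursor, limit=2):
    """cursor is None for the first page, else (created_at, id) of the last row seen."""
    if cursor is None:
        sql, params = "SELECT created_at, id FROM items", ()
    else:
        # Expanded equivalent of (created_at, id) < (?, ?).
        sql = ("SELECT created_at, id FROM items "
               "WHERE created_at < ? OR (created_at = ? AND id < ?)")
        params = (cursor[0], cursor[0], cursor[1])
    sql += " ORDER BY created_at DESC, id DESC LIMIT ?"
    return conn.execute(sql, params + (limit,)).fetchall()

# Walk every page; each row should appear exactly once despite the ties.
seen, cursor = [], None
while True:
    rows = page_after(cursor)
    if not rows:
        break
    seen.extend(row[1] for row in rows)
    cursor = rows[-1]
```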

Handling Deletions and Insertions

The Offset Problem

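With offset pagination, data changes between requests shift row positions. If an item on a page the client has already fetched is deleted, every later item moves up one slot, so the next page silently skips an item.

With insertions, the opposite happens: a new item at the head of the list pushes everything down one slot, so the next page repeats an item the client has already seen.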
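
Both failure modes are easy to reproduce with a plain list standing in for the sorted table. This is only a simulation; the item names are arbitrary.

```python
def offset_page(items, offset, limit):
    """Offset pagination over a list standing in for a sorted table."""
    return items[offset:offset + limit]

# Deletion between requests: an item is skipped.
items = ["A", "B", "C", "D", "E", "F"]
page1 = offset_page(items, 0, 3)               # A, B, C
items.remove("B")                              # C..F each move up one slot
page2_after_delete = offset_page(items, 3, 3)  # E, F; D was never returned

# Insertion between requests: an item is repeated.
items = ["A", "B", "C", "D", "E", "F"]
page1 = offset_page(items, 0, 3)               # A, B, C
items.insert(0, "NEW")                         # everything moves down one slot
page2_after_insert = offset_page(items, 3, 3)  # C, D, E; C is returned twice
```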

The Cursor Solution

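Cursor pagination handles this gracefully: the next page is defined by the last item the client saw rather than by a position, so deletions and insertions earlier in the ordering do not shift it.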
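
The same kind of simulation shows why. Here a list of ids (newest first) stands in for the table, and page_after is a hypothetical helper that mimics WHERE id < last_seen.

```python
def page_after(ids, last_seen, limit):
    """Keyset page over a descending id list; last_seen=None means first page."""
    remaining = [i for i in ids if last_seen is None or i < last_seen]
    return remaining[:limit]

ids = [60, 50, 40, 30, 20, 10]          # newest first
page1 = page_after(ids, None, 3)        # 60, 50, 40

ids.remove(50)                          # a row the client already saw is deleted
ids.insert(0, 70)                       # and a new row arrives at the head
page2 = page_after(ids, page1[-1], 3)   # 30, 20, 10: nothing skipped or repeated
```

The new row (70) is not lost; the client simply sees it the next time it starts from the top.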

Performance Optimization

Index Your Sort Columns

-- For: ORDER BY created_at DESC, id DESC
CREATE INDEX idx_items_created_id ON items(created_at DESC, id DESC);

Without this index, every pagination query requires a full table scan and sort.
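
One way to check whether a query actually uses the index is to ask the database for its plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN from Python; the exact plan wording differs between engines and versions, so treat the output as something to read rather than to parse. The schema is an assumption for the demo.

```python
import sqlite3

QUERY = "SELECT id FROM items ORDER BY created_at DESC, id DESC LIMIT 20"

def plan(conn):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT)")

before = plan(conn)   # no suitable index yet: expect a scan plus a sort step
conn.execute("CREATE INDEX idx_items_created_id ON items(created_at DESC, id DESC)")
after = plan(conn)    # the plan should now mention idx_items_created_id

print("before:", before)
print("after: ", after)
```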

Include Columns in the Index

-- Covering index: no table lookup needed
-- (INCLUDE is available in PostgreSQL 11+ and SQL Server; MySQL and SQLite do not support it)
CREATE INDEX idx_items_pagination ON items(created_at DESC, id DESC) 
INCLUDE (title, status);

Set Reasonable Limits

MAX_PAGE_SIZE = 100

def get_items(limit: int = 20):
    limit = min(limit, MAX_PAGE_SIZE)
    # ...

Don’t let clients request 10,000 items at once.
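
In practice the limit arrives as an untrusted query-string value, so it is worth guarding the lower bound and the type as well. A small sketch, where clamp_limit and DEFAULT_PAGE_SIZE are names invented for this example:

```python
MAX_PAGE_SIZE = 100
DEFAULT_PAGE_SIZE = 20

def clamp_limit(raw) -> int:
    """Coerce a client-supplied limit into the range 1..MAX_PAGE_SIZE."""
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_PAGE_SIZE   # missing or non-numeric: fall back to the default
    return max(1, min(limit, MAX_PAGE_SIZE))
```

Silently clamping is one policy; rejecting out-of-range values with a 4xx response is an equally reasonable alternative if you prefer clients to notice.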

Consider Total Count Carefully

-- This can be expensive on large tables
SELECT COUNT(*) FROM items WHERE status = 'active';

Options:

Cache the count, update periodically
Return estimated count
Don’t return total (just has_more)
Use COUNT(*) only on first page
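
The first option, a periodically refreshed cached count, can be sketched in a few lines. CachedCount is a hypothetical helper, the 60-second TTL is an arbitrary choice, and the cache is per-process with no locking, so this is an illustration rather than production code.

```python
import time

class CachedCount:
    """Recompute an expensive count at most once per ttl seconds."""

    def __init__(self, compute, ttl=60.0, clock=time.monotonic):
        self._compute = compute      # callable that runs the real COUNT(*) query
        self._ttl = ttl
        self._clock = clock          # injectable so the expiry can be tested
        self._value = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value
```

The returned total can be up to ttl seconds stale, which is usually acceptable for a "1,543 results" label but not for anything that must be exact.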

API Design Recommendations

Response Structure

{
  "data": [...],
  "pagination": {
    "next_cursor": "abc123",
    "has_more": true
  },
  "meta": {
    "total_count": 1543,
    "page_size": 20
  }
}

Here total_count is optional, since it can be expensive to compute.

Or with links (HATEOAS style):

{
  "data": [...],
  "links": {
    "self": "/api/items?cursor=abc",
    "next": "/api/items?cursor=def",
    "prev": "/api/items?cursor=xyz"
  }
}
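
A links object like this can be assembled with the standard library. In the sketch below, build_links and its parameters are illustrative names; urlencode is used so that characters such as the = padding in base64 cursors are escaped correctly.

```python
from urllib.parse import urlencode

def build_links(base_path, current, next_cursor=None, prev_cursor=None, limit=20):
    """Build a HATEOAS-style links object; absent cursors are simply omitted."""
    def url(cursor):
        params = {"limit": limit}
        if cursor is not None:
            params["cursor"] = cursor
        return f"{base_path}?{urlencode(params)}"

    links = {"self": url(current)}
    if next_cursor is not None:
        links["next"] = url(next_cursor)
    if prev_cursor is not None:
        links["prev"] = url(prev_cursor)
    return links
```

Omitting next on the last page gives clients a simple stop condition: follow next until it is missing.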

Cursor Design

Make cursors:

Opaque: Don’t encourage clients to parse or construct them
Stable: Same cursor always returns same position
Secure: Sign or encrypt if they contain sensitive data
Versionable: Include version info if format might change

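import base64, hmac, json

SECRET = b"change-me"  # placeholder: load the real signing key from configuration

def encode_cursor(data: dict) -> str:
    payload = json.dumps(data)
    signature = hmac.new(SECRET, payload.encode(), 'sha256').hexdigest()[:8]  # 32-bit tag: catches corruption, weak against forgery
    return base64.urlsafe_b64encode(f"{signature}:{payload}".encode()).decode()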
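
Signing only helps if the server verifies the signature when the cursor comes back. Here is one possible counterpart, assuming the same signature:payload layout; SECRET, SIG_LEN, and the _sign helper are placeholders, and SIG_LEN mirrors the 8-character truncation used above (a longer tag gives stronger tamper resistance).

```python
import base64
import hmac
import json

SECRET = b"demo-secret"   # assumption: a real service loads this from configuration
SIG_LEN = 8               # hex characters of the HMAC kept in the cursor

def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), "sha256").hexdigest()[:SIG_LEN]

def encode_cursor(data: dict) -> str:
    payload = json.dumps(data)
    return base64.urlsafe_b64encode(f"{_sign(payload)}:{payload}".encode()).decode()

def decode_cursor(cursor: str) -> dict:
    """Return the cursor payload, or raise ValueError if malformed or tampered with."""
    try:
        raw = base64.urlsafe_b64decode(cursor.encode()).decode()
        signature, payload = raw.split(":", 1)
    except ValueError as exc:   # bad base64, bad UTF-8, or missing separator
        raise ValueError("invalid cursor") from exc
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        raise ValueError("invalid cursor signature")
    return json.loads(payload)
```

The signature is checked before the JSON is parsed, so a forged payload never reaches the query layer.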

Error Handling

Response for an invalid cursor:

{
  "error": "invalid_cursor",
  "message": "The provided cursor is invalid or expired",
  "hint": "Start from the beginning without a cursor"
}

Cursors can become invalid if:

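The referenced row was deleted (only when the cursor is resolved by looking that row up; cursors that compare values keep working)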
Cursor format changed
Cursor expired (if you enforce TTL)

Always handle this gracefully.
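
One way to handle it gracefully is to funnel every failure mode into the same error payload. The sketch below assumes an unsigned cursor carrying a version (v) and an issue timestamp (ts); those field names, the one-hour TTL, and the 400 status code are all assumptions for illustration.

```python
import base64
import json
import time

CURSOR_VERSION = 1
CURSOR_TTL_SECONDS = 3600   # assumption: cursors expire after one hour

INVALID_CURSOR = {
    "error": "invalid_cursor",
    "message": "The provided cursor is invalid or expired",
    "hint": "Start from the beginning without a cursor",
}

def make_cursor(last_id, now=None):
    body = {"v": CURSOR_VERSION, "id": last_id, "ts": time.time() if now is None else now}
    return base64.urlsafe_b64encode(json.dumps(body).encode()).decode()

def parse_cursor(cursor, now=None):
    """Return (status, body): 200 with the decoded position, or 400 with the error."""
    now = time.time() if now is None else now
    try:
        body = json.loads(base64.urlsafe_b64decode(cursor.encode()))
        ok = body["v"] == CURSOR_VERSION and now - body["ts"] <= CURSOR_TTL_SECONDS
        last_id = int(body["id"])
    except (ValueError, KeyError, TypeError):
        # Garbage input, missing fields, or wrong types all look the same to the client.
        return 400, INVALID_CURSOR
    return (200, {"last_id": last_id}) if ok else (400, INVALID_CURSOR)
```

Returning one generic error for all cases keeps the client logic simple: on invalid_cursor, drop the cursor and restart from the first page.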

Choosing the Right Pattern

Pattern	Best For	Avoid When
Offset	Admin UIs, small datasets, jump-to-page needed	Large datasets, real-time data
Cursor	Infinite scroll, large datasets, mobile apps	Need to jump to arbitrary pages
Keyset	Public APIs, bookmarkable URLs	Complex sorting, sensitive IDs

Hybrid approach: support both offset and cursor parameters on the same endpoint for flexibility.

Document the tradeoffs and let clients choose.
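
As a rough sketch of what a hybrid endpoint might do, assuming the two modes are offset and cursor, the function below inspects the query parameters and picks a mode. The parameter names, the defaults, and parse_pagination itself are hypothetical.

```python
def parse_pagination(params: dict) -> dict:
    """Decide which pagination mode a request uses from its query parameters."""
    has_cursor = "cursor" in params
    has_offset = "offset" in params
    if has_cursor and has_offset:
        raise ValueError("pass either 'cursor' or 'offset', not both")
    limit = min(int(params.get("limit", 20)), 100)
    if has_offset:
        return {"mode": "offset", "offset": max(0, int(params["offset"])), "limit": limit}
    # No offset: cursor mode, where a missing cursor means "first page".
    return {"mode": "cursor", "cursor": params.get("cursor"), "limit": limit}
```

Defaulting to cursor mode when neither parameter is present nudges clients toward the cheaper pattern while still leaving offset available for jump-to-page UIs.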

Common Mistakes

Using offset for large datasets: Performance cliff at scale
Not indexing sort columns: Every query becomes a full scan
Expensive COUNT(*) on every request: Cache or skip it
Exposing internal IDs unnecessarily: Use opaque cursors
No max page size: Clients requesting millions of rows
Ignoring consistency: Offset pagination + real-time data = bugs

Pagination seems simple until you have a million rows and users complaining about slow pages. Choose cursor or keyset pagination by default, use offset only when you genuinely need random access, and always index your sort columns.
