Every API eventually needs to return more data than fits in a single response. How you handle that pagination affects performance, reliability, and developer experience.

Let’s look at the common patterns, their tradeoffs, and when to use each.

The Three Main Approaches

1. Offset Pagination

The classic approach: skip N records, return M records.

GET /api/items?offset=0&limit=20   # Page 1
GET /api/items?offset=20&limit=20  # Page 2
GET /api/items?offset=40&limit=20  # Page 3

Implementation:

SELECT * FROM items 
ORDER BY created_at DESC 
LIMIT 20 OFFSET 40;

Response:

{
  "items": [...],
  "pagination": {
    "offset": 40,
    "limit": 20,
    "total": 1543
  }
}

Pros:

  • Simple to understand and implement
  • Clients can jump to any page directly
  • Easy to show “Page X of Y” UI

Cons:

  • Performance degrades with large offsets (database must scan and skip rows)
  • Inconsistent results if data changes between requests
  • OFFSET 100000 is painfully slow

When offset fails:

-- This gets slower as offset increases
SELECT * FROM items ORDER BY id LIMIT 20 OFFSET 1000000;
-- Database must scan 1,000,020 rows to return 20
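
To make the page-to-offset arithmetic concrete, here is a minimal sketch of an offset endpoint backed by an in-memory SQLite database. The `fetch_page` helper, the two-column `items` schema, and the 95 demo rows are assumptions made for illustration; the response shape mirrors the example above.

```python
import sqlite3

def fetch_page(conn: sqlite3.Connection, page: int, per_page: int = 20) -> dict:
    """Return one page of items using LIMIT/OFFSET (page numbers start at 1)."""
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT id, title FROM items ORDER BY id LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()
    # The total needs a second query, which is part of the cost of this pattern.
    total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return {
        "items": [{"id": r[0], "title": r[1]} for r in rows],
        "pagination": {"offset": offset, "limit": per_page, "total": total},
    }

# Demo data: 95 rows in an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [(f"item {i}",) for i in range(1, 96)],
)

page3 = fetch_page(conn, page=3)
print(page3["pagination"])
```

Page 3 maps to offset 40, so the database still walks past the first 40 rows before it returns anything. That is harmless here and expensive at offset 1,000,000.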

2. Cursor Pagination

Return an opaque cursor pointing to the last item on the page. The next request starts from there.

GET /api/items?limit=20
# Returns items, cursor: "eyJpZCI6MTIzfQ=="
GET /api/items?limit=20&cursor=eyJpZCI6MTIzfQ==
# Returns next 20 items, cursor: "eyJpZCI6MTQzfQ=="

Implementation:

# Decode cursor (base64-encoded JSON)
cursor_data = decode_cursor(cursor)  # {"id": 123}

# Query with WHERE instead of OFFSET
items = db.query("""
    SELECT * FROM items 
    WHERE id < :last_id 
    ORDER BY id DESC 
    LIMIT 20
""", last_id=cursor_data["id"])

# Encode next cursor (an empty page means there is nothing left to fetch)
next_cursor = encode_cursor({"id": items[-1].id}) if items else None

Response:

{
  "items": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6MTQzfQ==",
    "has_more": true
  }
}

Pros:

  • Consistent performance regardless of position in dataset
  • Stable results even if data changes (no missed/duplicate items)
  • Works well with infinite scroll

Cons:

  • Can’t jump to arbitrary pages
  • More complex to implement
  • Cursor must encode enough state to resume
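
The snippet above leans on `encode_cursor` and `decode_cursor` without defining them, and the response includes a `has_more` flag. One way to fill both gaps is sketched below. The helper bodies, the `fetch_after` function, and the SQLite schema are assumptions for illustration; `has_more` is derived by asking for one row more than the page size.

```python
import base64
import json
import sqlite3

def encode_cursor(data: dict) -> str:
    """Serialize cursor state as URL-safe base64 over compact JSON."""
    payload = json.dumps(data, separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    """Inverse of encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def fetch_after(conn, cursor=None, limit=20):
    """Return (rows, next_cursor, has_more) for a newest-first listing."""
    where, params = "", []
    if cursor is not None:
        where = "WHERE id < ?"
        params.append(decode_cursor(cursor)["id"])
    # Ask for one extra row to learn whether another page exists.
    rows = conn.execute(
        f"SELECT id, title FROM items {where} ORDER BY id DESC LIMIT ?",
        (*params, limit + 1),
    ).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    next_cursor = encode_cursor({"id": rows[-1][0]}) if has_more else None
    return rows, next_cursor, has_more

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [(f"item {i}",) for i in range(1, 46)],
)

# Walk every page until the server reports there is nothing left.
seen, cursor = [], None
while True:
    rows, cursor, has_more = fetch_after(conn, cursor)
    seen.extend(r[0] for r in rows)
    if not has_more:
        break
```

With compact JSON separators, `encode_cursor({"id": 123})` produces the same `eyJpZCI6MTIzfQ==` token used in the request example. The extra-row trick avoids a separate `COUNT(*)` just to decide whether to show a "load more" button.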

3. Keyset Pagination (Seek Method)

Like cursor pagination, but with explicit values instead of opaque tokens.

GET /api/items?limit=20
GET /api/items?limit=20&after_id=123
GET /api/items?limit=20&after_id=143

Implementation:

SELECT * FROM items 
WHERE id < 123 
ORDER BY id DESC 
LIMIT 20;

Pros:

  • Same performance benefits as cursor pagination
  • Transparent to clients (no opaque tokens)
  • Bookmarkable URLs

Cons:

  • Exposes internal IDs
  • Complex for multi-column sorting
  • Clients must understand the keyset structure

Multi-Column Sorting

The tricky case: paginating when sorting by non-unique columns.

GET /api/items?sort=created_at&limit=20

Problem: if multiple items have the same created_at, you can’t use it alone as a cursor.

Solution: Compound cursor

-- Cursor: (created_at, id)
SELECT * FROM items 
WHERE (created_at, id) < ('2024-01-15 10:30:00', 456)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Response:

{
  "pagination": {
    "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNC0wMS0xNVQxMDozMDowMFoiLCJpZCI6NDU2fQ=="
  }
}

The cursor encodes both values. This ensures stable pagination even with duplicate sort values.
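
Here is a small sketch of the compound cursor in action, using SQLite (which also accepts the row-value comparison shown above). The six-row table and the `fetch_page` helper are assumptions for illustration. The rows deliberately share only two timestamps, so `created_at` alone could not act as a cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, created_at TEXT)")
# Six rows, two distinct timestamps: ids 1-3 share one, ids 4-6 share the other.
conn.executemany(
    "INSERT INTO items (id, created_at) VALUES (?, ?)",
    [
        (i, "2024-01-15 10:30:00" if i <= 3 else "2024-01-15 10:31:00")
        for i in range(1, 7)
    ],
)

def fetch_page(cursor=None, limit=2):
    """Page newest-first by (created_at, id); cursor is the last row's tuple."""
    sql, params = "SELECT created_at, id FROM items", ()
    if cursor is not None:
        sql += " WHERE (created_at, id) < (?, ?)"
        params = cursor
    sql += " ORDER BY created_at DESC, id DESC LIMIT ?"
    return conn.execute(sql, (*params, limit)).fetchall()

ids, cursor = [], None
while True:
    page = fetch_page(cursor)
    if not page:
        break
    ids.extend(row[1] for row in page)
    cursor = page[-1]  # (created_at, id) of the last row on this page

print(ids)
```

A page size of 2 forces page boundaries to fall inside a run of identical timestamps. The `id` tiebreaker is what keeps every row visited exactly once.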

Handling Deletions and Insertions

The Offset Problem

With offset pagination, data changes cause issues:

Page 1: Items 1-20
# Item 5 gets deleted
Page 2: Items 22-41 (Item 21 was skipped!)

Or with insertions:

Page 1: Items 1-20
# New item inserted at position 1
Page 2: Items 20-39 (Item 20 appears again!)

The Cursor Solution

Cursor pagination handles this gracefully:

First request: Return items where id > 0, cursor points to id=20
# Item 5 gets deleted - doesn't affect cursor position
Second request: Return items where id > 20
# Consistent, no duplicates or gaps
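
The difference is easy to reproduce. The sketch below replays the deletion scenario twice against an in-memory SQLite table, once with `OFFSET` and once with a keyset `WHERE` clause. The 60-row table and helper names are assumptions for illustration.

```python
import sqlite3

def make_db():
    """Fresh table holding ids 1..60."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 61)])
    return conn

def ids(conn, sql, params=()):
    return [row[0] for row in conn.execute(sql, params)]

# Offset pagination: item 5 is deleted between page 1 and page 2.
conn = make_db()
offset_seen = ids(conn, "SELECT id FROM items ORDER BY id LIMIT 20 OFFSET 0")
conn.execute("DELETE FROM items WHERE id = 5")
offset_seen += ids(conn, "SELECT id FROM items ORDER BY id LIMIT 20 OFFSET 20")

# Keyset pagination: same deletion, but page 2 resumes after the last id seen.
KEYSET = "SELECT id FROM items WHERE id > ? ORDER BY id LIMIT 20"
conn = make_db()
keyset_seen = ids(conn, KEYSET, (0,))
conn.execute("DELETE FROM items WHERE id = 5")
keyset_seen += ids(conn, KEYSET, (keyset_seen[-1],))

print(21 in offset_seen)  # False: item 21 slid into position 20 and was skipped
print(21 in keyset_seen)  # True
```

The offset client never sees item 21 because the deletion shifted it into position 20, which belongs to the page that was already fetched. The keyset client only asks "what comes after id 20?", so the shift does not matter.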

Performance Optimization

Index Your Sort Columns

-- For: ORDER BY created_at DESC, id DESC
CREATE INDEX idx_items_created_id ON items(created_at DESC, id DESC);

Without this index, every pagination query requires a full table scan and sort.
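
You can check this claim against your own database by comparing query plans before and after creating the index. The sketch below does so with SQLite's `EXPLAIN QUERY PLAN`; the table definition and the `plan` helper are assumptions, and the exact plan wording varies by engine and version (on PostgreSQL you would use `EXPLAIN` instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, created_at TEXT, title TEXT)")

QUERY = "SELECT * FROM items ORDER BY created_at DESC, id DESC LIMIT 20"

def plan(sql: str) -> str:
    """Join the human-readable detail column of EXPLAIN QUERY PLAN."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(QUERY)
conn.execute("CREATE INDEX idx_items_created_id ON items(created_at DESC, id DESC)")
plan_after = plan(QUERY)

print(plan_before)  # expect a table scan plus a temporary B-tree for the ORDER BY
print(plan_after)   # expect the scan to go through idx_items_created_id
```

If the second plan still mentions a temporary sort, the index column order or sort directions probably do not match the `ORDER BY` clause.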

Include Columns in the Index

-- Covering index: no table lookup needed (INCLUDE syntax as in PostgreSQL 11+ and SQL Server)
CREATE INDEX idx_items_pagination ON items(created_at DESC, id DESC) 
INCLUDE (title, status);

Set Reasonable Limits

MAX_PAGE_SIZE = 100

def get_items(limit: int = 20):
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # clamp to 1..MAX_PAGE_SIZE
    # ...

Don’t let clients request 10,000 items at once.
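
In practice the limit arrives as a raw query-string value, so it needs parsing as well as clamping. A minimal sketch follows, assuming a hypothetical `parse_limit` helper. Whether oversized values are silently capped (as here) or rejected is a design choice; this version caps large values and rejects values that are not positive integers.

```python
DEFAULT_PAGE_SIZE = 20
MAX_PAGE_SIZE = 100

def parse_limit(raw):
    """Turn a raw query-string value into a safe page size."""
    if raw is None or raw == "":
        return DEFAULT_PAGE_SIZE
    try:
        limit = int(raw)
    except ValueError:
        raise ValueError(f"limit must be an integer, got {raw!r}") from None
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Cap rather than reject, so clients asking for "everything" still get a page.
    return min(limit, MAX_PAGE_SIZE)
```

Calling `parse_limit("10000")` returns 100, while `parse_limit("abc")` and `parse_limit("0")` raise `ValueError`, which the endpoint could translate into a 400 response.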

Consider Total Count Carefully

-- This can be expensive on large tables
SELECT COUNT(*) FROM items WHERE status = 'active';

Options:

  • Cache the count, update periodically
  • Return estimated count
  • Don’t return total (just has_more)
  • Use COUNT(*) only on first page
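
As one illustration of the first option, the sketch below wraps an expensive count in a small time-based cache. The `CachedCount` class and its parameters are hypothetical. In a multi-process deployment a shared cache would be a better fit; this version only shows the idea.

```python
import time

class CachedCount:
    """Cache an expensive count and refresh it at most once per ttl seconds."""

    def __init__(self, compute, ttl=60.0, clock=time.monotonic):
        self._compute = compute  # callable that runs the real COUNT(*)
        self._ttl = ttl
        self._clock = clock      # injectable so tests can control time
        self._value = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value
```

Usage might look like `active_count = CachedCount(lambda: run_count_query(), ttl=300)`, where `run_count_query` stands in for whatever executes the `COUNT(*)`. The total shown to users can then be up to five minutes stale, which is usually fine for a "1,543 results" label.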

API Design Recommendations

Response Structure

{
  "data": [...],
  "pagination": {
    "next_cursor": "abc123",
    "has_more": true
  },
  "meta": {
    "total_count": 1543,  // Optional, can be expensive
    "page_size": 20
  }
}

Or with links (HATEOAS style):

{
  "data": [...],
  "links": {
    "self": "/api/items?cursor=abc",
    "next": "/api/items?cursor=def",
    "prev": "/api/items?cursor=xyz"
  }
}
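
A links block like this can be assembled with the standard library. The `build_links` helper below is a hypothetical sketch. It omits `next` or `prev` when there is no such page, and it lets `urlencode` handle escaping.

```python
from urllib.parse import urlencode

def build_links(base_path, limit, current=None, next_cursor=None, prev_cursor=None):
    """Build self/next/prev links, omitting links that do not apply."""
    def url(cursor):
        params = {"limit": limit}
        if cursor is not None:
            params["cursor"] = cursor
        return f"{base_path}?{urlencode(params)}"

    links = {"self": url(current)}
    if next_cursor is not None:
        links["next"] = url(next_cursor)
    if prev_cursor is not None:
        links["prev"] = url(prev_cursor)
    return links

links = build_links("/api/items", 20, current="abc", next_cursor="def")
print(links)
```

Escaping matters for base64 cursors: `urlencode` percent-encodes the `=` padding, so a cursor such as `eyJpZCI6MTIzfQ==` survives the round trip through a URL.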

Cursor Design

Make cursors:

  • Opaque: Don’t encourage clients to parse or construct them
  • Stable: Same cursor always returns same position
  • Secure: Sign or encrypt if they contain sensitive data
  • Versionable: Include version info if format might change

import base64, hmac, json

SECRET = b"change-me"  # placeholder signing key; load the real one from configuration

def encode_cursor(data: dict) -> str:
    payload = json.dumps(data)
    signature = hmac.new(SECRET, payload.encode(), 'sha256').hexdigest()[:8]
    return base64.urlsafe_b64encode(f"{signature}:{payload}".encode()).decode()
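
Signing only helps if the server verifies the signature on the way back in. Below is a sketch of the matching decoder. It repeats the encoder so the block runs on its own. `SECRET` is a placeholder value and `decode_cursor` and `_sign` are hypothetical names. The 8-character signature keeps cursors short at the cost of a weaker check; a longer prefix is a reasonable choice if tampering is a real concern.

```python
import base64
import hmac
import json

SECRET = b"demo-secret"  # assumption: placeholder key for the example only

def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), "sha256").hexdigest()[:8]

def encode_cursor(data: dict) -> str:
    payload = json.dumps(data)
    return base64.urlsafe_b64encode(f"{_sign(payload)}:{payload}".encode()).decode()

def decode_cursor(cursor: str) -> dict:
    """Verify the signature and return the cursor state, or raise ValueError."""
    try:
        raw = base64.urlsafe_b64decode(cursor.encode()).decode()
        signature, payload = raw.split(":", 1)
    except ValueError as exc:  # bad base64, bad UTF-8, or missing separator
        raise ValueError("malformed cursor") from exc
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        raise ValueError("cursor signature mismatch")
    return json.loads(payload)

token = encode_cursor({"id": 123})
print(decode_cursor(token))
```

A client that edits the payload (say, to point at a different `id`) cannot produce a matching signature without the key, so the server rejects the cursor instead of running a query the client made up.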

Error Handling

// Invalid cursor
{
  "error": "invalid_cursor",
  "message": "The provided cursor is invalid or expired",
  "hint": "Start from the beginning without a cursor"
}

Cursors can become invalid if:

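  • The row the cursor points to was deleted (only a problem if the server looks that row up instead of comparing against the values stored in the cursor)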
  • Cursor format changed
  • Cursor expired (if you enforce TTL)

Always handle this gracefully.
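
One way to handle it is to funnel every decoding failure into the single error body shown above. The sketch below assumes an unsigned base64-JSON cursor carrying an `id` field and a hypothetical `parse_cursor` helper. Returning HTTP 400 alongside the error body is an assumption, not something the format requires.

```python
import base64
import json

INVALID_CURSOR = {
    "error": "invalid_cursor",
    "message": "The provided cursor is invalid or expired",
    "hint": "Start from the beginning without a cursor",
}

def parse_cursor(cursor):
    """Return (state, error). A missing cursor is valid and means 'first page'."""
    if cursor is None:
        return None, None
    try:
        state = json.loads(base64.urlsafe_b64decode(cursor.encode()))
        if not isinstance(state, dict) or "id" not in state:
            raise ValueError("unexpected cursor shape")
    except ValueError:
        # Covers bad base64, bad UTF-8, bad JSON, and the shape check above.
        return None, INVALID_CURSOR
    return state, None

state, error = parse_cursor("eyJpZCI6MTIzfQ==")
print(state, error)
```

The handler can then return `error` with a 400 status when it is set, and never lets a decoding exception surface as a 500.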

Choosing the Right Pattern

Pattern | Best For                                        | Avoid When
--------|-------------------------------------------------|--------------------------------
Offset  | Admin UIs, small datasets, jump-to-page needed  | Large datasets, real-time data
Cursor  | Infinite scroll, large datasets, mobile apps    | Need to jump to arbitrary pages
Keyset  | Public APIs, bookmarkable URLs                  | Complex sorting, sensitive IDs

Hybrid approach: support both offset and cursor modes for flexibility:

GET /api/items?page=5&per_page=20      # Offset mode
GET /api/items?cursor=abc123&limit=20  # Cursor mode

Document the tradeoffs and let clients choose.
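
On the server, a hybrid endpoint mostly comes down to deciding which mode a request is in. A minimal sketch, assuming the `page`/`per_page` and `cursor`/`limit` parameter names and a hypothetical `choose_mode` helper:

```python
def choose_mode(params: dict) -> dict:
    """Pick a pagination mode from already-parsed query parameters."""
    if "cursor" in params and "page" in params:
        raise ValueError("use either cursor or page, not both")
    if "page" in params:
        page = int(params["page"])
        per_page = int(params.get("per_page", 20))
        return {"mode": "offset", "offset": (page - 1) * per_page, "limit": per_page}
    # No page parameter: cursor mode, where a missing cursor means the first page.
    return {
        "mode": "cursor",
        "cursor": params.get("cursor"),
        "limit": int(params.get("limit", 20)),
    }

print(choose_mode({"page": "5", "per_page": "20"}))
print(choose_mode({"cursor": "abc123", "limit": "20"}))
```

Rejecting requests that mix the two modes keeps the behavior predictable. Defaulting to cursor mode when neither parameter is present nudges new clients toward the cheaper path.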

Common Mistakes

  1. Using offset for large datasets: Performance cliff at scale
  2. Not indexing sort columns: Every query becomes a full scan
  3. Expensive COUNT(*) on every request: Cache or skip it
  4. Exposing internal IDs unnecessarily: Use opaque cursors
  5. No max page size: Clients requesting millions of rows
  6. Ignoring consistency: Offset pagination + real-time data = bugs

Pagination seems simple until you have a million rows and users complaining about slow pages. Choose cursor or keyset pagination by default, use offset only when you genuinely need random access, and always index your sort columns.