Optimizing MySQL Pagination for Millions of Rows: Use Seek Method Instead of OFFSET – ITFROMZERO

When I first deployed the users table to production, everything ran smoothly. But once the table crossed 10 million rows, the slow query log started catching SELECT statements with execution times of 3–5 seconds — and they were all LIMIT x OFFSET y pagination queries. That’s when I had to sit down and seriously investigate the problem.

Table of Contents

The Real Problem: The Larger the OFFSET, the Slower the Query

This is a query everyone has written at least once:

-- Page 1
SELECT id, username, email, created_at
FROM users
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

-- Page 500
SELECT id, username, email, created_at
FROM users
ORDER BY created_at DESC
LIMIT 20 OFFSET 9980;

Page 1 runs in under 10ms. Page 500 takes 2–3 seconds. Page 1000 takes 5–6 seconds. And it doesn’t stop there.

Here’s a benchmark I ran on a real 10 million row table with an index on created_at:

OFFSET 0         →   8ms
OFFSET 100,000   →  420ms
OFFSET 500,000   →  2.1s
OFFSET 1,000,000 →  4.3s

Root Cause Analysis: What MySQL Is Actually Doing Under the Hood

Run EXPLAIN on a query with a large OFFSET to see what’s really happening:

EXPLAIN SELECT id, username, email, created_at
FROM users
ORDER BY created_at DESC
LIMIT 20 OFFSET 500000;

MySQL does not jump directly to row 500,000. Instead, it must execute these steps sequentially:

Scan the created_at index from the beginning
Read 500,020 rows from disk or the buffer pool
Discard the first 500,000 rows
Return the remaining 20 rows

With OFFSET 500,000, you’re forcing MySQL to process half a million rows just to return 20. There’s no shortcut here — this is a design characteristic of standard SQL, not a MySQL bug.

There’s another issue. If the table uses a clustered index on id but sorts by created_at, MySQL must perform a double-lookup: read the secondary index → fetch the primary key → read the row again from the clustered index. Multiply that I/O by 500,000 — and you can see exactly why the query is so slow.

Solutions

Option 1: Late Row Lookup — Quick Win, No API Changes

The idea is simple: split the query into two steps. First, use a subquery to fetch only the id values (very lightweight since it only scans the index), then JOIN to retrieve the full columns:

SELECT u.id, u.username, u.email, u.created_at
FROM users u
INNER JOIN (
    SELECT id FROM users
    ORDER BY created_at DESC
    LIMIT 20 OFFSET 500000
) AS tmp ON u.id = tmp.id;

Instead of reading all the data columns (username, email, etc.) for 500,020 rows, MySQL only scans the index to retrieve 500,020 compact id values, then joins to fetch the actual data. This significantly reduces I/O.

Real-world result: the query above dropped from ~2.1s to ~600ms. An immediate improvement with no changes needed on the client side — you still use page numbers as before.

Option 2: Seek Method (Keyset Pagination) — The Definitive Fix

The Seek Method doesn’t “jump” to a position. It uses the values from the last row of the previous page as a cursor — then queries forward from exactly that point. The logic changes fundamentally:

-- First page (no cursor needed)
SELECT id, username, email, created_at
FROM users
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page: pass in the created_at and id of the LAST row from the previous page
-- Assume the last row has: created_at = '2024-03-15 10:30:00', id = 9876543
SELECT id, username, email, created_at
FROM users
WHERE (created_at < '2024-03-15 10:30:00')
   OR (created_at = '2024-03-15 10:30:00' AND id < 9876543)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Create a composite index so MySQL can perform an efficient range scan:

CREATE INDEX idx_users_created_id ON users (created_at DESC, id DESC);

Benchmark comparison on the same 10 million row table:

-- Before (OFFSET)
OFFSET 0         →  8ms
OFFSET 500,000   →  2.1s

-- After (Seek Method)
First page       →  8ms
Page 25,000      →  9ms  ← consistent regardless of page depth!

Why it’s so effective: instead of scanning from the beginning and discarding rows, MySQL uses the WHERE condition to perform an index range scan and reads exactly the 20 rows needed. Query time is nearly constant, independent of pagination depth.

Python Implementation

def get_users_page(db, cursor_created_at=None, cursor_id=None, limit=20):
    """
    cursor_created_at: datetime string of the last row from the previous page
    cursor_id: id of the last row from the previous page
    Returns: {"data": [...], "next_cursor": {...} or None}
    """
    if cursor_created_at is None:
        # First page
        query = """
            SELECT id, username, email, created_at
            FROM users
            ORDER BY created_at DESC, id DESC
            LIMIT %s
        """
        params = (limit,)
    else:
        query = """
            SELECT id, username, email, created_at
            FROM users
            WHERE (created_at < %s)
               OR (created_at = %s AND id < %s)
            ORDER BY created_at DESC, id DESC
            LIMIT %s
        """
        params = (cursor_created_at, cursor_created_at, cursor_id, limit)

    rows = db.execute(query, params)

    # Cursor for the last row — client uses this to fetch the next page
    next_cursor = None
    if rows:
        last = rows[-1]
        next_cursor = {"created_at": str(last["created_at"]), "id": last["id"]}

    return {"data": rows, "next_cursor": next_cursor}

Summary: What to Use and When

I’ve migrated pagination at least three times in production. Here’s what I’ve taken away from those experiences:

Plain OFFSET: Tables under 100,000 rows — no optimization needed, don’t over-engineer it.
Late Row Lookup: Need to keep numbered page navigation, don’t want to change your API contract, and want an immediate improvement today.
Seek Method: Building infinite scroll, a “Load More” button, or a cursor-based REST API. This is the best long-term choice for performance.

Common Pitfalls When Implementing the Seek Method

No arbitrary page jumps: You must navigate sequentially from page 1 → 2 → 3… You can’t skip directly from page 1 to page 50. This suits infinite scroll, not numbered pagination like “Page 1 2 3 … 50”.
The cursor must be unique: If sorting by a column with duplicate values, always include id in your composite cursor. Without this, you’ll miss rows.
More robust than OFFSET with inserts/deletes: OFFSET “drifts” when new rows are inserted between two requests. The Seek Method doesn’t have this problem — a significant advantage for real-time data.
Cache COUNT(*): On large InnoDB tables, SELECT COUNT(*) requires a full scan (unlike MyISAM). Cache this value with a 5–10 minute TTL instead of running it on every request.

-- Cache this result (TTL 5-10 minutes), do NOT re-run on every request
SELECT COUNT(*) FROM users WHERE status = 'active';

Conclusion

LIMIT x OFFSET y is convenient when you’re starting out, but it’s a genuine performance trap once your data grows. I learned this the hard way — after users reported that a list page would just keep loading, never finishing, in the middle of the night.

Designing a new feature? Use the Seek Method from day one and avoid accumulating technical debt. Already using OFFSET-based pagination? Migrate to Late Row Lookup first — it’s the least breaking change and delivers near-instant improvement — then gradually move to cursor-based pagination when you’re ready.