Implementing Rate Limiting for a Node.js REST API with Redis: Sliding Window and Token Bucket in Practice – ITFROMZERO

Last month, one of my API endpoints was hammered continuously: nearly 2,000 requests in 30 seconds from a single IP. The server did not go down right away, but response time shot up to 8 seconds and real users started hitting timeouts. That was when I realized that running without rate limiting leaves a serious hole open, one that takes no sophisticated exploit to abuse.

Three real-world scenarios that take an API down before you can react

Rate limiting is not only for fending off large-scale DDoS attacks. In practice, three scenarios are far more common:

Brute-force login: An attacker tries thousands of passwords against /api/auth/login. No botnet is needed, just a 20-line Python script
Data scraping: A bot crawls your entire product catalog in a few minutes, driving up your bandwidth and CPU usage
Unintentional abuse: A frontend developer on the team calls the API in an infinite loop because of a bug. I have seen this happen, and it is just as dangerous as a deliberate attack

Why in-memory rate limiting is not enough

The simplest starting point for most people is express-rate-limit with its default MemoryStore:

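const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // requests allowed per window, per client
  message: 'Too many requests, please try again later'
});

app.use('/api/', limiter);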

That is fine for a demo or a solo project, but in production it has two serious problems:

No shared state between processes: Running 4 workers with PM2? Each worker keeps its own counter, so the effective limit is 100 × 4 = 400 requests per window
Data is lost on restart: Every counter resets to 0 when the server restarts. A brute-force attack can exploit this deployment window
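
The multiplication effect in the first point is easy to reproduce without PM2. The sketch below is a simplified simulation, not express-rate-limit's actual code: each hypothetical worker keeps its own Map as a counter, and a round-robin loop stands in for the cluster's load balancing.

```javascript
// Simplified model: each worker process owns a private in-memory counter.
function createWorker(max) {
  const hits = new Map(); // client key -> request count in the current window
  return {
    handle(clientKey) {
      const count = (hits.get(clientKey) || 0) + 1;
      hits.set(clientKey, count);
      return count <= max; // true = allowed, false = would be a 429
    },
  };
}

// Send `total` requests from one client, spread round-robin across the workers.
function simulate(workerCount, max, total) {
  const workers = Array.from({ length: workerCount }, () => createWorker(max));
  let allowed = 0;
  for (let i = 0; i < total; i++) {
    if (workers[i % workerCount].handle('203.0.113.7')) allowed++;
  }
  return allowed;
}

const allowedSingle = simulate(1, 100, 1000);
const allowedCluster = simulate(4, 100, 1000);
console.log({ allowedSingle, allowedCluster });
```

With a shared store such as Redis, every worker would increment the same counter, so the limit would hold no matter how many processes are running.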

On my most recent web app project, with a team of 5 developers, I ran into exactly this problem when moving from a single process to cluster mode. It took 2 days of debugging to discover that the rate limit was being multiplied by the number of workers, and during that whole time the login endpoint had almost no protection. After that I switched entirely to Redis.

Two algorithms to know before you code

Sliding Window: the default choice for most endpoints

Instead of counting within fixed windows (0:00–1:00, 1:00–2:00…), a sliding window counts the requests made in the last N seconds, measured from the current moment. This rules out the "boundary attack": with a limit of 100 per minute, an attacker can no longer send 200 requests in 2 seconds by straddling the boundary between two windows.
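
To make the boundary problem concrete, here is a minimal in-memory sketch that compares a fixed-window counter with a sliding-window log for a single client. The function names and the 100-per-minute limit are assumptions chosen for illustration. Timestamps are passed in explicitly so the result does not depend on the real clock.

```javascript
// Fixed window: one counter per aligned window (0-59s, 60-119s, ...).
function createFixedWindow(limit, windowMs) {
  const counters = new Map(); // window index -> request count
  return (nowMs) => {
    const index = Math.floor(nowMs / windowMs);
    const count = (counters.get(index) || 0) + 1;
    counters.set(index, count);
    return count <= limit;
  };
}

// Sliding window log: remember the timestamps accepted within the last windowMs.
function createSlidingWindow(limit, windowMs) {
  let accepted = [];
  return (nowMs) => {
    accepted = accepted.filter((t) => t > nowMs - windowMs);
    if (accepted.length >= limit) return false;
    accepted.push(nowMs);
    return true;
  };
}

// 100 requests in the last second of one window, then 100 in the first second of the next.
function burstAtBoundary(allow) {
  let allowed = 0;
  for (let i = 0; i < 100; i++) if (allow(59000 + i * 10)) allowed++;
  for (let i = 0; i < 100; i++) if (allow(60000 + i * 10)) allowed++;
  return allowed;
}

const fixedAllowed = burstAtBoundary(createFixedWindow(100, 60000));
const slidingAllowed = burstAtBoundary(createSlidingWindow(100, 60000));
console.log({ fixedAllowed, slidingAllowed });
```

The fixed window lets both bursts through because they fall into different windows. The sliding log still sees the first 100 requests when the second burst arrives and rejects it.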

Token Bucket: when you need to allow burst traffic

Each user has a "bucket" of tokens. Every request consumes 1 token, and tokens are refilled at a fixed rate up to the bucket's capacity. The main advantage is that short bursts are allowed: a user accumulates tokens while using the API lightly and can then spend them in a short period. This suits file upload endpoints or public APIs with subscription tiers.
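
The refill-and-consume cycle can be sketched in memory before bringing Redis into it. The class below is an illustrative assumption, not a library API. It takes an injectable clock (in seconds) so the behavior can be checked deterministically.

```javascript
// In-memory token bucket with an injectable clock that returns seconds.
class TokenBucket {
  constructor(capacity, refillRate, now = () => Date.now() / 1000) {
    this.capacity = capacity;
    this.refillRate = refillRate; // tokens added per second
    this.now = now;
    this.tokens = capacity; // a new bucket starts full
    this.lastRefill = now();
  }

  tryConsume(requested = 1) {
    const current = this.now();
    const elapsed = Math.max(0, current - this.lastRefill);
    // Refill for the elapsed time, never exceeding capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = current;
    if (this.tokens < requested) return false;
    this.tokens -= requested;
    return true;
  }
}

let fakeNow = 0;
const bucket = new TokenBucket(10, 0.5, () => fakeNow);
const burst = Array.from({ length: 11 }, () => bucket.tryConsume());
fakeNow += 2; // 2 s at 0.5 token/s refills exactly 1 token
const afterWait = bucket.tryConsume();
const immediatelyAgain = bucket.tryConsume();
console.log({ burst, afterWait, immediatelyAgain });
```

The first 10 calls drain the bucket, the 11th is rejected, and after a 2-second wait exactly one more request gets through.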

Implementing Sliding Window with Redis

Install the dependencies:

npm install express ioredis express-rate-limit rate-limit-redis

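One caveat: as far as I know, rate-limit-redis keeps a single counter with a TTL per key, so each client's window opens at its first request and resets when the key expires. That avoids clock-aligned boundaries shared by all clients, but it is not a strict sliding-window log.

Set up the Redis client and the rate limiters: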

const express = require('express');
const rateLimit = require('express-rate-limit');
// Assumes rate-limit-redis v4+, which uses a named export; older releases export the class directly
const { RedisStore } = require('rate-limit-redis');
const Redis = require('ioredis');

const app = express();

const redisClient = new Redis({
  host: process.env.REDIS_HOST || 'localhost',
  port: Number(process.env.REDIS_PORT) || 6379, // env vars are strings
  password: process.env.REDIS_PASSWORD,
  enableOfflineQueue: false, // Fail fast instead of queueing commands while Redis is down
});

// General rate limiter: 60 requests per minute
const apiLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 60,
  standardHeaders: true,  // Send RateLimit-* headers
  legacyHeaders: false,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.call(...args),
  }),
  handler: (req, res) => {
    const resetTime = req.rateLimit.resetTime; // Date at which the window resets
    res.status(429).json({
      error: 'Too Many Requests',
      // Seconds left until the reset, not an absolute timestamp
      retryAfter: resetTime
        ? Math.max(0, Math.ceil((resetTime - Date.now()) / 1000))
        : undefined,
    });
  },
});

// Stricter limiter for auth: only 10 attempts per 15 minutes
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 10,
  skipSuccessfulRequests: true, // Do not count successful logins
  store: new RedisStore({
    sendCommand: (...args) => redisClient.call(...args),
    prefix: 'auth_rl:', // Separate prefix to avoid key collisions
  }),
});

app.use('/api/', apiLimiter);
app.use('/api/auth/login', authLimiter);
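
A 429 response is more useful when it tells the client how many seconds to wait, rather than giving an absolute timestamp. The helper below is a small sketch of that conversion. It assumes the reset time arrives as a Date (which, as I understand it, is how express-rate-limit exposes `req.rateLimit.resetTime`) or as epoch milliseconds. The name `secondsUntilReset` is made up for this example.

```javascript
// Seconds a client should wait, given the window's reset time (Date or epoch ms).
function secondsUntilReset(resetTime, nowMs = Date.now()) {
  if (resetTime == null) return null; // some stores may not report a reset time
  const resetMs = resetTime instanceof Date ? resetTime.getTime() : Number(resetTime);
  // Round up so the client never retries too early; never return a negative wait.
  return Math.max(0, Math.ceil((resetMs - nowMs) / 1000));
}

const now = Date.parse('2024-01-01T00:00:00Z');
const inHalfMinute = secondsUntilReset(new Date(now + 30500), now);
const alreadyPast = secondsUntilReset(new Date(now - 5000), now);
const unknown = secondsUntilReset(undefined, now);
console.log({ inHalfMinute, alreadyPast, unknown });
```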

Implementing Token Bucket with a Redis Lua script

Token Bucket needs its Redis read-modify-write cycle to be atomic. Redis runs a Lua script atomically, with no other command interleaved, so concurrent requests cannot race on the same bucket:

const TOKEN_BUCKET_SCRIPT = `
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])  -- tokens per second
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now

local elapsed = math.max(0, now - last_refill)  -- never negative, even if app server clocks differ
local new_tokens = math.min(capacity, tokens + elapsed * refill_rate)

if new_tokens >= requested then
  redis.call('HMSET', key, 'tokens', new_tokens - requested, 'last_refill', now)
  redis.call('EXPIRE', key, math.ceil(capacity / refill_rate) + 1)
  return 1
else
  redis.call('HMSET', key, 'tokens', new_tokens, 'last_refill', now)
  redis.call('EXPIRE', key, math.ceil(capacity / refill_rate) + 1)
  return 0
end
`;

function tokenBucketMiddleware(capacity, refillRate) {
  return async (req, res, next) => {
    const key = `tb:${req.user?.id || req.ip}`;
    const now = Date.now() / 1000;

    try {
      const allowed = await redisClient.eval(
        TOKEN_BUCKET_SCRIPT,
        1, key, capacity, refillRate, now, 1
      );

      if (allowed === 1) {
        next();
      } else {
        res.status(429).json({ error: 'Rate limit exceeded' });
      }
    } catch (err) {
      // Redis error: fail open so real users are not blocked
      console.error('Token bucket error:', err);
      next();
    }
  };
}

// Upload: bucket holds up to 10 tokens, refilled at 0.1 token/second (= 6 requests/minute)
app.use('/api/upload', tokenBucketMiddleware(10, 0.1));
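
When picking `capacity` and `refillRate`, it helps to work out what a single client can actually get through. The two helpers below are a back-of-the-envelope sketch with made-up names, not part of the middleware above. One gives the upper bound on requests over a period starting from a full bucket. The other gives the wait time for a bucket that is short on tokens.

```javascript
// Upper bound on requests one client can make in `seconds`, starting from a
// full bucket: the stored burst plus everything refilled in the meantime.
function maxRequestsWithin(capacity, refillRate, seconds) {
  return Math.floor(capacity + refillRate * seconds);
}

// Seconds until `requested` tokens are available, given the current balance.
function waitForTokens(tokens, requested, refillRate) {
  if (tokens >= requested) return 0;
  return Math.ceil((requested - tokens) / refillRate);
}

// The upload configuration: capacity 10, refill 0.1 token/second.
const firstMinute = maxRequestsWithin(10, 0.1, 60);
const waitWhenEmpty = waitForTokens(0, 1, 0.1);
const waitWhenFull = waitForTokens(10, 1, 0.1);
console.log({ firstMinute, waitWhenEmpty, waitWhenFull });
```

For the upload endpoint, a client can send up to 16 requests in its first minute (10 stored plus 6 refilled). After that it settles at the sustained 6 per minute, and an empty bucket needs 10 seconds before the next upload.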

Rate limiting by user ID instead of IP

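IP-based rate limits are easy to bypass with a VPN or proxy. For an API with authentication, combine both, and mount the limiter after the authentication middleware so that req.user is already populated: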

const userAwareLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  keyGenerator: (req) => {
    // Logged in: use the user ID, which the client cannot change at will
    // Anonymous: fall back to the IP
    return req.user?.id ? `user:${req.user.id}` : `ip:${req.ip}`;
  },
  store: new RedisStore({
    sendCommand: (...args) => redisClient.call(...args),
  }),
});
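
The same idea extends to subscription tiers: besides choosing the key, the limiter can choose the limit per caller. The sketch below is an assumption-heavy illustration. The plan names, the numbers, and the `req.user.plan` field are hypothetical. Only the key format mirrors the `keyGenerator` above.

```javascript
// Hypothetical plan names and limits, for illustration only.
const PLAN_LIMITS = new Map([
  ['free', 60],
  ['pro', 600],
]);
const ANONYMOUS_LIMIT = 30;

// Key by user ID when authenticated, otherwise fall back to the client IP.
function rateLimitKey(req) {
  return req.user?.id ? `user:${req.user.id}` : `ip:${req.ip}`;
}

// Requests per window for this caller; unknown plans get the free-tier limit.
function rateLimitMax(req) {
  if (!req.user?.id) return ANONYMOUS_LIMIT;
  return PLAN_LIMITS.get(req.user.plan) ?? PLAN_LIMITS.get('free');
}

const proUser = { user: { id: 42, plan: 'pro' }, ip: '198.51.100.9' };
const guest = { ip: '198.51.100.9' };
console.log(rateLimitKey(proUser), rateLimitMax(proUser));
console.log(rateLimitKey(guest), rateLimitMax(guest));
```

express-rate-limit accepts a function for `max` as well as for `keyGenerator`, so both helpers could be passed to `rateLimit()` directly.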

Debugging and monitoring directly in Redis

# List all rate limit keys (SCAN-based, so it does not block Redis the way KEYS can)
redis-cli --scan --pattern "rl:*"

# Current counter for one IP
redis-cli GET "rl:192.168.1.100"

# Time left until the window resets
redis-cli TTL "rl:192.168.1.100"

# Clear the rate limit for one IP (when you need to unblock it temporarily)
redis-cli DEL "rl:192.168.1.100"

Log every blocked request so you can analyze attack patterns:

app.use((req, res, next) => {
  res.on('finish', () => {
    if (res.statusCode === 429) {
      console.warn(`[RATE_LIMIT] ${req.ip} → ${req.method} ${req.path} | User: ${req.user?.id || 'anonymous'}`);
    }
  });
  next();
});
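
Once those lines are in your logs, a few lines of code are enough to see who is being blocked and where. The sketch below parses the `[RATE_LIMIT]` format produced by the middleware above and counts blocked requests per IP. The function name and sample lines are invented for the example.

```javascript
// Matches "[RATE_LIMIT] <ip> → <METHOD> <path> | User: <id>".
const LINE = /^\[RATE_LIMIT\] (\S+) → (\S+) (\S+) \| User: (\S+)$/;

function summarizeBlocked(lines) {
  const byIp = new Map();
  for (const line of lines) {
    const match = LINE.exec(line.trim());
    if (!match) continue; // ignore unrelated log lines
    const [, ip, , path] = match;
    const entry = byIp.get(ip) || { total: 0, paths: new Set() };
    entry.total += 1;
    entry.paths.add(path);
    byIp.set(ip, entry);
  }
  // Most-blocked clients first.
  return [...byIp.entries()]
    .map(([ip, { total, paths }]) => ({ ip, total, paths: [...paths] }))
    .sort((a, b) => b.total - a.total);
}

const sample = [
  '[RATE_LIMIT] 198.51.100.9 → POST /api/auth/login | User: anonymous',
  '[RATE_LIMIT] 198.51.100.9 → POST /api/auth/login | User: anonymous',
  '[RATE_LIMIT] 203.0.113.7 → GET /api/products | User: 42',
  'GET /healthz 200',
];
const report = summarizeBlocked(sample);
console.log(report);
```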

Checklist before deploying to production

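Does the Redis client set enableOfflineQueue: false? If Redis is down, the middleware must call next() instead of leaving the request hanging (for express-rate-limit, check the passOnStoreError option, since a store error is otherwise passed on as an error)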
Does the /auth/login endpoint have its own limit, stricter than the regular API?
Is standardHeaders: true set so that clients know how many requests remain and when the window resets?
Are the load balancer or internal health checks affected by the rate limit? If so, whitelist the internal IPs
Do the rate limit keys have clear prefixes? Avoid collisions between different endpoints
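
For the internal whitelist item, express-rate-limit offers a `skip` option: a function that returns true for requests that should bypass the limiter. The sketch below shows one way such a function could look. The address lists are placeholders to replace with your own network ranges, and plain prefix matching is a deliberately crude stand-in for real CIDR checks.

```javascript
// Hypothetical allow-list: loopback plus an example private range for internal probes.
const INTERNAL_PREFIXES = ['127.', '10.0.0.'];
const INTERNAL_EXACT = new Set(['::1']);

function isInternalIp(ip) {
  if (!ip) return false;
  // Express may report IPv4 clients as IPv4-mapped IPv6, e.g. "::ffff:10.0.0.5".
  const normalized = ip.startsWith('::ffff:') ? ip.slice(7) : ip;
  return (
    INTERNAL_EXACT.has(normalized) ||
    INTERNAL_PREFIXES.some((prefix) => normalized.startsWith(prefix))
  );
}

// Same shape as the `skip` option: return true to bypass the limiter.
const skipInternal = (req) => isInternalIp(req.ip);

console.log(skipInternal({ ip: '::ffff:10.0.0.5' }), skipInternal({ ip: '203.0.113.7' }));
```

This only makes sense if `req.ip` is trustworthy. Behind a reverse proxy, that depends on Express's `trust proxy` setting being configured for your topology.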

One thing I learned after a few bad deploys: do not set limits too tight from the start. Run with generous limits, watch the logs for a week to see your users' real traffic patterns, and only then adjust. Limits that are too tight from day one will block real users too, and you will spend your evening handling complaints instead of doing something else.
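
One way to turn a week of observations into a number is to take a high percentile of per-client usage and add headroom. The sketch below assumes you have already extracted requests-per-minute counts per client from your logs. The sample data, the percentile, and the 1.5× headroom factor are arbitrary illustrations, not recommendations.

```javascript
// Nearest-rank percentile over per-client, per-minute request counts.
function percentile(values, p) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Suggest a limit: a high percentile of observed usage plus some headroom.
function suggestLimit(countsPerMinute, p = 99, headroom = 1.5) {
  return Math.ceil(percentile(countsPerMinute, p) * headroom);
}

// Hypothetical observations: requests per minute seen for individual clients.
const observed = [3, 5, 8, 12, 7, 4, 20, 6, 9, 40];
const p90 = percentile(observed, 90);
const suggested = suggestLimit(observed, 90, 1.5);
console.log({ p90, suggested });
```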