Scaling PostgreSQL ‘Unlimited’ with Citus: From 1 Billion to Tens of Billions of Records – ITFROMZERO

Table of Contents

When is one PostgreSQL server not enough?

I once directly managed an ad tracking log system with staggering growth. Every day, the database had to ingest about 20-30 million new records. Initially, the team chose Vertical Scaling: upgrading RAM to 128GB and using the best NVMe SSDs available at the time. However, when the events table hit the 2 billion row mark, things started to get ugly. Simple COUNT statements or report JOINs took several minutes to complete, leaving the CPU constantly pegged at 100%.

In reality, hardware upgrades always have a limit, and costs increase exponentially. The most viable option at that point is Horizontal Sharding. Instead of cramming everything into one massive server, I split the data across 5, 10, or 20 smaller servers. Citus is the most powerful tool to turn PostgreSQL into a distributed database without forcing you to rewrite your entire application logic.

What is Citus and why is it so highly trusted?

Citus is not a fork of PostgreSQL. It is actually an extension. The key value here is that you get to keep the entire Postgres ecosystem: from JSONB and Full-text search to other extensions like PostGIS or TimescaleDB.

The Citus model operates based on two main components:

Coordinator Node: Acts as the “brain.” It stores metadata, receives queries from the application, analyzes them, and coordinates commands to the Workers.
Worker Nodes: The actual “workers.” This is where data shards are stored and parallel computations are performed.

When you send a query, the Coordinator breaks it down and pushes it to the Workers to run simultaneously. The results are then aggregated and returned to you. The experience feels like using a single server, but the actual power is the combined strength of the entire cluster.

Quickly setting up a Citus Cluster with Docker

To get started right away, you can set up a Citus cluster with 1 Coordinator and 2 Workers via Docker Compose. This is the fastest way to test before considering production deployment.

# Sample docker-compose.yml file
version: '3'
services:
  db_master:
    image: citusdata/citus:12.1
    ports: ["5432:5432"]
    environment: &id001
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: secret

  worker_1:
    image: citusdata/citus:12.1
    environment: *id001
    depends_on: [db_master]

  worker_2:
    image: citusdata/citus:12.1
    environment: *id001
    depends_on: [db_master]

After launching with the docker-compose up -d command, access the Coordinator node to register the Workers into the system:

-- Connect to the Coordinator and add nodes
SELECT citus_add_node('worker_1', 5432);
SELECT citus_add_node('worker_2', 5432);

-- Check the status of the nodes
SELECT * FROM citus_get_active_worker_nodes();

Sharding Strategy: Choosing the Right Key for Speed

The most important decision when using Citus is choosing the Distribution Column (Shard Key). If you choose poorly, you’ll end up with data skew, leaving one server overloaded while others sit idle.

For the user_activities table, I usually choose user_id as the shard key. This is because most real-world queries filter by a specific user. This way, all of a user’s data resides on the same worker, allowing JOINs or Aggregations to happen extremely fast without network communication between nodes.

-- Create the table on the Coordinator
CREATE TABLE user_activities (
    id bigserial,
    user_id int NOT NULL,
    action text,
    created_at timestamp DEFAULT now(),
    PRIMARY KEY (user_id, id) -- The shard key must be part of the Primary Key
);

-- Enable table distribution
SELECT create_distributed_table('user_activities', 'user_id');

The system will automatically split the table into 32 shards (by default) and distribute them evenly across the available Workers.

Pro Tip: Leverage Reference Tables

Databases always have lookup tables that rarely change, like categories or countries. If you shard these as well, JOINing data across Workers will be extremely resource-intensive for the network.

The optimal solution is to use a Reference Table. Citus will copy this entire table to ALL workers to facilitate local JOINs.

CREATE TABLE categories (id int PRIMARY KEY, name text);

-- Replicate this table to every node in the cluster
SELECT create_reference_table('categories');

Monitoring and Debugging

To see where the data actually resides, you can inspect Citus metadata with the following command:

SELECT shardid, shardsize, nodename 
FROM citus_shards 
WHERE table_name::text = 'user_activities';

A feature I really like is the EXPLAIN ANALYZE capability. When running this on the Coordinator, you’ll see details for each task sent to which Worker and how many milliseconds it took to execute.

EXPLAIN (VERBOSE ON) 
SELECT count(*) FROM user_activities WHERE user_id = 1001;

If the result shows Custom Scan (Citus Adaptive) and targets only a single node, congratulations, you’ve chosen a perfect Shard Key!

Three “Hard-Learned” Lessons from Real-World Operations

After a long time “battling” with Citus in production, I’ve gathered 3 important notes:

Unique Constraints: Citus only guarantees uniqueness if the shard key is part of that constraint. You cannot create a Unique Index on email if you are sharding by user_id.
Distributed Transactions: While Citus supports them well, long-running transactions touching multiple shards are prone to deadlocks. Try to design queries that stay within a single shard key.
Hardware Homogeneity: Never mix servers with different performance levels. The cluster’s speed will be dragged down by the slowest node (Straggler effect).

Switching to a distributed architecture isn’t technically difficult; the challenge lies in changing your schema design mindset. If you master the Shard Key, managing tens of terabytes of data on Postgres becomes easier than ever.

Are you struggling with slow Postgres performance? Leave a comment below, and let’s find a solution together!