QuestDB: Handling 4 Million Records/Sec and Ending Midnight Database Crash Anxiety

Database tutorial - IT technology blog
Database tutorial - IT technology blog

The 2 AM Savior: When PostgreSQL and MySQL “Surrender” to IoT Data

A phone vibrating violently at 2 AM is every operations engineer’s nightmare. Grafana dashboards go blank, server CPU hits 100%, and IOPS max out. The monitoring system is officially paralyzed.

I learned this the hard way during a factory sensor monitoring project. Initially, the team chose PostgreSQL due to familiarity. However, as the scale grew to 5,000 sensors pushing over 100,000 records per second, PostgreSQL began to choke. Indexes ballooned to hundreds of GBs, INSERT statements hit bottlenecks, and time-range queries took minutes to respond.

In reality, MySQL or MongoDB are excellent for standard operations. But for Time-series data requiring write speeds of millions of rows per second, they are no match for QuestDB. QuestDB is aggressively optimized: it uses columnar storage, leverages CPU SIMD instructions, and supports standard SQL.

Quick Start: Running QuestDB in 5 Minutes

Don’t waste time on tedious configuration steps. The fastest way to deploy is using Docker. QuestDB is extremely lightweight and doesn’t pull in dozens of dependencies.

docker run -p 9000:9000 -p 8812:8812 -p 9009:9009 -p 9003:9003 \
  questdb/questdb

Here are the ports you should keep in mind:

  • 9000: Web Console. Access localhost:9000 to manage the database via a visual interface.
  • 8812: Postgres Wire Protocol. This port allows you to use PostgreSQL libraries to connect.
  • 9009: InfluxDB Line Protocol. This is the “expressway” for pushing data at maximum speed.

Why is QuestDB Ridiculously Fast?

The difference lies in the storage engine. Traditional databases like MySQL use row-based storage. To calculate the average temperature of 1 billion records, Postgres must scan through all unrelated columns like sensor_id or location. This causes massive resource waste.

QuestDB takes a different path. It uses columnar storage. When calculating temperature, the system only reads that specific column from the disk. Combined with time-partitioning, QuestDB completely eliminates redundant data scans.

In a real-world test with 100 million rows of data, an AVG calculation on QuestDB completed in just a few hundred milliseconds. This is 15 times faster than PostgreSQL without any complex tuning.

Ingesting Data: Prioritize InfluxDB Line Protocol (ILP)

While QuestDB supports SQL INSERT, it’s not the optimal way for IoT. To achieve peak performance, you should use InfluxDB Line Protocol (ILP). This protocol pushes data directly into RAM before flushing it to disk periodically.

The Python code below simulates pushing sensor data via a socket:

import socket
import time

HOST, PORT = 'localhost', 9009

def send_sensor_data():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((HOST, PORT))
        while True:
            # Structure: table_name,tags fields timestamp
            data = f"sensors,sensor_id=S001 temperature=25.5,humidity=60.2 {time.time_ns()}\n"
            s.sendall(data.encode())
            time.sleep(0.01) # Send 100 records per second

send_sensor_data()

A huge plus is that QuestDB is schema-less. It automatically creates tables and adds new columns as soon as it receives data for the first time, saving you database design time.

Smart Queries with Extended SQL

QuestDB doesn’t force you to learn a new language like Flux. It uses standard SQL but adds “worth-every-penny” keywords for Time-series.

1. SAMPLE BY: Time grouping made easy

Forget complex date-time functions in GROUP BY. With QuestDB, you only need one line:

SELECT timestamp, avg(temperature)
FROM sensors
SAMPLE BY 1h; -- Automatically group averages by hour

2. LATEST BY: Get the latest state instantly

To find the current status of all sensors, traditional SQL requires heavy JOIN and MAX(timestamp) operations. QuestDB handles this in a flash:

SELECT * FROM sensors LATEST BY sensor_id;

3. ASOF JOIN: Join out-of-sync data

This is a powerful feature for finance and IoT. When the prices and trades tables don’t match exactly by millisecond, ASOF JOIN automatically picks the nearest previous value to match the record.

Hard-Learned Lessons from Real-World Deployment

Despite its power, QuestDB has its own rules that you must follow to avoid system hangs:

  • Partitioning is mandatory: Always declare PARTITION BY DAY or MONTH when creating tables. This allows data retention (deleting old data) to happen instantly without locking the table.
  • Manage Out-of-order data: QuestDB performs best when data is sent in chronological order. If data lags significantly, adjust the cairo.max.uncommitted.rows parameter to optimize the buffer.
  • Hardware: Invest in SSDs or NVMe. QuestDB performs a lot of sequential writes; disk speed is the key factor for throughput.
  • Limit SELECT *: For tables with billions of rows, running SELECT * without a LIMIT on the Web Console will freeze your browser due to data display overhead.

Summary

QuestDB wasn’t born to completely replace PostgreSQL for financial transactions requiring strict ACID compliance. However, if you’re struggling with infrastructure monitoring or tracking millions of IoT devices, this is the lifesaver. Instead of staying up all night optimizing indexes, let QuestDB handle the heavy lifting so you can sleep better.

Share: