Multi-tenant Architecture in MySQL: From Hard-Learned Lessons to Scaling SaaS for Thousands of Users

MySQL tutorial - IT technology blog
MySQL tutorial - IT technology blog

The Challenge of Data Isolation in SaaS

Back when I worked on my first SaaS project for sales management, I had an unforgettable experience. To deploy quickly, I shoved all the data for every store into a single database. Tables were distinguished by a tenant_id column.

Everything ran smoothly until the system hit the 500-customer mark. One day, a teammate forgot a WHERE tenant_id = ? condition while running an UPDATE command on product prices. In just one second, all of Customer A’s orders were changed to Customer B’s name. I had to stay up all night to restore 200GB of data. The lesson learned: If you don’t choose the right Multi-tenant architecture from the start, you’re building on quicksand.

Why is Multi-tenancy Such a Headache?

In the SaaS world, each customer is a “Tenant”. The biggest challenge isn’t just preventing Tenant A from peeking at Tenant B’s data. You also have to balance three factors: absolute security, maintainability, and operational costs (CPU, RAM).

Experience shows there is no perfect architecture for every case. Whether you’re building an app for 10 large enterprise clients or 10,000 small shops, your approach will differ significantly.

3 Practical Data Isolation Methods in MySQL

1. Database-per-Tenant (One database per customer)

This method prioritizes isolation. Every time a new customer signs up, the system automatically executes a CREATE DATABASE db_tenant_id command.

  • Pros: Highest level of security. You can perform individual backups for each customer upon request. If Tenant A’s database crashes, Tenant B remains unaffected.
  • Cons: Extremely resource-heavy. MySQL often faces issues with buffer pool management (InnoDB Buffer Pool) and file descriptor limits when the number of databases exceeds 1,000. Running migrations across 1,000 databases simultaneously is also a major time-consuming challenge.

2. Schema-per-Tenant (Shared Instance, Different Schemas)

In MySQL, DATABASE and SCHEMA are practically the same thing. However, in systems like PostgreSQL, they are more distinctly separated. In MySQL, this method is often implemented by using a single Database Instance but dividing data using Table Prefixes or logical clusters.

3. Shared Database, Shared Schema (Everything Shared)

Startups often choose this approach to save on server costs. All tenants share the same set of tables, distinguished by an identifier column.

CREATE TABLE orders (
    id INT AUTO_INCREMENT PRIMARY KEY,
    tenant_id INT NOT NULL,
    customer_name VARCHAR(255),
    total_amount DECIMAL(10, 2),
    INDEX (tenant_id) -- An index is mandatory to avoid full table scans
);

Performance Note: When the orders table exceeds 20-30 million rows, query speeds will noticeably drop. At this point, a single index on tenant_id isn’t enough. You’ll need Composite Indexes for optimization.

Solutions to Prevent ‘Forgetting’ the WHERE Clause

To avoid the silly mistake I made, you can apply these two techniques:

Technique 1: Using MySQL Views

You create a separate View for each customer. The backend code connects directly to the View instead of the base table. This is secure but can lead to a bloated list of Views in the database, making manual management difficult.

Technique 2: Simulating Row-Level Security (RLS) with Session Variables

While MySQL doesn’t have native RLS like Postgres, we can “bypass” this by setting a session variable as soon as the connection opens:

SET @current_tenant_id = 101;

Then, query through a common View with automatic filtering logic:

CREATE VIEW v_tenant_orders AS
SELECT * FROM orders WHERE tenant_id = @current_tenant_id;

This approach keeps the backend code extremely clean. You just call SELECT * FROM v_tenant_orders without ever worrying about cross-tenant data leakage.

Optimization Tips for Growing Systems

As data grows, the Noisy Neighbor phenomenon (where one customer uses too many resources, affecting others) will appear. Here’s how I handle it:

  • Composite Index: Always prioritize tenant_id at the start of a composite index, for example: (tenant_id, status, created_at).
  • Table Partitioning: Use PARTITION BY LIST(tenant_id). MySQL will only scan the specific partition containing that tenant’s data instead of scanning a whole 50GB table.
  • Database Sharding: When a server consistently hits 80% CPU, split tenants across different physical servers. Group 1-500 on Server A, group 501-1000 on Server B.

Which Path Should You Choose for Your Project?

Based on my experience, consider these two scenarios:

  1. Choose Shared Schema (Method 3): If you are building a general-purpose app, need to optimize costs, and want to scale quickly. Use ORM libraries that support Global Scopes to automatically inject the tenant_id into queries.
  2. Choose Database-per-Tenant (Method 1): If your clients are in the banking or government sectors. They require data to reside on separate storage to meet strict security standards.

I often prefer a Hybrid solution. Free trial users stay in one large shared database. High-paying VIP customers are “moved” to their own dedicated database. This balances costs while retaining your most important clients.

Designing a database for SaaS is a long journey. Don’t wait until your data is a mess to look for a solution. Hopefully, these insights help you avoid unnecessary pitfalls!

Share: