MySQL Document Store: Use MySQL as NoSQL with X DevAPI — No Need to Switch Databases

MySQL tutorial - IT technology blog
MySQL tutorial - IT technology blog

The Problem: Rigid Schemas When Data Structure Keeps Changing

I ran into this while building a product attributes feature for an e-commerce app. Each product type has a completely different attribute structure — laptops have RAM/CPU/SSD, clothing has size/color/material, food has expiration dates and ingredients. The traditional approach is either an EAV (Entity-Attribute-Value) table with key/value pairs, or a separate table for each product type.

Neither option was great. EAV tables mean complex queries, multi-level JOINs, and painful debugging. Separate tables mean writing a new migration every time you add a product type — which gets exhausting for a small team without a dedicated DBA.

Someone on the team suggested migrating that part to MongoDB. It sounded reasonable, but the entire system was already running on MySQL — transactions, foreign keys, and reporting all depended on it. Adding another database engine to the stack means unnecessary complexity: two places to monitor, two places to back up, and on-call engineers need to know both systems.

The Root Cause: Not Knowing MySQL Already Had This Feature

The truth is, it’s not that MySQL couldn’t do it — I just didn’t know the feature existed. MySQL 5.7 already introduced a native JSON type. And starting with MySQL 8.0, they added MySQL Document Store combined with X DevAPI directly into the core.

MySQL Document Store lets you create Collections — similar to MongoDB collections — that store JSON documents without needing to define a schema upfront. Under the hood it’s still InnoDB, still fully ACID-compliant, but the query interface adds a document-oriented API. X DevAPI runs on port 33060 (alongside the standard port 3306) and supports both traditional SQL and document-style CRUD within the same session.

Approaches to the Problem

Option 1: JSON Column in a Regular SQL Table

The simplest approach — add an attributes JSON column to the products table and query it with JSON_EXTRACT() or the ->> operator. This works fine for basic JSON reads and writes. But once you need to index multiple nested fields, or when your backend code wants a document-style API instead of string-concatenated SQL, it starts getting unwieldy.

Option 2: MySQL Document Store with X DevAPI

First, install MySQL Shell — the client that supports the X Protocol:

# Ubuntu/Debian
sudo apt install mysql-shell

# Or install the Python connector with X DevAPI support
pip install mysql-connector-python

Connect using the X Protocol (port 33060) and create a collection:

# Connect to MySQL Shell using X Protocol
mysqlsh root@localhost:33060 --js
// Inside MySQL Shell (JavaScript mode)
var db = session.createSchema('product_catalog');
var products = db.createCollection('products');

// Add a document — no schema definition required
products.add({
  name: "Laptop Dell XPS 15",
  category: "laptop",
  attributes: {
    cpu: "Intel Core i7-13700H",
    ram: "32GB DDR5",
    storage: "1TB NVMe SSD"
  },
  price: 35000000,
  in_stock: true
}).execute();

// A completely different document type — different schema, no errors
products.add({
  name: "Basic T-Shirt",
  category: "clothing",
  attributes: {
    sizes: ["S", "M", "L", "XL"],
    colors: ["black", "white", "gray"],
    material: "100% cotton"
  },
  price: 250000,
  in_stock: true
}).execute();

// Query documents
products.find("category = 'laptop' AND price > 30000000")
  .fields("name", "price")
  .order_by("price DESC")
  .execute();

// Search by nested field
products.find("attributes.ram = '32GB DDR5'").execute();

Using it from a real Python backend:

import mysqlx

# Connect via X DevAPI
session = mysqlx.get_session({
    'host': 'localhost',
    'port': 33060,
    'user': 'root',
    'password': 'your_password'
})

db = session.get_schema('product_catalog')
products = db.get_collection('products')

# Add a document
result = products.add({
    "name": "MacBook Pro M3",
    "category": "laptop",
    "attributes": {
        "chip": "Apple M3 Pro",
        "ram": "18GB Unified Memory",
        "storage": "512GB SSD"
    },
    "price": 55000000
}).execute()

print(f"Added ID: {result.get_generated_ids()}")

# Query with filter and sort
docs = products.find("in_stock = true AND category = :cat") \
    .bind('cat', 'laptop') \
    .fields("name", "price", "attributes") \
    .order_by("price DESC") \
    .limit(10) \
    .execute()

for doc in docs.fetch_all():
    print(f"{doc['name']} — {doc['price']:,} VND")

session.close()

Creating Indexes to Avoid Slow Queries

I learned this lesson the hard way when a users table grew past 10 million rows — slow queries started appearing and I had to painstakingly tune the indexes. Document Store runs into the same problem if you don’t set up proper indexes from the start:

// Index for the category field (heavily used in WHERE clauses)
products.createIndex('category_idx', {
  fields: [{
    field: '$.category',
    type: 'TEXT(50)'
  }]
});

// Index for the numeric price field
products.createIndex('price_idx', {
  fields: [{
    field: '$.price',
    type: 'DECIMAL(15,2)'
  }]
});

// Verify the indexes were created
session.sql('SHOW INDEX FROM product_catalog.products').execute();

The Best Approach: Combining Relational and Document in the Same Database

After trying different approaches, I found that the most effective solution isn’t choosing one or the other — it’s using both within the same MySQL instance.

Split your data based on its characteristics:

  • Regular SQL tables: orders, users, payments — fixed structure, needs JOINs, needs full transactional integrity.
  • Document Collections: products, product_reviews, user_preferences — schema varies by type, no complex JOINs needed.

Within the same MySQL Shell session, mixing SQL and the Document API is completely seamless:

// SQL for orders (relational — needs JOINs)
session.sql(`
  SELECT o.id, o.total, u.email
  FROM orders o
  JOIN users u ON o.user_id = u.id
  WHERE o.status = 'pending'
`).execute();

// Document API for product catalog (flexible schema)
var catalog = session.getSchema('product_catalog');
catalog.getCollection('products')
  .find("in_stock = true")
  .limit(20)
  .execute();

Practical Notes to Keep in Mind

  • Port 33060: X Protocol uses a dedicated port. Verify it with SHOW VARIABLES LIKE 'mysqlx_port'. Remember to open your firewall if the app server is on a different machine.
  • Document ID: MySQL auto-generates a _id as a 28-character string (not an ObjectId like MongoDB). You can set your own ID when adding a document if you prefer.
  • Transactions: X DevAPI supports full transactions — session.startTransaction()session.commit(). This is a significant advantage over MongoDB when you need ACID across multiple collections.
  • Backup/monitoring: Shared with your existing MySQL setup — mysqldump, xtrabackup, and the Prometheus MySQL exporter all work as normal, no additional configuration needed.

What I found most convenient after switching to this approach: no extra infrastructure to maintain, no new query language to learn from scratch, and the team already knew MySQL — they just needed to pick up the X DevAPI syntax. As the data grows and optimization becomes necessary, you’re working with familiar tools instead of debugging two separate systems side by side.

Share: