The Problem: Rigid Schemas When Data Structure Keeps Changing
I ran into this while building a product attributes feature for an e-commerce app. Each product type has a completely different attribute structure — laptops have RAM/CPU/SSD, clothing has size/color/material, food has expiration dates and ingredients. The traditional approach is either an EAV (Entity-Attribute-Value) table with key/value pairs, or a separate table for each product type.
Neither option was great. EAV tables mean complex queries, multi-level JOINs, and painful debugging. Separate tables mean writing a new migration every time you add a product type — which gets exhausting for a small team without a dedicated DBA.
Someone on the team suggested migrating that part to MongoDB. It sounded reasonable, but the entire system was already running on MySQL — transactions, foreign keys, and reporting all depended on it. Adding another database engine to the stack means unnecessary complexity: two places to monitor, two places to back up, and on-call engineers need to know both systems.
The Root Cause: Not Knowing MySQL Already Had This Feature
The truth is, it’s not that MySQL couldn’t do it — I just didn’t know the feature existed. MySQL 5.7 already introduced a native JSON type. And starting with MySQL 8.0, they added MySQL Document Store combined with X DevAPI directly into the core.
MySQL Document Store lets you create Collections — similar to MongoDB collections — that store JSON documents without needing to define a schema upfront. Under the hood it’s still InnoDB, still fully ACID-compliant, but the query interface adds a document-oriented API. X DevAPI runs on port 33060 (alongside the standard port 3306) and supports both traditional SQL and document-style CRUD within the same session.
Approaches to the Problem
Option 1: JSON Column in a Regular SQL Table
The simplest approach — add an attributes JSON column to the products table and query it with JSON_EXTRACT() or the ->> operator. This works fine for basic JSON reads and writes. But once you need to index multiple nested fields, or when your backend code wants a document-style API instead of string-concatenated SQL, it starts getting unwieldy.
Option 2: MySQL Document Store with X DevAPI
First, install MySQL Shell — the client that supports the X Protocol:
# Ubuntu/Debian
sudo apt install mysql-shell
# Or install the Python connector with X DevAPI support
pip install mysql-connector-python
Connect using the X Protocol (port 33060) and create a collection:
# Connect to MySQL Shell using X Protocol
mysqlsh root@localhost:33060 --js
// Inside MySQL Shell (JavaScript mode)
var db = session.createSchema('product_catalog');
var products = db.createCollection('products');
// Add a document — no schema definition required
products.add({
name: "Laptop Dell XPS 15",
category: "laptop",
attributes: {
cpu: "Intel Core i7-13700H",
ram: "32GB DDR5",
storage: "1TB NVMe SSD"
},
price: 35000000,
in_stock: true
}).execute();
// A completely different document type — different schema, no errors
products.add({
name: "Basic T-Shirt",
category: "clothing",
attributes: {
sizes: ["S", "M", "L", "XL"],
colors: ["black", "white", "gray"],
material: "100% cotton"
},
price: 250000,
in_stock: true
}).execute();
// Query documents
products.find("category = 'laptop' AND price > 30000000")
.fields("name", "price")
.order_by("price DESC")
.execute();
// Search by nested field
products.find("attributes.ram = '32GB DDR5'").execute();
Using it from a real Python backend:
import mysqlx
# Connect via X DevAPI
session = mysqlx.get_session({
'host': 'localhost',
'port': 33060,
'user': 'root',
'password': 'your_password'
})
db = session.get_schema('product_catalog')
products = db.get_collection('products')
# Add a document
result = products.add({
"name": "MacBook Pro M3",
"category": "laptop",
"attributes": {
"chip": "Apple M3 Pro",
"ram": "18GB Unified Memory",
"storage": "512GB SSD"
},
"price": 55000000
}).execute()
print(f"Added ID: {result.get_generated_ids()}")
# Query with filter and sort
docs = products.find("in_stock = true AND category = :cat") \
.bind('cat', 'laptop') \
.fields("name", "price", "attributes") \
.order_by("price DESC") \
.limit(10) \
.execute()
for doc in docs.fetch_all():
print(f"{doc['name']} — {doc['price']:,} VND")
session.close()
Creating Indexes to Avoid Slow Queries
I learned this lesson the hard way when a users table grew past 10 million rows — slow queries started appearing and I had to painstakingly tune the indexes. Document Store runs into the same problem if you don’t set up proper indexes from the start:
// Index for the category field (heavily used in WHERE clauses)
products.createIndex('category_idx', {
fields: [{
field: '$.category',
type: 'TEXT(50)'
}]
});
// Index for the numeric price field
products.createIndex('price_idx', {
fields: [{
field: '$.price',
type: 'DECIMAL(15,2)'
}]
});
// Verify the indexes were created
session.sql('SHOW INDEX FROM product_catalog.products').execute();
The Best Approach: Combining Relational and Document in the Same Database
After trying different approaches, I found that the most effective solution isn’t choosing one or the other — it’s using both within the same MySQL instance.
Split your data based on its characteristics:
- Regular SQL tables:
orders,users,payments— fixed structure, needs JOINs, needs full transactional integrity. - Document Collections:
products,product_reviews,user_preferences— schema varies by type, no complex JOINs needed.
Within the same MySQL Shell session, mixing SQL and the Document API is completely seamless:
// SQL for orders (relational — needs JOINs)
session.sql(`
SELECT o.id, o.total, u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending'
`).execute();
// Document API for product catalog (flexible schema)
var catalog = session.getSchema('product_catalog');
catalog.getCollection('products')
.find("in_stock = true")
.limit(20)
.execute();
Practical Notes to Keep in Mind
- Port 33060: X Protocol uses a dedicated port. Verify it with
SHOW VARIABLES LIKE 'mysqlx_port'. Remember to open your firewall if the app server is on a different machine. - Document ID: MySQL auto-generates a
_idas a 28-character string (not an ObjectId like MongoDB). You can set your own ID when adding a document if you prefer. - Transactions: X DevAPI supports full transactions —
session.startTransaction()→session.commit(). This is a significant advantage over MongoDB when you need ACID across multiple collections. - Backup/monitoring: Shared with your existing MySQL setup — mysqldump, xtrabackup, and the Prometheus MySQL exporter all work as normal, no additional configuration needed.
What I found most convenient after switching to this approach: no extra infrastructure to maintain, no new query language to learn from scratch, and the team already knew MySQL — they just needed to pick up the X DevAPI syntax. As the data grows and optimization becomes necessary, you’re working with familiar tools instead of debugging two separate systems side by side.

