This story starts with an e-commerce project I worked on a few years back. The system needed to store “attributes” for each product type: shirts had size, color, and material — shoes had size number, color, and sole type. Every product category was completely different.
The solution back then was EAV (Entity-Attribute-Value) — a product_attributes table with three columns: entity_id, attribute_name, and value. It worked, but every query required 5–6 JOINs, and the EXPLAIN output was painful to look at. Then MySQL 5.7.8 introduced a native JSON data type — something I wish I’d known about sooner.
What Is JSON Data Type, and Why Not Just Use TEXT?
The first question everyone asks is: “Can’t I just store TEXT and parse it in the application layer?” — Sure, you can. But TEXT and JSON are not the same thing. MySQL’s JSON type offers three things TEXT simply can’t do:
- Automatic validation: MySQL rejects INSERT statements if the value is not valid JSON. TEXT silently accepts garbage into your database.
- Binary storage: Data is stored in an optimized binary format, making nested field access faster — no need to parse the entire string like TEXT.
- Native functions: JSON_EXTRACT, JSON_SET, JSON_CONTAINS… usable directly in SQL. No need to push processing up to the application layer.
The maximum size of a JSON document is bounded by max_allowed_packet, whose default is 64MB in MySQL 8.0 (4MB in 5.7) and can be raised up to 1GB. In practice, most projects never come close to this limit.
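If you want to check the limit on your own server, or get a feel for how large a document is in its binary form, two quick queries help (JSON_STORAGE_SIZE requires MySQL 8.0+; the JSON literal below is just a throwaway example):
-- Current packet limit in bytes
SELECT @@max_allowed_packet;
-- Binary storage size of a JSON value (MySQL 8.0+)
SELECT JSON_STORAGE_SIZE('{"size": ["S", "M"], "color": "white"}') AS bytes;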
Creating Tables and Inserting Data
Create a products table with an attributes column of type JSON:
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price DECIMAL(10, 2),
attributes JSON,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
The insert syntax is no different from storing a regular string — just make sure it’s valid JSON:
INSERT INTO products (name, price, attributes) VALUES
(
'Basic T-Shirt',
299000,
'{"size": ["S", "M", "L", "XL"], "color": "white", "material": "cotton", "in_stock": true}'
),
(
'Running Pro Shoes',
1850000,
'{"size": [39, 40, 41, 42, 43], "color": "black", "waterproof": false, "weight_g": 320}'
);
Try inserting malformed JSON — MySQL rejects it immediately, no waiting for the application layer:
-- This line will be rejected:
INSERT INTO products (name, attributes) VALUES ('Test', '{invalid json}');
-- ERROR 3140: Invalid JSON text at position 1...
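Validation also helps if you’re migrating an existing TEXT column over to JSON: JSON_VALID() (available since 5.7) can flag the rows that would fail the conversion before you run the ALTER. A sketch, assuming a legacy table products_old with a TEXT column attributes_raw (both names are hypothetical):
-- Find rows whose TEXT content is not valid JSON
SELECT id, attributes_raw
FROM products_old
WHERE JSON_VALID(attributes_raw) = 0;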
Hands-On: Querying JSON Data
Reading Nested Fields with -> and ->>
These two operators are the ones I use most every day:
-- -> returns a JSON value (string with double quotes)
SELECT name, attributes->'$.color' AS color FROM products;
-- Result: "white"
-- ->> returns plain text (unquoted string)
SELECT name, attributes->>'$.color' AS color FROM products;
-- Result: white
-- Access array element (index starts at 0)
SELECT name, attributes->>'$.size[0]' AS smallest_size FROM products;
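A small companion to the path operators is JSON_LENGTH, which counts elements in an array (or keys in an object) without pulling the document into application code:
-- Count how many sizes each product offers
SELECT name, JSON_LENGTH(attributes->'$.size') AS size_count FROM products;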
Filtering by JSON Conditions
-- Find products made from cotton
SELECT id, name, price
FROM products
WHERE attributes->>'$.material' = 'cotton';
-- JSON_CONTAINS: check if the array contains a value
SELECT id, name
FROM products
WHERE JSON_CONTAINS(attributes->'$.size', '"M"');
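On MySQL 8.0.17 and later, MEMBER OF is a slightly more readable alternative to JSON_CONTAINS for array membership checks, and it can take advantage of multi-valued indexes:
-- Same check as above, MEMBER OF syntax (MySQL 8.0.17+)
SELECT id, name
FROM products
WHERE 'M' MEMBER OF (attributes->'$.size');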
-- Combine multiple conditions (note: a bare "= true" would be coerced
-- to JSON number 1, not JSON true, and silently match nothing)
SELECT id, name, price
FROM products
WHERE JSON_EXTRACT(attributes, '$.in_stock') = CAST('true' AS JSON)
AND price < 500000;
Updating Values in JSON
-- JSON_SET: add or update a field
UPDATE products
SET attributes = JSON_SET(attributes, '$.discount_pct', 15)
WHERE id = 1;
-- JSON_REMOVE: remove a field
UPDATE products
SET attributes = JSON_REMOVE(attributes, '$.discount_pct')
WHERE id = 1;
-- JSON_MERGE_PATCH: merge objects (overwrites duplicate keys)
UPDATE products
SET attributes = JSON_MERGE_PATCH(attributes, '{"new_arrival": true, "season": "summer"}')
WHERE id = 2;
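One more updater worth knowing is JSON_ARRAY_APPEND, for growing an array in place instead of rewriting the whole value:
-- Add size "XXL" to the T-shirt's size array
UPDATE products
SET attributes = JSON_ARRAY_APPEND(attributes, '$.size', 'XXL')
WHERE id = 1;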
Performance and Indexes — The Most Common Pitfall
MySQL cannot index a JSON column directly. This is the classic trap: a table with hundreds of thousands of rows, a query filtering on a JSON field, execution time jumping to 2–3 seconds with no obvious cause in the code. Only running EXPLAIN reveals the culprit: type: ALL, a full table scan.
The cleanest workaround is a Generated Column: create a virtual column that’s automatically derived from the JSON field, then index that column:
-- Add a virtual generated column from a JSON field
ALTER TABLE products
ADD COLUMN material VARCHAR(50)
GENERATED ALWAYS AS (attributes->>'$.material') VIRTUAL;
-- Create an index on the generated column
CREATE INDEX idx_material ON products (material);
-- This query now uses an index instead of a full scan
SELECT id, name FROM products WHERE material = 'cotton';
-- EXPLAIN now shows key: idx_material instead of type: ALL
Using STORED instead of VIRTUAL physically writes the value to disk — slightly faster reads at the cost of additional storage. For a boolean field frequently used in filters:
ALTER TABLE products
ADD COLUMN in_stock BOOLEAN
GENERATED ALWAYS AS (JSON_EXTRACT(attributes, '$.in_stock')) STORED,
ADD INDEX idx_in_stock (in_stock);
JSON_TABLE — Flattening JSON Arrays into Multiple Rows (MySQL 8.0+)
Sometimes you need to “unfold” a JSON array into multiple rows for tabular processing. JSON_TABLE handles this directly in SQL, without pulling data up to the application layer and looping:
-- List each size of each product as a separate row
SELECT
p.id,
p.name,
s.size_value
FROM products p,
JSON_TABLE(
p.attributes,
'$.size[*]'
COLUMNS (size_value VARCHAR(10) PATH '$')
) AS s
WHERE p.id = 1;
Result:
+----+---------------+------------+
| id | name | size_value |
+----+---------------+------------+
| 1 | Basic T-Shirt | S |
| 1 | Basic T-Shirt | M |
| 1 | Basic T-Shirt | L |
| 1 | Basic T-Shirt | XL |
+----+---------------+------------+
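The same construct combines naturally with aggregation. For example, to count how many products carry each size across the whole table:
-- Count products offering each size
SELECT s.size_value, COUNT(*) AS product_count
FROM products p,
JSON_TABLE(
p.attributes,
'$.size[*]'
COLUMNS (size_value VARCHAR(10) PATH '$')
) AS s
GROUP BY s.size_value;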
When to Use It, When Not To
After several production deployments, I’ve drawn a fairly clear line. JSON columns work well for:
- Flexible, frequently-changing schemas: product attributes, metadata, per-user config, feature flags
- Data typically read as a whole object: rarely needing to filter by individual nested fields
- Fast prototyping: when the schema isn’t settled yet and you want to iterate without running migrations every time
Avoid it when:
- You frequently run complex queries against multiple JSON fields — normalizing into separate tables is cleaner and the SQL is much clearer
- Data requires foreign key constraints — JSON columns don’t support them
- The team is unfamiliar with MySQL JSON functions — it’s easy to write queries that silently skip indexes
Conclusion
MySQL JSON type isn’t a magic bullet, but for the right use case it solves the flexible-schema problem cleanly — no messy EAV patterns, no extra extensions needed. My go-to pattern: normalize the important fields (frequently filtered or sorted) into dedicated indexed columns; keep supplementary metadata in the JSON column.
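As a sketch of that hybrid pattern (table and column names are illustrative, not from the examples above), the hot fields live as real indexed columns and everything else stays in JSON:
CREATE TABLE products_v2 (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price DECIMAL(10, 2),
material VARCHAR(50),   -- filtered often: a real, indexable column
in_stock BOOLEAN,       -- same
attributes JSON,        -- long-tail metadata lives here
INDEX idx_material (material),
INDEX idx_in_stock (in_stock)
);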
Both MySQL 5.7+ and 8.0 ship with JSON type built in, no additional installation needed. Open a terminal, create a test table, and run through the examples above — it’s a completely different experience compared to parsing JSON in the application layer and guessing which fields exist.