The Real Problem: Slow Searches with LIKE in Applications
Surely, as developers, we’ve all had to build search functionality for some application. For small projects with limited data, using the command SELECT * FROM posts WHERE title LIKE '%search_term%' OR content LIKE '%search_term%' seems quite acceptable. It’s simple, easy to understand, and runs instantly.
However, as projects grow and data volume increases significantly—for example, my production database running MySQL 8.0 has about 50GB of data—the query LIKE '%keyword%' can become a disaster. With just a general keyword, response times will skyrocket, the server will crawl, and users will be frustrated waiting for every second.
Furthermore, the LIKE method has several other limitations:
- No Relevance Ranking Capability: Results are simply binary—either the keyword is present or not. You cannot determine which documents are most relevant to the user’s search query.
- Poor Performance: With
%keyword%, MySQL cannot effectively use an index on the search column. This leads to a full table scan—extremely resource-intensive and time-consuming, especially with large data tables. - Challenges with Vietnamese: Handling diacritics, synonyms, or suggested searches is almost impossible with
LIKEwithout a lot of complex application-side logic.
Root Cause Analysis: Why is LIKE So Bad for Full-Text Search?
The core problem with LIKE '%keyword%' lies in how MySQL processes it. When you place a wildcard character (%) at the beginning of the search phrase, MySQL cannot effectively use regular indexes (like B-Tree indexes). Instead, it’s forced to scan through each row, comparing the column’s content with the search pattern. Imagine your database has millions of articles; this is like reading every page of millions of books to find a single keyword!
B-Tree indexes are designed for ordered searches. For instance, they are very effective when searching for values that start with ‘keyword’ ('keyword%') or for an exact value. But when you want to find values that contain ‘keyword’ ('%keyword%'), the B-Tree structure is no longer useful.
Furthermore, LIKE only performs simple character string matching. It doesn’t understand context, has no concept of a word, or natural language processing capabilities. Therefore, for LIKE, searching for “phone” and “smartphone” are two entirely different queries, even though the user might expect to see similar results.
Other Solutions for the Search Problem
Before diving into today’s main solution, I’ll briefly review a few other approaches you might already know or have encountered:
1. Still Using LIKE, But…
Some of you might consider using LIKE 'keyword%' (without % at the beginning) to leverage indexes. This method can be faster if users always search by prefix. But clearly, this is not a full-text search. Users typically want to search for keywords anywhere in the text.
Another approach is to build a separate keyword table. You would analyze the article content into keywords and store them in this table, then search on the keyword table. However, this method is often very labor-intensive to implement and maintain, especially when the original content changes frequently.
2. Using External Search Engines (Elasticsearch, Solr)
For ultra-large systems with complex search requirements and superior performance, specialized tools like Elasticsearch or Apache Solr are the top choice. They are purpose-built for searching, offer powerful scalability, and support natural language processing, fuzzy search, faceted search, etc. However, integrating and managing these systems is quite complex. They require additional servers, deep expertise, and significant operational costs.
3. MySQL Full-Text Search: The Ideal Balanced Solution
This is the solution I want to introduce today. MySQL provides built-in Full-Text Search (FTS) functionality, allowing you to perform full-text searches efficiently right within the database.
FTS offers an excellent balance between the simplicity of LIKE and the power of specialized search engines. On my 50GB production database with MySQL 8.0, these optimizations have significantly improved query speeds (e.g., from tens of seconds to under 1 second), substantially reducing server load and enhancing user experience.
Best Practice: How to Use MySQL Full-Text Search
MySQL Full-Text Search works by creating special indexes (Full-Text Indexes) on text columns. When you perform a search, MySQL uses these indexes to quickly locate documents containing keywords. It can even calculate the relevance of each result.
1. Full-Text Search Modes
MySQL FTS supports three main query modes:
- IN NATURAL LANGUAGE MODE: The default mode, which searches for keywords provided in natural language and returns results in descending order of relevance.
- IN BOOLEAN MODE: Allows for more detailed control by using Boolean operators (e.g.,
+,-,"") to specify required words, prohibited words, or exact phrases. - WITH QUERY EXPANSION: Expands the initial query by including related words (from the most relevant documents) to search for additional potential results.
2. Creating a Full-Text Index
To use FTS, you first need to create a FULLTEXT index on the columns you want to search. Suppose we have a posts table as follows:
CREATE TABLE posts (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255) NOT NULL,
content TEXT,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
To add a FULLTEXT index, you can do the following:
ALTER TABLE posts ADD FULLTEXT(title, content);
Or create it directly when defining the table:
CREATE TABLE posts (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255) NOT NULL,
content TEXT,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
FULLTEXT(title, content)
);
Important note: A FULLTEXT index only works on columns with data types CHAR, VARCHAR, or TEXT.
3. Performing Searches with Full-Text Search
3.1. Searching in Natural Language Mode (IN NATURAL LANGUAGE MODE)
This is the simplest and most common mode. MySQL will automatically evaluate and rank the relevance of the returned results.
SELECT id, title, content,
MATCH(title, content) AGAINST ('using mysql' IN NATURAL LANGUAGE MODE) AS relevance_score
FROM posts
WHERE MATCH(title, content) AGAINST ('using mysql' IN NATURAL LANGUAGE MODE)
ORDER BY relevance_score DESC;
In the statement above, please note:
MATCH(title, content): Specifies theFULLTEXTindexed columns you want to search.AGAINST ('using mysql' IN NATURAL LANGUAGE MODE): Provides the search phrase and search mode.AS relevance_score: MySQL returns a numerical value indicating the relevance of each result. The higher the value, the more closely related the result is to the search keyword.
3.2. Searching in Boolean Mode (IN BOOLEAN MODE)
This mode allows you to precisely control your query by using special operators:
+: This word *must* be present in the result.-: This word *must not* be present in the result.<and>: Change the word’s contribution to relevance (e.g.,>database <sqlprioritizes ‘database’ over ‘sql’)."keyword": Searches for the exact phrase.*: Wildcard operator at the end of a word (e.g.,mysql*will findmysql,mysql_db, etc.).
-- Find articles containing 'mysql' but not 'replication'
SELECT id, title FROM posts
WHERE MATCH(title, content) AGAINST ('+mysql -replication' IN BOOLEAN MODE);
-- Find the exact phrase 'mysql guide'
SELECT id, title FROM posts
WHERE MATCH(title, content) AGAINST ('"mysql guide"' IN BOOLEAN MODE);
-- Search for 'database' and 'sql', prioritizing 'database'
SELECT id, title FROM posts
WHERE MATCH(title, content) AGAINST ('>database <sql' IN BOOLEAN MODE);
3.3. Searching with Query Expansion (WITH QUERY EXPANSION)
This mode is particularly useful when you are unsure about the exact keywords to search for. MySQL will perform an initial search, then use related keywords from the most relevant documents to perform a second search, expanding the scope of results.
SELECT id, title FROM posts
WHERE MATCH(title, content) AGAINST ('sql' WITH QUERY EXPANSION);
4. Configuring and Optimizing Full-Text Search
For FTS to work most effectively, especially with Vietnamese, you need to pay attention to some important configurations:
ft_min_word_len: This is the minimum word length for MySQL to include in the FTS index. The default is 4. In Vietnamese, many words are shorter than 4 characters (e.g., “ăn” (eat), “ở” (live), “đi” (go)). You can reduce this value in themy.cnformy.inifile:
[mysqld]
ft_min_word_len = 2
After making changes, you need to restart MySQL and **must rebuild the FTS index** (DROP and ADD the index again) for the changes to take effect.
ft_stopword_file: MySQL has a default list of stopwords (e.g., “a”, “the”, “is” in English) that it ignores during indexing and searching. For Vietnamese, you should create a separate file containing common Vietnamese stopwords (like “là” (is), “và” (and), “có” (has), “được” (get/be)) to remove them from the index. This helps make search results more accurate and faster.
[mysqld]
ft_stopword_file = /path/to/your/vietnamese_stopwords.txt
Similarly, after making changes, you need to restart MySQL and rebuild the FTS index.
- Character Set and Collation: Ensure your tables and columns use the
utf8mb4character set andutf8mb4_unicode_cicollation (orutf8mb4_vietnamese_ciif available). This configuration helps MySQL properly handle Vietnamese diacritics, ensuring accurate search results.
Applying these optimizations on my 50GB production database significantly improved search speeds, providing a smoother experience for users.
5. Limitations and When to Use It
MySQL Full-Text Search is an excellent choice for typical web applications, blogs, or systems with moderate internal search requirements. It is simple to implement and manage. However, MySQL’s FTS still has some limitations:
- Cannot Match Specialized Tools: For extremely complex requirements like fuzzy searching, smart keyword suggestions, or searching across various types of unstructured data, Elasticsearch or Solr remain more suitable and optimal choices.
- Only Supports Certain Data Types: The
FULLTEXTindex can only be used forCHAR,VARCHAR, andTEXT. - Requires Index Rebuild on Configuration Change: Each time you change configuration parameters like
ft_min_word_lenorft_stopword_file, you must rebuild the FTS index. This process can be quite time-consuming for large data tables.
Conclusion
In summary, MySQL Full-Text Search is a powerful and extremely useful feature for implementing fast, efficient search functionality in many applications. It thoroughly addresses the poor performance issues of LIKE and provides relevance ranking capabilities without needing complex external search systems. With just a few minor configuration tweaks and using the correct query mode, you can significantly enhance your users’ search experience. Try it and feel the difference!

