Have you ever typed a keyword in a search field and gotten minimal or no results—only to realize later it's because you used the “wrong” wording or made a typo? Two expanded search methods can fix this problem: query expansion and fuzzy search.
Query expansion, which is the focus of this article, zeros in on the intended meaning. For example, if you type in “headache pain,” query expansion would understand your intended meaning and might include “migraine treatment” or “head pain remedy” in the search results. A search for “online learning” might return “e-learning,” “virtual classrooms,” or “distance education.”
Fuzzy search gives you a result even if the word you enter is slightly misspelled. It works by counting how many small edits (adding, deleting, or replacing a single letter) it would take to turn what you typed into a known word. For example, “JavaScirpt” is only two single-letter edits away from “JavaScript” (the transposed “i” and “r”), and “databse” is only one edit away from “database” (the missing “a”). When just a few letters are involved, fuzzy search assumes you’re close to the real word, so it can guess the word you meant.
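To make the edit counting concrete, here is a minimal sketch of the classic Levenshtein distance, which counts single-letter insertions, deletions, and substitutions. It is an illustration only, not MongoDB's implementation, and the function name edit_distance is our own.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of single-letter insertions, deletions, or substitutions."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j letters of target.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]
        for j, target_char in enumerate(target, start=1):
            cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,         # delete source_char
                current[j - 1] + 1,      # insert target_char
                previous[j - 1] + cost,  # substitute, or keep a matching letter
            ))
        previous = current
    return previous[-1]


print(edit_distance("JavaScirpt", "JavaScript"))  # 2
print(edit_distance("databse", "database"))       # 1
```

A fuzzy matcher would typically accept candidate words whose distance falls under a small threshold, such as one or two edits.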
MongoDB offers both methods, but in this article, we’ll dig deeper into query expansion techniques and tools like statistical models, large language models (LLMs), and retrieval-augmented generation (RAG) to uncover how they provide meaningful results beyond the exact query terms.
Table of contents
- How is query expansion different from a basic search system?
- Query expansion: Automatic, manual, and interactive
- Query expansion techniques: How they work
- How query expansion is used in different situations
- Conclusion
How is query expansion different from a basic search system?
Basic search is literal; it only looks up the exact words you typed in your initial query. That works sometimes but often leads to no results or the wrong ones. If you search for "annual report," but what you're looking for is labeled "yearly financial summary" in the system, a full-text search might miss it. However, if the system employs query expansion, it goes beyond the basic search and automatically adds additional queries with related terms, so even if your keyword wasn't exact, query expansion helps the system to understand what you meant.
Query expansion: Automatic, manual, and interactive
Query expansion can be set up in a few different ways. Depending on how much additional input is required from the user, it may be automatic, manual, or interactive.
Automatic query expansion
This is the most common form of query expansion. When a search is run, the system automatically adds related terms that could improve the results, making the search more robust than exact-text matching.
Search systems often combine query expansion with a related, but different, process called stemming. Stemming doesn’t add new terms. Instead, it reduces words to their root form so that variations of the word are matched. In MongoDB, for example, a search for “run” can also return “running” and “runs.” Irregular forms such as “ran” generally need lemmatization or a synonym mapping, because a stemmer only strips word endings. Query expansion and stemming both broaden results, but they work in different ways: One adds related concepts, the other handles word forms.
Manual query expansion
This method uses expert-made lists of related terms that apply to certain professions, like medicine or law. These specialized lists improve accuracy in fields where retrieving the exact words matters. For example, “heart attack” can be linked to “myocardial infarction.” The downside is that the lists take time to create and must be updated regularly.
Interactive query expansion
In the interactive method, the system invites the user to help it refine the search. After the initial query, it suggests related terms and offers you a chance to add them and conduct another search. You can accept or reject them based on what you’re looking for, so this method gives you more control over the results and can lead to more relevant matches. It does require more than one search request, so it may not fit every application.
Query expansion techniques: How they work
As discussed earlier, query expansion helps a search system understand what you mean, not just what you type. In our “annual report” example, it would find a document labeled “yearly financial summary,” as well as other resources that align with your search because it recognizes that the two phrases mean the same thing.
The type of query expansion technique used often depends on the goals of the company, the user, or the application. Some systems may prioritize speed, while others focus on precision or the ability to handle large or varied datasets. Query expansion is flexible because it can be tailored with different methods to meet these diverse needs.
Knowledge-based methods: Dictionaries, thesauri, and ontologies
Thesauri and dictionaries
One of the easiest ways to add depth to a search is with a list of synonyms. When you search for a word like “quick,” a dictionary- or thesaurus-based system automatically includes related terms like “fast” or “rapid” to broaden the results. This helps the user find documents that use different wording to express the same idea. For instance, the synonyms option in MongoDB’s Atlas Search allows developers to use synonym mappings, which lets them define which terms should be treated as equivalent.
However, this type of query expansion has some limitations. It works great for common words but can make incorrect decisions, such as expanding the word “bond” without knowing whether it refers to finance or chemistry.
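As a rough illustration of synonym-based expansion, the sketch below looks up each query term in a small set of synonym groups. The groups and the function name expand_with_synonyms are hypothetical, and the code deliberately does not use the Atlas Search API; it only shows the idea of treating listed terms as equivalent.

```python
# Hypothetical synonym groups; a real system would load these from a curated mapping.
SYNONYM_GROUPS = [
    {"quick", "fast", "rapid"},
    {"annual", "yearly"},
]


def expand_with_synonyms(query: str) -> list[str]:
    """Return the query terms plus every term that shares a synonym group with them."""
    expanded = []
    for term in query.lower().split():
        candidates = [term]
        for group in SYNONYM_GROUPS:
            if term in group:
                # Sort so the output order is predictable.
                candidates.extend(sorted(group - {term}))
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_with_synonyms("quick delivery"))
# ['quick', 'fast', 'rapid', 'delivery']
```

Because the lookup is purely word-based, adding a group such as "bond" and "debenture" would expand every query containing "bond," including chemistry queries, which is the limitation described above.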
Ontologies
Ontologies are especially useful when search queries require nuanced understanding, like terms found in finance, computer science, legal, and medical fields. They organize related terms and concepts by explicitly mapping out their relationships, helping search systems understand the precise meaning of a term in context. This is particularly important in domains where accuracy and specificity matter.
The process typically has two steps: First, the system determines the correct meaning of a word (called word sense disambiguation), and then it expands the query based on that specific context.
First step: Word sense disambiguation
When a keyword search starts, the ontological system employs word sense disambiguation to establish the correct context of the term before query expansion takes place.
For example, if you search for "Python lessons," the system will automatically determine if you're referring to the snake or the programming language. It can do this because the ontology's structure shows that the word "lessons" is strongly associated with concepts like "learning" and "software," and these terms are primarily related to the "programming language" meaning of the word "Python." If you search for "python skin," it would strongly connect "skin" to "reptile" or "animal."
Once it's known that “Python lessons” refers to programming, expansion can begin.
Second step: Query expansion
The search engine uses query expansion to add related terms to your query. For example, a search for “Python lessons,” once resolved to the programming sense, might also include: “programming language” (a broader term), “scripting” (a related concept), and “Python tutorial” (different wording). This makes it easier to find a wider range of matches.
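The two steps can be sketched with a toy ontology. Everything in the ONTOLOGY dictionary below is illustrative (including the "constrictor" entry), and picking a sense by counting overlapping context words is a simplifying assumption; production systems use richer relationship graphs and scoring.

```python
# Toy ontology: each sense lists context words that signal it and terms to add.
ONTOLOGY = {
    "python": {
        "programming language": {
            "context": {"lessons", "tutorial", "code", "software", "learning"},
            "expansions": ["programming language", "scripting", "Python tutorial"],
        },
        "snake": {
            "context": {"skin", "reptile", "animal", "habitat"},
            "expansions": ["reptile", "snake", "constrictor"],
        },
    }
}


def disambiguate(term, other_terms):
    """Step 1: pick the sense whose context words overlap most with the rest of the query."""
    best_sense, best_overlap = None, 0
    for sense, entry in ONTOLOGY.get(term, {}).items():
        overlap = len(entry["context"] & other_terms)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


def expand_query(query):
    """Step 2: add the expansion terms attached to the chosen sense."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        sense = disambiguate(term, set(terms) - {term})
        if sense is not None:
            expanded.extend(ONTOLOGY[term][sense]["expansions"])
    return expanded


print(expand_query("Python lessons"))
print(expand_query("python skin"))
```

If no context word matches any sense, the sketch leaves the query unchanged rather than guessing.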
Statistical/corpus-based methods: Global and local analysis
Unlike pre-built knowledge methods such as thesauri or ontologies, statistical methods rely on counting and comparing numbers related to word use instead of understanding the meaning. Statistical methods assess how often words appear alone and how often they appear together or in similar contexts within extensive text collections (called a corpus).
Based on these numerical patterns, or "statistics," the system infers which words might be related. This statistical information is then used to perform query expansion, typically via one of two main techniques: global analysis or local analysis. Today, some systems even combine these traditional approaches with deep learning to improve accuracy, blending statistical signals with neural models that better capture meaning.
Global analysis: Finding patterns across the entire corpus
Global analysis performs its statistical examination across the entire corpus, usually during a one-time offline setup phase before users start searching. By analyzing how often different words appear together or in similar documents throughout the whole corpus, it builds a general map or statistical thesaurus.
This map shows which terms tend to associate with each other based on broad usage patterns in the dataset. The main drawback is that looking only at the “big picture” can be misleading for words that have multiple meanings. For example, the system might connect a vague word to a common meaning that isn’t what the user actually wanted.
When setup is complete and users start searching, the system looks up the user's terms in the pre-computed map and adds statistically related words to the original query. The result is a longer, richer query that has a better chance of matching relevant documents, even if those documents don't contain the exact words the user typed.
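A minimal sketch of global analysis might look like the following. The corpus is made up, and using raw co-occurrence counts with a min_count cutoff is an assumption made for brevity; real systems usually normalize the counts with an association measure so that very common words don't dominate.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Tiny illustrative corpus; each string stands in for one document.
CORPUS = [
    "annual report revenue summary",
    "yearly report revenue summary",
    "annual revenue forecast",
    "python tutorial for beginners",
]


def build_cooccurrence_map(documents):
    """Offline step: count how often each pair of terms appears in the same document."""
    cooccurrence = defaultdict(Counter)
    for document in documents:
        unique_terms = sorted(set(document.lower().split()))
        for first, second in combinations(unique_terms, 2):
            cooccurrence[first][second] += 1
            cooccurrence[second][first] += 1
    return cooccurrence


def expand_query(query, cooccurrence, per_term=2, min_count=2):
    """Query-time step: add the terms that most often co-occur with each query term."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        related_terms = cooccurrence.get(term, Counter())
        for related, count in related_terms.most_common(per_term):
            if count >= min_count and related not in expanded:
                expanded.append(related)
    return expanded


term_map = build_cooccurrence_map(CORPUS)
print(expand_query("annual", term_map))  # ['annual', 'revenue']
```

The map is built once from the whole corpus, so the query-time work is only a dictionary lookup.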
Local analysis and pseudo-relevance feedback: Finding patterns in top results
In global analysis, the system examines the entire corpus offline beforehand. Local analysis works differently. The most common technique for local analysis is pseudo-relevance feedback, or PRF. Instead of using a pre-built map, PRF dynamically analyzes only the top few documents returned by your specific query at the moment you search. It's called “pseudo” or “blind” feedback because the system automatically assumes these initial top documents (say, the top five or 10) are relevant without asking you for confirmation. These top-ranked results are often referred to as pseudo-relevant documents, since the system treats them as if they were truly relevant even though the user hasn’t validated them.
Local analysis:
- Runs your initial query to get the first batch of results.
- Takes the top few results (e.g., the first five to 10).
- Scans those results for useful terms that weren’t in your original query.
- Adds the best of those terms to your query and runs the search again to improve the results.
The main benefit of PRF (as the primary local analysis technique) is its ability to find more relevant documents by adding new terms to your search. However, the biggest risk with PRF is query drift. If those first few documents used for feedback happen to be off-topic, the system may pick up the wrong terms and make the next search even less precise, pulling the search further away from your actual goal.
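The four local-analysis steps can be sketched as follows. The documents, the stopword list, and the term-overlap ranking in search are all stand-ins chosen for illustration; a real system would use its own ranking function and a weighting scheme to select feedback terms.

```python
from collections import Counter

# Illustrative documents and stopwords (assumptions for this sketch).
DOCUMENTS = [
    "headache pain relief with migraine treatment",
    "migraine treatment for recurring headache",
    "migraine treatment clinic opening hours",
    "garden soil and compost guide",
]
STOPWORDS = {"and", "with", "for", "the", "a"}


def search(query_terms, documents):
    """Rank documents by how many query terms they contain (a stand-in for a real ranker)."""
    scored = []
    for document in documents:
        score = len(set(document.split()) & set(query_terms))
        if score > 0:
            scored.append((score, document))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [document for _, document in scored]


def pseudo_relevance_feedback(query, documents, top_k=2, new_terms=2):
    query_terms = query.lower().split()
    # Steps 1 and 2: run the initial query and keep the top few results.
    top_documents = search(query_terms, documents)[:top_k]
    # Step 3: count terms in those results that were not in the original query.
    counts = Counter(
        word
        for document in top_documents
        for word in document.split()
        if word not in query_terms and word not in STOPWORDS
    )
    # Step 4: add the most frequent new terms and search again.
    expanded = query_terms + [term for term, _ in counts.most_common(new_terms)]
    return expanded, search(expanded, documents)


expanded_terms, results = pseudo_relevance_feedback("headache pain", DOCUMENTS)
print(expanded_terms)
print(results)
```

In this toy corpus, the second search also returns the clinic document, which shares no words with the original query. The same mechanism causes query drift: if the top two results had been off-topic, their terms would have been added instead.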
Query log analysis: Learning from user behavior
Query log analysis doesn't analyze documents or set up knowledge maps. Instead, it examines historical search logs to learn from users' past search behavior before employing query expansion. Heavily used by web search engines, this technique looks for patterns in the archived data.
For example, if millions of users searching for term X frequently follow up by searching for term Y, or if users searching for X consistently click on results containing term Y, the system statistically infers a strong relationship and might use Y to expand future searches for X.
While the logs are collected continuously, the complex analysis to identify these common patterns and determine which terms are strongly related is typically performed periodically offline. This preprocessing prepares the learned relationships for quick use when a new search comes in.
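As a simplified sketch of the offline step, the code below counts which query most often follows another within the same session. The SESSIONS data is hypothetical, and counting only immediate follow-up queries is an assumption; real systems also weigh click behavior and far larger volumes of data.

```python
from collections import Counter, defaultdict

# Hypothetical search sessions: each list is one user's queries in order.
SESSIONS = [
    ["winter coat", "parka"],
    ["winter coat", "parka", "down jacket"],
    ["winter coat", "snow boots"],
    ["laptop", "laptop sleeve"],
]


def build_followup_table(sessions):
    """Offline step: count how often each query is immediately followed by another."""
    followups = defaultdict(Counter)
    for session in sessions:
        for current_query, next_query in zip(session, session[1:]):
            followups[current_query][next_query] += 1
    return followups


def suggest_expansions(query, followups, min_count=2):
    """Query-time step: return follow-up queries seen at least min_count times."""
    candidates = followups.get(query, Counter())
    return [term for term, count in candidates.most_common() if count >= min_count]


table = build_followup_table(SESSIONS)
print(suggest_expansions("winter coat", table))  # ['parka']
```

The min_count threshold keeps one-off follow-ups, such as "snow boots" here, from being treated as related terms.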
For this approach to be effective, it has to analyze massive amounts of search data, which is perfect for popular queries but falls short when it comes to rare, new, or unusual queries that aren't logged often. Additionally, privacy must be considered when accessing users' search logs.
Modern AI-driven methods
The last set of query expansion techniques we’ll explore in this article uses artificial intelligence (AI) and includes word embeddings, large language models (LLMs), and retrieval-augmented generation (RAG). This set of methods goes beyond surface patterns or predefined lists to understand the deeper meaning of words and queries.
Word embeddings
One way to visualize how word embeddings are structured is to imagine walking through an experimental art museum with art fastened to the walls, hanging from the ceiling, and placed on the floor, filling every bit of space in the room. Each piece has a numerical coordinate (a vector) that represents its position in the room. Pieces with similar styles or from the same period are placed near one another, so their coordinates (vectors) are close together.
Just like artwork grouped by style or era in a museum, words with similar meanings tend to cluster together in a “meaning space.” Each word is represented in this space by a numerical coordinate—a vector—that captures its relationship to other words. When you enter a keyword search, the system locates its vector and identifies nearby vectors—words with related meanings. These closely placed words expand your original query with more contextually relevant terms.
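The "nearby vectors" idea can be sketched with cosine similarity, a common way to measure how closely two vectors point in the same direction. The three-dimensional vectors below are hand-made for illustration; real embeddings are learned by a model and typically have hundreds of dimensions.

```python
import math

# Hand-made toy vectors (an assumption for this sketch, not real embeddings).
EMBEDDINGS = {
    "quick": [0.9, 0.1, 0.0],
    "fast": [0.85, 0.15, 0.05],
    "rapid": [0.8, 0.2, 0.1],
    "slow": [-0.8, 0.1, 0.0],
    "coffee": [0.0, 0.1, 0.95],
}


def cosine_similarity(first, second):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(first, second))
    norms = math.sqrt(sum(x * x for x in first)) * math.sqrt(sum(y * y for y in second))
    return dot / norms


def nearest_terms(term, top_n=2):
    """Return the top_n words whose vectors are most similar to the given word's vector."""
    target = EMBEDDINGS[term]
    scored = [
        (cosine_similarity(target, vector), word)
        for word, vector in EMBEDDINGS.items()
        if word != term
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_n]]


print(nearest_terms("quick"))  # ['fast', 'rapid']
```

An expansion step would append these neighbors to the original query, usually after applying a minimum similarity cutoff.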
Large language models (LLMs)
Imagine trying to teach a computer the nuances of English: not just dictionary definitions, but how sentences flow, how context changes meaning, and how people actually talk. LLMs are a category of AI models built to do just that. They learn language by processing enormous amounts of text, like reading millions of web pages, books, periodicals, and other resources, in some cases including multimedia. Because the LLM's training data is so rich, the model is equipped to handle how words fit together and what people generally mean.
This training process helps the LLM:
- Understand the intent of your search: If you search "good coffee near Kalamazoo Air Zoo," it figures out you want nearby coffee shop recommendations, not just web pages with those exact words.
- Think of related ideas: Searching for "used hybrid cars" might cause the LLM to look for "pre-owned fuel-efficient vehicles" automatically, finding things you might have missed.
- Communicate naturally: The LLM can generate sentences, answer questions in a conversational way, translate, and even write different kinds of creative text.
In short, LLMs add intelligence to search engines, helping them interpret language with context and nuance.
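One hedged way to picture LLM-based expansion is a small wrapper that asks a model for alternative phrasings and parses its reply. The generate parameter is a hypothetical stand-in for whatever LLM client an application uses, the prompt wording is our own, and fake_generate returns a canned reply so the sketch runs without a model.

```python
from typing import Callable


def build_expansion_prompt(query: str, max_terms: int = 3) -> str:
    """Compose the instruction sent to the model (wording is an assumption)."""
    return (
        f"Suggest up to {max_terms} alternative search phrases with the same intent "
        f"as the query below. Return one phrase per line and nothing else.\n\n"
        f"Query: {query}"
    )


def parse_expansions(response: str, max_terms: int = 3) -> list[str]:
    """Turn the model's reply into a clean list, dropping bullets and numbering."""
    phrases = []
    for line in response.splitlines():
        # Note: this also strips leading digits from phrases that start with a number.
        phrase = line.strip().lstrip("-*0123456789. ").strip()
        if phrase and phrase not in phrases:
            phrases.append(phrase)
    return phrases[:max_terms]


def expand_with_llm(query: str, generate: Callable[[str], str], max_terms: int = 3) -> list[str]:
    """Return the original query followed by model-suggested phrasings."""
    response = generate(build_expansion_prompt(query, max_terms))
    return [query] + parse_expansions(response, max_terms)


def fake_generate(prompt: str) -> str:
    """Canned reply used in place of a real model call."""
    return "1. pre-owned fuel-efficient vehicles\n- secondhand hybrid cars\n"


print(expand_with_llm("used hybrid cars", fake_generate))
```

Keeping the original query first in the list lets the search layer weight it more heavily than the suggested phrasings.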
Retrieval-augmented generation (RAG)
LLMs can be smart, but their knowledge is limited to the text they were trained on, and that data can become outdated. Plus, they sometimes get confused or even make things up (a phenomenon known as “hallucination”). To overcome these gaps in accuracy and freshness, an approach called RAG was developed.
RAG is an architecture that makes an LLM's responses more reliable by retrieving current external information, including real-time data, and supplying it to the model before it generates a response.
For example, when you need specific or current information, such as, "What are the Kalamazoo library's hours this week?" (assuming that's relevant information we could retrieve on the library's website), the RAG system searches a specified knowledge base (i.e., the library's actual website or internal documents) to find relevant information. This retrieval step can use query expansion techniques to find the best sources.
Once located, the retrieved documents are sent back to the LLM as context. The LLM then generates its answer, heavily relying on the fresh, retrieved information rather than just its internal training data.
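The retrieve-then-generate flow can be sketched in a few lines. The knowledge base text is made-up placeholder content, the term-overlap retriever stands in for a real search index, and generate is a hypothetical callable representing the LLM; none of this reflects a specific product's API.

```python
from typing import Callable

# Placeholder knowledge base (invented text for illustration only).
KNOWLEDGE_BASE = [
    "Library hours this week: Monday to Friday 9 a.m. to 8 p.m.",
    "The library offers free Wi-Fi and study rooms.",
    "Parking is available behind the main building.",
]


def retrieve(query_terms, documents, top_k=1):
    """Retrieval step: rank documents by overlap with the (expanded) query terms."""
    scored = []
    for document in documents:
        words = {word.strip(".,:?").lower() for word in document.split()}
        score = len(words & set(query_terms))
        if score > 0:
            scored.append((score, document))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [document for _, document in scored[:top_k]]


def build_prompt(question, context_documents):
    """Augmentation step: place the retrieved text in front of the question."""
    context = "\n".join(f"- {document}" for document in context_documents)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


def answer(question: str, expanded_terms: list[str], generate: Callable[[str], str]) -> str:
    """Generation step: the model answers with the retrieved context in its prompt."""
    context_documents = retrieve(expanded_terms, KNOWLEDGE_BASE)
    return generate(build_prompt(question, context_documents))


# Echo the prompt instead of calling a model, to show what the LLM would receive.
print(answer(
    "What are the library's hours this week?",
    ["library", "hours", "opening", "schedule", "week"],
    generate=lambda prompt: prompt,
))
```

The expanded terms "opening" and "schedule" don't match anything here, but they show where query expansion plugs into the retrieval step.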
MongoDB’s Atlas Vector Search is helpful in RAG applications because it keeps your data and the vector embeddings needed for meaning-based search in the same database, which helps deliver quick and relevant results. It also supports hybrid search, which combines vector search with keyword matching to deliver results based on both meaning and exact words.
RAG improves trustworthiness by connecting LLMs to external knowledge sources during response generation, which helps ground answers in the underlying database or source and makes them more accurate.
How query expansion is used in different situations
Now that you’ve learned what query expansion is and how it works, let’s take a look at some everyday situations to show how query expansion works behind the scenes.
E-commerce platforms: Helps customers find products even when they use different terms than the product listing (e.g., searching "winter coat" might expand to include "parka," "anorak," or "down jacket"), improving product discovery and potential sales.
Website and content search: Enables users to locate relevant articles, help documentation, or blog posts without knowing the exact title or keywords ("password help" finding "resetting account credentials" or "login troubleshooting").
Digital libraries and research databases: Assists researchers by automatically including synonyms, related concepts, or standard identifiers (like medical terms) to retrieve comprehensive literature (expanding "heart attack" to include "myocardial infarction").
Enterprise search: Allows employees to find internal documents, reports, or expertise across different departments that might use varied jargon or project names (finding the "Quarterly Revenue Summary" when searching for "Q3 sales report" in the company database).
Legal and compliance: Ensures thorough retrieval of relevant case law, statutes, or regulatory documents by expanding queries to include related legal concepts, precedents, or variations in terminology critical for due diligence and research.
Online job boards: Matches job seekers with relevant positions even if their search terms don't perfectly align with the job titles (expanding "software developer" to include "programmer," "software engineer," or specific coding languages).
Conclusion
Query expansion is a technique that goes beyond the limitations of basic, exact-match searches. By automatically broadening the initial query to include related terms and concepts, it bridges the gap between what a user types and what they actually mean. Whether through ontologies, statistical models, or deep learning, query expansion improves information retrieval across applications. By pairing each user query with expanded terms, search systems can return more relevant results, helping people get accurate answers that match the intent of their search terms.