Leveraging OpenAI and MongoDB Atlas for Improved Search Functionality

Pavel Duchovny • 5 min read • Published Dec 04, 2023 • Updated Dec 04, 2023

Node.js AI Atlas Search JavaScript

Search functionality is a critical component of many modern web applications. Providing users with relevant results based on their search queries and additional filters dramatically improves their experience and satisfaction with your app.

In this article, we'll go over an implementation of search functionality using OpenAI's embedding and GPT-4 models together with MongoDB Atlas Vector Search. We've created a request handler function that not only retrieves relevant data based on a user's search query but also applies additional filters provided by the user.

Enriching the existing documents with embeddings is covered in our main Vector Search Tutorial.

Search in the Airbnb app context

Consider a real-world scenario where we have an Airbnb-like app. Users can perform a free text search for listings and also filter results based on certain criteria like the number of rooms, beds, or the capacity of people the property can accommodate.

To implement this functionality, we use an OpenAI embedding model (text-embedding-ada-002 in the code below) to create embeddings that capture the semantics of the data, and Atlas Vector Search to find the listings most relevant to the user's query. OpenAI's GPT-4 model is used separately to turn free-text filter requests into query filters.
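
To make "semantics" concrete: vector search ranks documents by how close their embeddings are to the query embedding. Below is a minimal sketch of one common closeness measure, cosine similarity, on toy three-dimensional vectors. The function name and sample values are illustrative assumptions; real text-embedding-ada-002 vectors have 1,536 dimensions, and the similarity function Atlas actually applies depends on how the index is defined.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for embeddings (hypothetical values).
const queryVector = [1, 0, 1];
const listingA = [2, 0, 2]; // points in the same direction as the query
const listingB = [0, 1, 0]; // orthogonal to the query

console.log(cosineSimilarity(queryVector, listingA)); // very close to 1
console.log(cosineSimilarity(queryVector, listingB)); // 0
```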

The code for the application can be found in the following GitHub repository.

The request handler

For the back end, we use Atlas App Services with a simple HTTPS “GET” endpoint.

Our function is designed to act as a request handler for incoming search requests. When a search request arrives, it first extracts the search terms and filters from the query parameters. If no search term is provided, it returns a random sample of 30 listings from the database.

If a search term is present, the function makes a POST request to OpenAI's embeddings API, sending the search term and asking for its embedding from the text-embedding-ada-002 model. The response contains the embedding, a vector representation of the search term, which is used as the query vector in the next step.
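
As a rough illustration of that exchange, the sketch below builds the request body and reads the vector back out of a parsed response. The helper names and the abbreviated sample response are assumptions made for this example; only the `input`, `model`, and `data[0].embedding` fields are taken from the handler shown in this article.

```javascript
// Build the JSON body for the embeddings request.
function buildEmbeddingRequest(searchTerm, model = "text-embedding-ada-002") {
  return JSON.stringify({ input: searchTerm, model });
}

// Pull the first embedding out of a parsed embeddings response.
function extractEmbedding(responseData) {
  const first =
    responseData && Array.isArray(responseData.data)
      ? responseData.data[0]
      : undefined;
  if (!first || !Array.isArray(first.embedding)) {
    throw new Error("Response does not contain an embedding");
  }
  return first.embedding;
}

// Abbreviated, hypothetical response (real vectors are much longer).
const sampleResponse = {
  object: "list",
  data: [{ object: "embedding", index: 0, embedding: [0.0023, -0.0091, 0.0154] }],
  model: "text-embedding-ada-002",
};

console.log(buildEmbeddingRequest("New York high floor"));
console.log(extractEmbedding(sampleResponse).length); // 3
```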

Code Snippet

// This function is the endpoint's request handler. 
// It interacts with MongoDB Atlas and OpenAI API for embedding and search functionality.
exports = async function({ query }, response) {
    // Query params, e.g. '?search=test&beds=2' => {search: "test", beds: "2"}
    const { search, beds, rooms, people, maxPrice, freeTextFilter } = query;

    // MongoDB Atlas configuration.
    const mongodb = context.services.get('mongodb-atlas');
    const db = mongodb.db('sample_airbnb'); // Replace with your database name.
    const listingsAndReviews = db.collection('listingsAndReviews'); // Replace with your collection name.

    // If there's no search query, return a sample of 30 random documents from the collection.
    if (!search) {
      return await listingsAndReviews.aggregate([{$sample: {size: 30}}]).toArray();
    }

    // Fetch the OpenAI key stored in the context values.
    const openai_key = context.values.get("openAIKey");

    // URL to make the request to the OpenAI API.
    const url = 'https://api.openai.com/v1/embeddings';

    // Call OpenAI API to get the embeddings.
    let resp = await context.http.post({
        url: url,
        headers: {
            'Authorization': [`Bearer ${openai_key}`],
            'Content-Type': ['application/json']
        },
        body: JSON.stringify({
            input: search,
            model: "text-embedding-ada-002"
        })
    });

    // Parse the JSON response.
    let responseData = EJSON.parse(resp.body.text());

    // Check the response status.
    if (resp.statusCode === 200) {
        console.log("Successfully received embedding.");

        // Extract the embedding vector for the search term.
        const embedding = responseData.data[0].embedding;
        console.log(JSON.stringify(embedding));

        // $vectorSearch stage definition: "limit" is the number of documents to return,
        // chosen from the "numCandidates" nearest neighbors considered.
        let searchQ = {
            "index": "default",
            "queryVector": embedding,
            "path": "doc_embedding",
            "limit": 100,
            "numCandidates": 1000
        };

        // If there's any filter in the query parameters, add it to the search query.
        if (freeTextFilter) {
            // Turn the free-text filter into a query filter using GPT-4.
            // A random sample document shows the model which field paths exist.
            const sampleDocs = await listingsAndReviews.aggregate([
                { $sample: { size: 1 }},
                { $project: {
                    _id: 0,
                    bedrooms: 1,
                    beds: 1,
                    room_type: 1,
                    property_type: 1,
                    price: 1,
                    accommodates: 1,
                    bathrooms: 1,
                    review_scores: 1
                }}
            ]).toArray();

            const filter = await context.functions.execute("getSearchAIFilter", sampleDocs[0], freeTextFilter);
            searchQ.filter = filter;
        } else if (beds || rooms) {
            // Otherwise, build a basic filter from the numeric query parameters.
            let filter = { "$and": [] };

            if (beds) {
                filter.$and.push({ "beds": { "$gte": parseInt(beds, 10) } });
            }
            if (rooms) {
                filter.$and.push({ "bedrooms": { "$gte": parseInt(rooms, 10) } });
            }
            searchQ.filter = filter;
        }

        // Perform the search with the defined query and limit the result to 50 documents.
        let docs = await listingsAndReviews.aggregate([
            { "$vectorSearch": searchQ },
            { $limit : 50 }
        ]).toArray();

        return docs;
    } else {
        console.error("Failed to get embeddings");
        return [];
    }
};

To cover the filtering part of the query, the handler builds a filter document for the basic filters a user might request, for example at least two rooms and at least two beds, and passes it to the vector search together with the query embedding.
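
The filter-building step can be sketched on its own, outside Atlas, as a small pure function. This is a minimal sketch rather than the application's code: the name `buildBasicFilter` is hypothetical, and unlike the handler it skips values that do not parse as integers.

```javascript
// Build an MQL filter from optional query-string values.
// Returns undefined when no usable filter value was supplied.
function buildBasicFilter({ beds, rooms } = {}) {
  const clauses = [];
  const minBeds = Number.parseInt(beds, 10);
  const minRooms = Number.parseInt(rooms, 10);

  if (!Number.isNaN(minBeds)) {
    clauses.push({ beds: { $gte: minBeds } });
  }
  if (!Number.isNaN(minRooms)) {
    clauses.push({ bedrooms: { $gte: minRooms } });
  }
  return clauses.length > 0 ? { $and: clauses } : undefined;
}

// Query params arrive as strings, e.g. '?search=test&beds=2&rooms=2'.
console.log(JSON.stringify(buildBasicFilter({ beds: "2", rooms: "2" })));
// {"$and":[{"beds":{"$gte":2}},{"bedrooms":{"$gte":2}}]}
```

Fields referenced in such a filter generally also need to be declared as filter fields in the vector search index definition.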

Calling OpenAI API

Let's consider a more advanced use case that can enhance our filtering experience. In this example, we allow the user to filter with a free-form sentence, such as “More than 1 bed and rating above 91.”

We call the OpenAI API to interpret the user's free text filter and translate it into something we can use in a MongoDB query. We send the API a description of what we need, based on the document structure we're working with and the user's free text input. This text is fed into the GPT-4 model, which returns a JSON object with 'range' or 'equals' operators that can be used in a MongoDB search query.
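
Because the filter comes from a language model, it can help to check its shape before using it. The following is a minimal sketch, assuming the reply has the form `{ compound: { must: [...] } }` with only `range` or `equals` clauses. The helper name `isAllowedFilter`, the `allowedPaths` list, and the sample reply are hypothetical and not part of the original application.

```javascript
const ALLOWED_OPERATORS = new Set(["range", "equals"]);

// Returns true when `filter` is { compound: { must: [...] } } and every
// clause is a single allowed operator whose path is in `allowedPaths`.
function isAllowedFilter(filter, allowedPaths) {
  const clauses = filter && filter.compound && filter.compound.must;
  if (!Array.isArray(clauses) || clauses.length === 0) {
    return false;
  }
  return clauses.every((clause) => {
    if (clause === null || typeof clause !== "object") return false;
    const keys = Object.keys(clause);
    if (keys.length !== 1 || !ALLOWED_OPERATORS.has(keys[0])) return false;
    const body = clause[keys[0]];
    return (
      body !== null &&
      typeof body === "object" &&
      allowedPaths.includes(body.path)
    );
  });
}

// Hypothetical model reply for "More than 1 bed and rating above 91".
const sampleFilter = {
  compound: {
    must: [
      { range: { gt: 1, path: "beds" } },
      { range: { gt: 91, path: "review_scores.review_scores_rating" } },
    ],
  },
};

const allowedPaths = ["beds", "bedrooms", "review_scores.review_scores_rating"];
console.log(isAllowedFilter(sampleFilter, allowedPaths)); // true
console.log(isAllowedFilter({ compound: { must: [{ text: { path: "beds" } }] } }, allowedPaths)); // false
```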

getSearchAIFilter function

Code Snippet

// This function is called by the request handler through context.functions.execute.
// It asks the OpenAI API to generate a filter JSON from the user's free-text input.
exports = async function(sampleDoc, search) {
    // URL to make the request to the OpenAI API.
    const url = 'https://api.openai.com/v1/chat/completions';

    // Fetch the OpenAI key stored in the context values.
    const openai_key = context.values.get("openAIKey");

    // Convert the sample document to string format.
    let syntDocs = JSON.stringify(sampleDoc);
    console.log(syntDocs);

    // Prepare the request string for the OpenAI API.
    const reqString = `Convert programmatic command to Atlas $search filter only for range and equals JS:\n\nExample: Based on document structure {"siblings" : "...", "dob" : "..."} give me the filter of all people born 2015 and siblings are 3 \nOutput: {"filter":{ "compound" : { "must" : [ {"range": {"gte": 2015, "lte" : 2015,"path": "dob"} },{"equals" : {"value" : 3 , "path" : "siblings"}}]}}} \n\n provide the needed filter to accommodate ${search}, pick a path from structure ${syntDocs}. Need just the json object with a range or equal operators. No explanation. No 'Output:' string in response. Valid JSON.`;
    console.log(`reqString: ${reqString}`);

    // Call OpenAI API to get the response.
    // Header values are passed as arrays of strings, as in the request handler above.
    let resp = await context.http.post({
        url: url,
        headers: {
            'Authorization': [`Bearer ${openai_key}`],
            'Content-Type': ['application/json']
        },
        body: JSON.stringify({
            model: "gpt-4",
            temperature: 0.1,
            messages: [
                {
                    "role": "system",
                    "content": "Output  filter json generator follow only provided rules"
                },
                {
                    "role": "user",
                    "content": reqString
                }
            ]
        })
    });

    // Parse the JSON response.
    let responseData = JSON.parse(resp.body.text());

    // Check the response status.
    if (resp.statusCode === 200) {
        console.log("Successfully received code.");
        console.log(JSON.stringify(responseData));

        // The model's reply is a string that should contain the filter JSON.
        const code = responseData.choices[0].message.content;
        let parsedCommand;
        try {
            parsedCommand = EJSON.parse(code);
        } catch (err) {
            // The model returned something that is not valid JSON; fall back to no filter.
            console.error("Failed to parse filter JSON: " + err.message);
            return {};
        }
        console.log('parsed' + JSON.stringify(parsedCommand));

        // If the filter exists and it's not an empty object, return it.
        if (parsedCommand.filter && Object.keys(parsedCommand.filter).length !== 0) {
            return parsedCommand.filter;
        }

        // If there's no valid filter, return an empty object.
        return {};
    } else {
        console.error("Failed to generate filter JSON.");
        console.log(JSON.stringify(responseData));
        return {};
    }
};
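
Even with a "Valid JSON" instruction, chat models sometimes wrap their reply in a Markdown code fence or prefix it with a label. One possible way to make the parsing step more tolerant is sketched below. The helper name `parseModelJson` is hypothetical, and the clean-up rules are assumptions about common reply shapes rather than documented model behavior.

```javascript
// Three backticks, built at runtime so this example stays fence-safe.
const FENCE = "`".repeat(3);

// Try to parse a model reply as JSON, tolerating an "Output:" prefix
// and a surrounding Markdown code fence. Returns null on failure.
function parseModelJson(content) {
  let text = String(content).trim();
  if (text.startsWith("Output:")) {
    text = text.slice("Output:".length).trim();
  }
  if (text.startsWith(FENCE)) {
    // Drop the opening fence line (which may carry a language tag).
    const firstNewline = text.indexOf("\n");
    text = firstNewline === -1 ? "" : text.slice(firstNewline + 1);
    // Drop the closing fence, if present.
    if (text.trimEnd().endsWith(FENCE)) {
      text = text.trimEnd().slice(0, -FENCE.length);
    }
  }
  try {
    return JSON.parse(text);
  } catch (err) {
    return null;
  }
}

const fenced = FENCE + "json\n" + '{"filter":{"compound":{"must":[]}}}' + "\n" + FENCE;
console.log(JSON.stringify(parseModelJson(fenced)));
// {"filter":{"compound":{"must":[]}}}
console.log(parseModelJson("Sorry, I cannot help with that.")); // null
```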

MongoDB search and filters

The function then constructs a MongoDB search query using the embedded representation of the search term and any additional filters provided by the user. This query is sent to MongoDB, and the function returns the matching listings as the response, for example for a search of “New York high floor” with the filter “More than 1 bed and rating above 91.”
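
A minimal sketch of how these pieces fit into one aggregation pipeline follows. The helper name `buildSearchPipeline` is hypothetical. The index name, path, and sizes mirror the request handler, with the number of returned documents expressed through the `limit` field of `$vectorSearch`.

```javascript
// Assemble the aggregation pipeline for a vector search with an optional filter.
function buildSearchPipeline(embedding, filter) {
  const vectorSearch = {
    index: "default",
    path: "doc_embedding",
    queryVector: embedding,
    numCandidates: 1000, // nearest neighbors considered during the search
    limit: 100, // documents returned by the $vectorSearch stage
  };
  // Only attach a filter when one was actually produced.
  if (filter && Object.keys(filter).length > 0) {
    vectorSearch.filter = filter;
  }
  return [{ $vectorSearch: vectorSearch }, { $limit: 50 }];
}

// Toy embedding and filter (hypothetical values).
const pipeline = buildSearchPipeline([0.1, 0.2, 0.3], {
  $and: [{ beds: { $gte: 2 } }],
});
console.log(JSON.stringify(pipeline, null, 2));
```

In an Atlas Function, the resulting array would be passed to `collection.aggregate(...)`, as the request handler does.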

Conclusion

This approach allows us to leverage the power of OpenAI's GPT-4 model to interpret free text input and MongoDB Atlas Vector Search to return highly relevant search results. The use of natural language processing and AI brings a level of flexibility and intuitiveness to the search function that greatly enhances the user experience.

Remember, however, this is an advanced implementation. Ensure you have a good understanding of how MongoDB and OpenAI operate before attempting to implement a similar solution. Always take care to handle sensitive data appropriately and ensure your AI use aligns with OpenAI's use case policy.
