Prakul Agarwal


Checkpointers and Native Parent Child Retrievers with LangChain and MongoDB

MongoDB and LangChain, the company known for its eponymous large language model (LLM) application framework, are excited to announce new developments in an already strong partnership. Two additional enhancements have just been added to the LangChain codebase, making it easier than ever to build cutting-edge AI solutions with MongoDB.

Checkpointer support

In LangGraph, LangChain’s library for building stateful, multi-actor applications with LLMs, memory is provided through checkpointers. Checkpointers are snapshots of the graph state at a given point in time. They provide a persistence layer, allowing developers to interact with and manage the graph’s state. This has a number of advantages for developers: human-in-the-loop workflows, "memory" between interactions, and more.

Figure adapted from “Launching Long-Term Memory Support in LangGraph”. LangChain Blog. Oct. 8, 2024. https://blog.langchain.dev/launching-long-term-memory-support-in-langgraph/

MongoDB has developed a custom checkpointer implementation, the MongoDBSaver class, which, with just a MongoDB URI (local or Atlas), can easily store LangGraph state in MongoDB. By making checkpointers a first-class feature, developers can have confidence that their stateful AI applications built on MongoDB will be performant. Better still, this implementation actually ships two new checkpointers, one synchronous and one asynchronous, serving developers across a wide range of use cases. Both include helpful utility functions that make them painless to use, letting developers easily store instances of StateGraph inside MongoDB. A performant persistence layer that stores data in an intuitive way means a better end-user experience and a more robust system, no matter what a developer is building with LangGraph.

Native parent child retrievers

Second, MongoDB has implemented a native parent child retriever inside LangChain. This approach enhances the performance of retrieval methods using the retrieval-augmented generation (RAG) technique by providing the LLM with a broader context to consider. In essence, we divide the original documents into relatively small chunks, embed each one, and store them in MongoDB. Using such small chunks (a sentence or a couple of sentences) helps embedding models better capture their meaning. Now developers can use MongoDBAtlasParentDocumentRetriever to persist one collection for both vector and document storage. In this implementation, we store both parent and child documents in a single collection while only having to compute and index embedding vectors for the chunks. This has a number of performance advantages: storing vectors with their associated documents means there is no need to join tables or worry about painful schema migrations.

Additionally, as part of this work, MongoDB has also added a MongoDBDocStore class, which provides many helpful utility functions. It is now easier than ever to use documents as a key-value store and insert, update, and delete them with ease. Taken together, these two new classes allow developers to take full advantage of MongoDB’s capabilities. MongoDB and LangChain continue to be a strong pair for building agentic AI, combining performance and ease of development to provide a developer-friendly experience. Stay tuned as we build out additional functionality!
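To make this concrete, here is a minimal sketch of wiring the checkpointer into a LangGraph app. It assumes the langgraph-checkpoint-mongodb package; the connection string, node logic, and thread id are placeholders, and exact import paths may vary by version.

```python
# Minimal sketch: persisting LangGraph state in MongoDB with MongoDBSaver.
# Assumes: pip install langgraph langgraph-checkpoint-mongodb
from langchain_core.messages import AIMessage
from langgraph.checkpoint.mongodb import MongoDBSaver
from langgraph.graph import StateGraph, MessagesState, START, END

MONGODB_URI = "mongodb://localhost:27017"  # placeholder: local or Atlas URI

def respond(state: MessagesState):
    # Placeholder node; a real app would call an LLM here.
    return {"messages": [AIMessage(content="hello")]}

builder = StateGraph(MessagesState)
builder.add_node("respond", respond)
builder.add_edge(START, "respond")
builder.add_edge("respond", END)

with MongoDBSaver.from_conn_string(MONGODB_URI) as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)
    # thread_id scopes the persisted state: a later invoke with the same
    # id resumes the conversation from the stored checkpoint.
    config = {"configurable": {"thread_id": "session-1"}}
    graph.invoke({"messages": [("user", "hi")]}, config)
```

And a similarly hedged sketch of the parent document retriever. The database, collection, and chunk-size values are illustrative, and constructor arguments may differ slightly between langchain-mongodb releases:

```python
# Minimal sketch: parent child retrieval over a single Atlas collection.
# Assumes: pip install langchain-mongodb langchain-openai
from langchain_core.documents import Document
from langchain_mongodb.retrievers.parent_document import (
    MongoDBAtlasParentDocumentRetriever,
)
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

retriever = MongoDBAtlasParentDocumentRetriever.from_connection_string(
    connection_string="<ATLAS_CONNECTION_STRING>",  # placeholder
    embedding_model=OpenAIEmbeddings(model="text-embedding-3-small"),
    child_splitter=RecursiveCharacterTextSplitter(chunk_size=200),
    database_name="langchain_db",       # illustrative
    collection_name="parent_children",  # illustrative
)

# Ingest full documents; only the small child chunks are embedded and
# indexed, but queries return the larger parent documents for context.
docs = [Document(page_content="Our refund policy lasts 30 days from purchase.")]
retriever.add_documents(docs)
results = retriever.invoke("What is the refund policy?")
```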
To learn more about these LangChain integrations, here are some resources to get you started: Check out our tutorial. Experiment with checkpointers and native parent child retrievers to see their utility for yourself. Read the previous announcement with LangChain about AI Agents, Hybrid Search, and Indexing.

December 16, 2024

Announcing Hybrid Search Support for LlamaIndex

MongoDB is excited to announce enhancements to our LlamaIndex integration. By combining MongoDB’s robust database capabilities with LlamaIndex’s innovative framework for context-augmented large language models (LLMs), the enhanced MongoDB-LlamaIndex integration unlocks new possibilities for generative AI development. Specifically, it supports vector (powered by Atlas Vector Search), full-text (powered by Atlas Search), and hybrid search, enabling developers to blend precise keyword matching with semantic search for more context-aware applications, depending on their use case.

Building AI applications with LlamaIndex

LlamaIndex is one of the world’s leading AI frameworks for building with LLMs. It streamlines the integration of external data sources, allowing developers to combine LLMs with relevant context from various data formats. This makes it ideal for building application features like retrieval-augmented generation (RAG), where accurate, contextual information is critical. LlamaIndex empowers developers to build smarter, more responsive AI systems while reducing the complexities involved in data handling and query management. Advantages of building with LlamaIndex include:

- Simplified data ingestion with connectors that integrate structured databases, unstructured files, and external APIs, removing the need for manual processing or format conversion.
- Organizing data into structured indexes or graphs, significantly enhancing query efficiency and accuracy, especially when working with large or complex datasets.
- An advanced retrieval interface that responds to natural language prompts with contextually enhanced data, improving accuracy in tasks like question-answering, summarization, or data retrieval.
- Customizable APIs that cater to all skill levels: high-level APIs enable quick data ingestion and querying for beginners, while lower-level APIs offer advanced users full control over connectors and query engines for more complex needs.

MongoDB's LlamaIndex integration

Developers can build powerful AI applications using LlamaIndex as a foundational AI framework alongside MongoDB Atlas as the long-term memory database. With MongoDB’s developer-friendly document model and powerful vector search capabilities within MongoDB Atlas, developers can easily store and search vector embeddings for building RAG applications. And because of MongoDB’s low-latency transactional persistence capabilities, developers can do much more with the MongoDB integration in LlamaIndex to build AI applications in an enterprise-grade manner.

LlamaIndex's flexible architecture supports customizable storage components, allowing developers to leverage MongoDB Atlas as a powerful vector store and a key-value store. By using Atlas Vector Search capabilities, developers can:

- Store and retrieve vector embeddings efficiently (llama-index-vector-stores-mongodb)
- Persist ingested documents (llama-index-storage-docstore-mongodb)
- Maintain index metadata (llama-index-storage-index-store-mongodb)
- Store key-value pairs (llama-index-storage-kvstore-mongodb)

Figure adapted from Liu, Jerry and Agarwal, Prakul (May 2023). “Build a ChatGPT with your Private Data using LlamaIndex and MongoDB”. Medium. https://medium.com/llamaindex-blog/build-a-chatgpt-with-your-private-data-using-llamaindex-and-mongodb-b09850eb154c

Adding hybrid and full-text search support

Developers may use different approaches to search for different use cases.
Full-text search retrieves documents by matching exact keywords or linguistic variations, making it efficient for quickly locating specific terms within large datasets, such as in legal document review where exact wording is critical. Vector search, on the other hand, finds content that is ‘semantically’ similar, even if it does not contain the same keywords. Hybrid search combines full-text search with vector search to identify both exact matches and semantically similar content. This approach is particularly valuable in advanced retrieval systems or AI-powered search engines, enabling results that are both precise and aligned with the needs of the end-user.

This integration makes it simple for developers to try out powerful retrieval capabilities on their data and improve the accuracy of their AI applications. In the LlamaIndex integration, the MongoDBAtlasVectorSearch class is used for vector search. To enable full-text search, use VectorStoreQueryMode.TEXT_SEARCH in the same class; similarly, to use hybrid search, enable VectorStoreQueryMode.HYBRID (a short sketch follows at the end of this post). To learn more, check out the GitHub repository.

With the MongoDB-LlamaIndex integration’s support, developers no longer need to navigate the intricacies of Reciprocal Rank Fusion implementation or determine the optimal way to combine vector and text searches: we’ve taken care of the complexities for you. The integration also includes sensible defaults and robust support, ensuring that building advanced search capabilities into AI applications is easier than ever. This means that MongoDB handles the intricacies of storing and querying your vectorized data, so you can focus on building!

We’re excited for you to work with our LlamaIndex integration. Here are some resources to expand your knowledge on this topic:

- Check out how to get started with our LlamaIndex integration
- Build a content recommendation system using MongoDB and LlamaIndex with our helpful tutorial
- Experiment with building a RAG application with LlamaIndex, OpenAI, and our vector database
- Learn how to build with private data using LlamaIndex, guided by one of its co-founders
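Here is a minimal sketch of switching between the three query modes described above. It assumes the llama-index-vector-stores-mongodb package, pre-created Atlas Vector Search and Atlas Search indexes, and default OpenAI settings for embeddings and the LLM; database, collection, and index names are placeholders, and constructor argument names may differ slightly between versions.

```python
# Minimal sketch: vector, full-text, and hybrid search via one vector store.
# Assumes: pip install llama-index llama-index-vector-stores-mongodb pymongo
import pymongo
from llama_index.core import VectorStoreIndex
from llama_index.core.vector_stores.types import VectorStoreQueryMode
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

client = pymongo.MongoClient("<ATLAS_CONNECTION_STRING>")  # placeholder
store = MongoDBAtlasVectorSearch(
    client,
    db_name="llamaindex_db",             # illustrative
    collection_name="documents",         # illustrative
    vector_index_name="vector_index",    # Atlas Vector Search index
    fulltext_index_name="search_index",  # Atlas Search index
)
index = VectorStoreIndex.from_vector_store(store)

# Default mode is pure vector search; TEXT_SEARCH uses Atlas Search only,
# and HYBRID fuses both result sets.
query_engine = index.as_query_engine(
    vector_store_query_mode=VectorStoreQueryMode.HYBRID,
    similarity_top_k=5,
)
print(query_engine.query("papers on coral reefs and ocean acidification"))
```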

October 17, 2024

AI Agents, Hybrid Search, and Indexing with LangChain and MongoDB

Since announcing our integration with LangChain last year, MongoDB has been building out tooling to help developers create advanced AI applications with LangChain. With recent releases, MongoDB has made it easier to develop agentic AI applications (with a LangGraph integration), perform hybrid search by combining Atlas Search and Atlas Vector Search, and ingest large-scale documents more effectively. For more on each development, plus new support for the LangChain Indexing API, please read on!

The rise of AI agents

Agentic applications have emerged as a compelling next step in the development of AI. Imagine an application able to act on its own, working towards complicated goals and drawing on context to create a strategy. These applications leverage large language models (LLMs) to dynamically determine their execution path, breaking free from the constraints of traditional, deterministic logic. Consider an application tasked with answering a question like "In our most profitable market, what is the current weather?" While a traditional retrieval-augmented generation (RAG) app may falter, unable to obtain information about “current weather,” an agentic application shines. The application can intelligently deduce the need for an external API call to obtain current weather information, seamlessly integrating this with data retrieved from a vector search to identify the most profitable market. These systems take action and gather additional information with limited human intervention, supplementing what they already know. Building such a system is easier than ever thanks to MongoDB’s continued work with LangGraph.

Unleashing the power of AI agents with LangGraph and MongoDB

Because it now offers LangGraph, a framework for performing multi-agent orchestration, LangChain is more effective than ever at simplifying the creation of applications using LLMs, including AI agents. These agents require memory to maintain context across multiple interactions, allowing users to engage with them repeatedly while the agent retains information from previous exchanges. While basic agentic applications can use in-memory structures, these are not sufficient for more complicated use cases. MongoDB allows developers to build stateful, multi-actor applications with LLMs, storing and retrieving the “checkpoints” needed by LangGraph.js. The new MongoDBSaver class makes integration simpler than ever before, as LangGraph.js is able to utilize historical user interactions to enhance agentic AI. By segmenting this history into checkpoints, the library allows for persistent session memory, easier error recovery, and even the ability to “time travel,” letting users jump back in the graph to a previous state to explore alternative execution. The MongoDBSaver class implements all of this functionality right into LangGraph.js, with sensible defaults and MongoDB-specific optimizations. To learn more, please visit the source code, the documentation, and our new tutorial (which includes both a written and video version).

Improve retrieval accuracy with Hybrid Search Retriever

Hybrid search is particularly well-suited for queries that have both semantic and keyword-based components. Consider a query such as "find recent scientific papers about climate change impacts on coral reefs that specifically mention ocean acidification".
This query would use a hybrid search approach, combining semantic search to identify papers discussing climate change effects on coral ecosystems, keyword matching to ensure "ocean acidification" is mentioned, and potential date-based filtering or boosting to prioritize recent publications. This combination allows for more comprehensive and relevant results than either semantic or keyword search alone could provide.

With our recent release of Retrievers in LangChain-MongoDB, building such advanced retrieval patterns is more accessible than ever. Retrievers are how LangChain integrates external data sources into LLM applications. MongoDB has added two new custom, purpose-built Retrievers to the langchain-mongodb Python package, giving developers a unified way to perform hybrid search and full-text search with sensible defaults and extensive code annotation. These new classes make it easier than ever to use the full capabilities of MongoDB Vector Search with LangChain. The new MongoDBAtlasFullTextSearchRetriever class performs full-text searches using the Best Match 25 (BM25) analyzer. The MongoDBAtlasHybridSearchRetriever class builds on this work, combining the above implementation with vector search and fusing the results with the Reciprocal Rank Fusion (RRF) algorithm (short sketches follow at the end of this post). The combination of these two techniques is a potent tool for improving the retrieval step of a RAG application, enhancing the quality of the results. To find out more, please dive into the MongoDBAtlasHybridSearchRetriever and MongoDBAtlasFullTextSearchRetriever classes.

Seamless synchronization using LangChain Indexing API

In addition to these releases, we’re also excited to announce that MongoDB now supports the LangChain Indexing API, allowing for seamless loading and synchronization of documents from any source into MongoDB, leveraging LangChain's intelligent indexing features. This new support will help users avoid duplicate content, minimize unnecessary rewrites, and optimize embedding computations. The LangChain Indexing API's record management system ensures efficient tracking of document writes, computing hashes for each document and storing essential information like write time and source ID. This feature is particularly valuable for large-scale document processing and retrieval applications, offering flexible cleanup modes to manage documents effectively in MongoDB Atlas Vector Search. To read more about how to use the Indexing API, please visit the LangChain Indexing API documentation.

We’re excited about these LangChain integrations and we hope you are too. Here are some resources to further your learning:

- Check out our written and video tutorial to walk you through building your own JavaScript AI agent with LangGraph.js and MongoDB.
- Experiment with hybrid search retrievers to see the power of hybrid search for yourself.
- Read the previous announcement with LangChain about Semantic Caching.
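As a rough illustration, here is a hedged sketch of the hybrid search retriever; the connection string, namespace, and index names are placeholders, and argument names may vary across langchain-mongodb versions.

```python
# Minimal sketch: hybrid search (vector + BM25 full-text, fused with RRF).
# Assumes: pip install langchain-mongodb langchain-openai
# plus an Atlas Vector Search index and an Atlas Search index on the collection.
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_mongodb.retrievers import MongoDBAtlasHybridSearchRetriever
from langchain_openai import OpenAIEmbeddings

vector_store = MongoDBAtlasVectorSearch.from_connection_string(
    connection_string="<ATLAS_CONNECTION_STRING>",  # placeholder
    namespace="papers_db.papers",                   # illustrative db.collection
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    index_name="vector_index",                      # Atlas Vector Search index
)

retriever = MongoDBAtlasHybridSearchRetriever(
    vectorstore=vector_store,
    search_index_name="search_index",  # Atlas Search (full-text) index
    top_k=5,
)
docs = retriever.invoke(
    "climate change impacts on coral reefs mentioning ocean acidification"
)
```

And a sketch of the Indexing API flow. Note the MongoDB-backed record manager import path below is our assumption, so check the langchain-mongodb docs for the exact class and module:

```python
# Hedged sketch of the LangChain Indexing API writing into MongoDB.
from langchain.indexes import index
from langchain_core.documents import Document
from langchain_mongodb.index import MongoDBRecordManager  # assumed import path

record_manager = MongoDBRecordManager.from_connection_string(
    connection_string="<ATLAS_CONNECTION_STRING>",  # placeholder
    namespace="papers_db.docs_record_manager",      # illustrative
)

loaded_docs = [Document(page_content="...", metadata={"source": "a.txt"})]

# "incremental" cleanup de-duplicates by content hash and removes stale
# versions of changed documents, minimizing re-embedding work.
index(
    docs_source=loaded_docs,
    record_manager=record_manager,
    vector_store=vector_store,  # from the sketch above
    cleanup="incremental",
    source_id_key="source",
)
```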

September 12, 2024

Introducing Semantic Caching and a Dedicated MongoDB LangChain Package for Gen AI Apps

We are in an unprecedented time in history where developers can build transformative AI applications quickly, without being AI experts themselves. This ability is enabling new classes of applications that can better serve customers with conversational AI for assistance and automation, advanced reasoning and analysis using AI-powered retrieval, and recommendation systems.

Behind this revolution are large language models (LLMs) that can be prompted to solve a wide range of use cases. However, LLMs have various limitations, like knowledge cutoffs and a tendency to hallucinate. To overcome these limitations, they must be integrated with proprietary enterprise data sources to build reliable, relevant, and high-quality generative AI applications. That’s where MongoDB plays a critical role in the modern generative AI stack. Developers use MongoDB Atlas Vector Search as a vital part of the generative AI technique known as retrieval-augmented generation (RAG). RAG is the process of feeding LLMs the supplementary data necessary to ground their responses, ensuring they're dependable and precise. LangChain has been a critical part of this journey since the public launch of Atlas Vector Search, enabling developers to build better retriever systems powered by vector search and store conversation history in the operational database. Today, we are excited to announce support for two enhancements:

- Semantic cache powered by Atlas Vector Search, which improves the performance of your apps
- A dedicated LangChain-MongoDB package for Python and JS/TS developers, enabling them to build advanced applications even more efficiently

The MongoDB Atlas integration with LangChain can now power all the database requirements for building modern generative AI applications: vector search, semantic caching (currently only available in Python), and conversation history. Earlier, we announced the launch of MongoDB LangChain Templates, which enable developers to quickly deploy RAG applications, and provided a reference implementation of a basic RAG template using MongoDB Atlas Vector Search and OpenAI, and a more advanced parent-document retrieval RAG template using MongoDB Atlas Vector Search. We are excited about our partnership with LangChain and will continue innovating.

Improve LLM application performance with semantic cache

Semantic cache improves the performance of LLM applications by caching responses based on the semantic meaning or context within the queries themselves. This is different from a traditional cache that works based on exact keyword matching. In the era of LLMs, the value of semantic caching is increasing tremendously, enabling sophisticated user experiences that closely mimic human interactions. For example, if two different users enter the prompts “give me suggestions for a comedy movie” and “recommend a comedy movie”, the semantic cache can understand that the intent behind the queries is the same and return a similar response, even though different keywords are used, whereas a traditional cache will fail (a configuration sketch appears at the end of this post).

Figure 1: Semantic cache using MongoDB Atlas Vector Search

Check out the video walkthrough for the semantic cache.

Accelerate development with a dedicated package

With a dedicated LangChain-MongoDB package, MongoDB is even more deeply integrated with LangChain. The Python and JavaScript packages contain the following LangChain integrations: MongoDBAtlasVectorSearch (Vector stores) and MongoDBChatMessageHistory (Chat Messages Memory).
In addition, the Python package includes MongoDBAtlasSemanticCache (LLM Caching). The new langchain-mongodb package contains all the MongoDB-specific implementations and needs to be installed separately from langchain, which includes all the core abstractions. Earlier, everything was in the same package, making it challenging to version correctly and to communicate which version should be used and whether any breaking changes were made.

Find out more about the langchain-mongodb package:

- Python: Source code, LangChain docs, MongoDB docs
- JavaScript: Source code, LangChain.js docs, MongoDB docs

Get started today

- Check out this accompanying tutorial and notebook on building advanced RAG with MongoDB and LangChain, which contains a walkthrough and use cases for using semantic cache, vector search, and chat message history.
- Check out the “PDFtoChat” app to see langchain-mongodb JS in action. It allows you to have a conversation with your proprietary PDFs using AI and is built with MongoDB Atlas, LangChain.js, and TogetherAI. It’s an end-to-end SaaS-in-a-box app and includes user authentication, saving PDFs, and saving chats per PDF.
- Read the excellent overview of semantic caching using LangChain and MongoDB.
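To see what wiring up the semantic cache looks like in practice, here is a minimal sketch; the connection string, database, collection, and index names are placeholders, and it assumes an Atlas Vector Search index over the cache collection.

```python
# Minimal sketch: semantic caching of LLM responses with Atlas Vector Search.
# Assumes: pip install langchain langchain-mongodb langchain-openai
from langchain_core.globals import set_llm_cache
from langchain_mongodb import MongoDBAtlasSemanticCache
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

set_llm_cache(
    MongoDBAtlasSemanticCache(
        connection_string="<ATLAS_CONNECTION_STRING>",  # placeholder
        embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
        database_name="langchain_db",     # illustrative
        collection_name="semantic_cache", # illustrative
        index_name="vector_index",        # index on the cache collection
    )
)

llm = ChatOpenAI(model="gpt-4o-mini")  # any chat model works
llm.invoke("give me suggestions for a comedy movie")  # cache miss: calls the LLM
# A semantically similar prompt can now be served from the cache instead.
llm.invoke("recommend a comedy movie")
```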

March 20, 2024

Announcing LangChain Templates for MongoDB Atlas

Since announcing the public preview of MongoDB Atlas Vector Search back in June, we’ve seen tremendous adoption by developers working to build AI-powered applications. The ability to store, index, and query vector embeddings right alongside their operational data in a single, unified platform dramatically boosts engineering velocity while keeping their technology footprint streamlined and efficient.

Atlas Vector Search is used by developers as a key part of the retrieval-augmented generation (RAG) pattern. RAG is used to feed LLMs with the additional data they need to ground their responses, providing outputs that are reliable, relevant, and accurate for the business. One of the key enabling technologies being used to bring external data into LLMs is LangChain. Just one example is healthcare innovator Inovaare, which is building AI with MongoDB and LangChain for document classification, information extraction and enrichment, and chatbots over medical data.

Now, making it even easier for developers to build AI-powered apps, we are excited to announce our partnership with LangChain in the launch of LangChain Templates! We have worked with LangChain to create a RAG template using MongoDB Atlas Vector Search and OpenAI. This easy-to-use template can help developers build and deploy a chatbot application over their own proprietary data. LangChain Templates offer a reference architecture that’s easily deployable as a REST API using LangServe.

We have also been working with LangChain to release the latest features of Atlas Vector Search, like the recently announced dedicated vector search aggregation stage, $vectorSearch, in both the MongoDB LangChain Python integration and the MongoDB LangChain JavaScript integration (see the sketch below). Similarly, we will continue working with LangChain to create more templates that allow developers to bring their ideas to production faster.

If you’re building AI-powered apps on MongoDB, we’d love to hear from you. Sign up for our AI Innovators program, where successful applicants receive no-cost MongoDB Atlas credits to develop apps, access to technical resources, and the opportunity to showcase their work to the broader AI community.
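For reference, here is a minimal sketch of the $vectorSearch aggregation stage itself, run directly with PyMongo; the connection string, index, field, and collection names are placeholders, and the query vector is assumed to come from your embedding model.

```python
# Minimal sketch: querying Atlas Vector Search with the $vectorSearch stage.
# Assumes: pip install pymongo, an Atlas cluster, and a vector index named
# "vector_index" over the "embedding" field of the collection.
import pymongo

client = pymongo.MongoClient("<ATLAS_CONNECTION_STRING>")  # placeholder
collection = client["my_db"]["my_collection"]              # illustrative

query_vector = [0.01, -0.02, 0.03]  # stand-in; use your embedding model's output

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 100,  # breadth of the ANN candidate pool
            "limit": 5,            # number of results returned
        }
    },
    {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
]
for doc in collection.aggregate(pipeline):
    print(doc)
```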

November 2, 2023

MongoDB Atlas Vector Search Makes Real-Time AI a Reality with Confluent

Today, we’re excited to announce our new integration with Confluent Cloud. MongoDB Atlas Vector Search users now have simple access to data streams across their entire business, enabling them to build cutting-edge generative AI applications that are grounded in a real-time, contextual, and trustworthy knowledge base. Think of an application like ChatGPT, but one that knows everything about your private enterprise data, with constant awareness of what’s happening in the world and your business right now.

Atlas Vector Search allows you to search intelligently across any unstructured data, using the power of large language models (LLMs). With Confluent’s data streaming platform, you can provide a continuous supply of AI-ready data for the development of sophisticated customer experiences, bridging the gap between legacy data systems and the modern data stack. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

High-value, trusted AI applications require real-time data

Real-time AI needs real-time data from across your organization. The promise of real-time AI is only unlocked when models have all the freshest contextual data they need to respond just in time with the most accurate, relevant, and helpful information. However, building these real-time data connections across on-prem, multi-cloud, public, and private cloud environments for AI use cases is not trivial. Traditional data integration and processing tools are batch-based and inflexible, creating an untenable number of tightly coupled point-to-point connections that are hard to scale and lack governance. As a result, the data made available is stale and of low fidelity. This introduces unavoidable latency into the AI application and may outright block implementation altogether. The difficulty in gaining access to high-quality, ready-to-use, contextual, and trustworthy data in real time is hindering developer agility and the pace of AI innovation.

Confluent's data streaming platform fuels MongoDB Atlas Vector Search with real-time data

With the MongoDB Kafka Connector, users can easily configure MongoDB Atlas as a destination for customer 360 data from Confluent Cloud (a hedged configuration sketch follows below). This data is converted into vector embeddings using various machine learning models (OpenAI, Hugging Face, and more), orchestrated by Atlas Triggers. Then, using Atlas Vector Search, this data can be indexed and searched efficiently to power use cases such as semantic search, recommendation engines, Q&A systems, and many others.

We demonstrate a chatbot for e-commerce that allows users to ask natural language questions to discover what they need and then get recommendations on products to buy that suit their preferences. Some of the data required in this scenario includes the currently available inventory, the shipping options, and the user’s session browsing history. Users can refine their product recommendations using a conversational interface, all the while ensuring that the products being recommended are rooted in real-time data. The benefits of being able to effectively use real-time data are immense, almost critical, in this scenario: recommending a product that’s not available or can’t be delivered to a user’s location in the required time frame would mean a lost sale and a dissatisfied customer. Inventory data changes rapidly, with products going in and out of stock constantly, so the chat assistant application needs to quickly come up with new sets of recommendations.
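As a rough illustration of the first step in that pipeline, here is a hedged sketch of registering the MongoDB sink connector via the Kafka Connect REST API; the endpoint, topic, database, and collection names are placeholders, and fully managed Confluent Cloud deployments configure the same properties through the Confluent UI or CLI instead.

```python
# Minimal sketch: registering a MongoDB sink connector with Kafka Connect,
# so events from a Kafka topic land in an Atlas collection.
import requests

connector = {
    "name": "mongodb-atlas-sink",  # illustrative connector name
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
        "connection.uri": "<ATLAS_CONNECTION_STRING>",  # placeholder
        "database": "ecommerce",         # illustrative
        "collection": "customer360",     # illustrative
        "topics": "customer360-events",  # illustrative Kafka topic
    },
}

# Self-managed Kafka Connect exposes a REST API (default port 8083).
resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
```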
With Confluent, MongoDB Atlas Vector Search users can break down data silos, promote data reusability, improve engineering agility, and foster greater trust throughout their organization. This allows more teams to securely and confidently unlock the full potential of all their data with MongoDB Atlas Vector Search. Confluent enables organizations to make real-time contextual inferences on an astonishing amount of data by bringing well-curated, trustworthy streaming data to AI systems, vector databases, and AI-powered applications. With easy access to data streams from across their entire business, MongoDB Atlas Vector Search users can now:

- Create a real-time knowledge base: Build a shared source of real-time truth for all your operational and analytical data, no matter where it lives, for sophisticated model building and fine-tuning
- Bring real-time context at query time: Convert raw data into meaningful chunks with real-time enrichment and continually update your embedding databases for your GenAI use cases
- Build governed, secured, and trusted AI: Establish data lineage, quality, and traceability, providing all your teams with a clear understanding of data origin, movement, transformations, and usage
- Experiment, scale, and innovate faster: Reduce innovation friction as new AI apps and models become available. Decouple data from your data science tools and production AI apps to test and build faster

MongoDB Atlas Vector Search and Confluent enable simple development of real-time AI applications

Our new Confluent integration enables all your teams to tap into a continuously enriched real-time knowledge base, so they can quickly scale and build AI-enabled applications using trusted data streams. Check out the demo video to see how this works.

Getting started

Get started by creating a MongoDB Atlas account if you don't already have one; just click “Register.” MongoDB offers a free-forever Atlas cluster in the public cloud service of your choice. To learn more about Atlas Vector Search, visit the product page. Not yet a Confluent customer? Start your free trial of Confluent Cloud today. New sign-ups receive $400 to spend during their first 30 days, no credit card required. Head over to our quick-start guide to get started with Atlas Vector Search today.

September 26, 2023