Supercharge AI Data Management With Knowledge Graphs
WhyHow.AI has built and open-sourced a platform using MongoDB, enhancing how organizations leverage knowledge graphs for data management and insights. Integrated with MongoDB, this solution offers a scalable foundation with features like vector search and aggregation to support organizations in their AI journey.
Knowledge graphs address the limitations of traditional retrieval-augmented generation (RAG) systems, which can struggle to capture intricate relationships and contextual nuances in enterprise data. By embedding rules and relationships into a graph structure, knowledge graphs enable accurate and deterministic retrieval processes. This functionality extends beyond information retrieval: knowledge graphs also serve as foundational elements for enterprise memory, helping organizations maintain structured datasets that support future model training and insights.
WhyHow.AI enhances this process by offering tools designed to combine large language model (LLM) workflows with Python- and JSON-native graph management. Built on MongoDB, these tools bring structured and unstructured data together with search, enabling efficient querying and insights across diverse datasets. MongoDB’s modular architecture integrates vector retrieval, full-text search, and graph structures in one place, making it a strong fit for RAG over contextual data.
Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.
Creating and storing knowledge graphs with WhyHow.AI and MongoDB
Creating effective knowledge graphs for RAG requires a structured approach that combines workflows from LLMs, developers, and nontechnical domain experts. Simply capturing all entities and relationships from text and relying on an LLM to organize the data can lead to a messy retrieval process that lacks utility. Instead, WhyHow.AI advocates for a schema-constrained graph creation method, emphasizing the importance of developing a context-specific schema tailored to the user’s use case. This approach ensures that the knowledge graphs focus on the specific relationships that matter most to the user’s workflow.
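As a minimal sketch of what schema-constrained creation can look like, the Python below keeps only the candidate triples whose (head type, relation, tail type) pattern appears in a use-case-specific schema. The `Triple` class, the `ALLOWED_PATTERNS` set, and the sample healthcare data are all hypothetical and chosen for illustration; they are not WhyHow.AI's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A candidate fact, e.g. as proposed by an LLM extraction step."""
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str


# Hypothetical schema: the only relationship patterns this use case cares about.
ALLOWED_PATTERNS = {
    ("Guideline", "recommends", "Treatment"),
    ("Treatment", "treats", "Condition"),
    ("Patient", "diagnosed_with", "Condition"),
}


def constrain_to_schema(triples, allowed_patterns):
    """Keep only triples whose type pattern is declared in the schema."""
    return [
        t for t in triples
        if (t.head_type, t.relation, t.tail_type) in allowed_patterns
    ]


candidates = [
    Triple("Guideline 2025-A", "Guideline", "recommends", "Drug X", "Treatment"),
    Triple("Drug X", "Treatment", "treats", "Hypertension", "Condition"),
    # Off-schema: extracted from the text, but irrelevant to this workflow.
    Triple("Guideline 2025-A", "Guideline", "published_in", "Journal Y", "Journal"),
]

kept = constrain_to_schema(candidates, ALLOWED_PATTERNS)
for t in kept:
    print(f"{t.head} -[{t.relation}]-> {t.tail}")
```

Filtering after extraction is only one way to apply a schema; the same patterns could also be given to the LLM up front so that off-schema triples are never proposed.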
Once the knowledge graphs are created, the flexibility of MongoDB’s schema design ensures that users are not confined to rigid structures. This adaptability enables seamless expansion and evolution of knowledge graphs as data and use cases develop. Organizations can rapidly iterate during early application development without being restricted by predefined schemas. In instances where additional structure is required, MongoDB supports schema enforcement, offering a balance between flexibility and data integrity.
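Where more structure is wanted, MongoDB's `$jsonSchema` validation can be attached to a collection. The sketch below assumes a hypothetical `triple` collection and illustrative field names (`head_node`, `type`, `tail_node`, `graph`, `embedding`); none of these are confirmed by WhyHow.AI's documentation. The `db` argument is expected to be a PyMongo `Database` object.

```python
# Hypothetical validator; field names are assumptions made for illustration.
TRIPLE_VALIDATOR = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["head_node", "type", "tail_node", "graph"],
        "properties": {
            "head_node": {"bsonType": "objectId"},
            "type": {"bsonType": "string", "minLength": 1},
            "tail_node": {"bsonType": "objectId"},
            "graph": {"bsonType": "objectId"},
            "embedding": {
                "bsonType": "array",
                "items": {"bsonType": "double"},
            },
        },
    }
}


def enforce_triple_schema(db, collection_name="triple"):
    """Attach the validator to an existing collection with the collMod command.

    validationLevel="moderate" checks inserts and updates to documents that
    already satisfy the schema, so older documents that do not conform can
    still be updated while the schema is being tightened.
    """
    return db.command(
        "collMod",
        collection_name,
        validator=TRIPLE_VALIDATOR,
        validationLevel="moderate",
        validationAction="error",
    )
```

With a live connection this would be called as `enforce_triple_schema(client["my_database"])`, where `my_database` is a placeholder name.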
For instance, aligning external research with patient records is crucial to delivering personalized healthcare. Knowledge graphs bridge the gap between clinical trials, best practices, and individual patient histories. New clinical guidelines can be integrated with patient records to identify which patients would benefit most from updated treatments, ensuring that the latest practices are applied to individual care plans.
Optimizing knowledge graph storage and retrieval with MongoDB
Harnessing the full potential of knowledge graphs requires both effective creation tools and robust systems for storage and retrieval. Here’s how WhyHow.AI and MongoDB work together to optimize the management of knowledge graphs.
Storing data in MongoDB
WhyHow.AI relies on MongoDB’s document-oriented structure to organize knowledge graph data into modular, purpose-specific collections, enabling efficient and flexible queries. This approach is crucial for managing complex entity relationships and ensuring accurate provenance tracking.
To support this functionality, the WhyHow.AI Knowledge Graph Studio comprises several key components:
Workspaces separate documents, schemas, graphs, and associated data by project or domain, maintaining clarity and focus.
Chunks are raw text segments with embeddings for similarity searches, linked to triples and documents to provide evidence and provenance.
Graph collection stores the knowledge graph along with metadata and schema associations, all organized by workspace for centralized data management.
Schemas define the entities, relationships, and patterns within graphs, adapting dynamically to reflect new data and keep the graph relevant.
Nodes represent entities like people, locations, or concepts, each with unique identifiers and properties, forming the graph’s foundation.
Triples define subject-predicate-object relationships and store embedded vectors for similarity searches, enabling reliable retrieval of relevant facts.
Queries log user queries, including triple results and metadata, providing an immutable history for analysis and optimization.
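To make the relationships between these components concrete, here is a rough sketch of how a few of the documents might reference one another. The field names and string identifiers are assumptions made for readability (a real deployment would typically use ObjectId values), and the exact shapes used by the Knowledge Graph Studio may differ.

```python
# Hypothetical document shapes; every field name here is an assumption.
document = {"_id": "doc-1", "workspace": "ws-1", "filename": "guideline.pdf"}

chunk = {
    "_id": "chunk-1",
    "workspace": "ws-1",
    "document": "doc-1",              # provenance: which file the text came from
    "content": "Drug X is recommended for adults with hypertension.",
    "embedding": [0.12, 0.80, 0.35],  # toy vector for illustration
}

nodes = {
    "node-1": {"_id": "node-1", "graph": "graph-1", "type": "Treatment", "name": "Drug X"},
    "node-2": {"_id": "node-2", "graph": "graph-1", "type": "Condition", "name": "Hypertension"},
}

triple = {
    "_id": "triple-1",
    "graph": "graph-1",
    "head_node": "node-1",            # subject
    "type": "treats",                 # predicate
    "tail_node": "node-2",            # object
    "chunks": ["chunk-1"],            # evidence supporting this fact
    "embedding": [0.10, 0.75, 0.40],
}


def dangling_references(triple, nodes, chunks):
    """Return ids referenced by a triple that do not resolve to a stored document."""
    missing = [
        node_id
        for node_id in (triple["head_node"], triple["tail_node"])
        if node_id not in nodes
    ]
    missing += [chunk_id for chunk_id in triple["chunks"] if chunk_id not in chunks]
    return missing


print(dangling_references(triple, nodes, {chunk["_id"]: chunk}))
```

Keeping chunk references on the triple is what lets a retrieved fact be traced back to the text, and from there to the source document, that supports it.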
Figure 1. WhyHow.AI platform and knowledge graph illustration.
To enhance data interoperability, MongoDB’s aggregation framework enables efficient linking across collections. For instance, retrieving chunks associated with a specific triple can be seamlessly achieved through an aggregation pipeline, connecting workspaces, graphs, chunks, and document collections into a cohesive data flow.
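A pipeline of that kind might look like the following sketch, which uses the standard `$match`, `$lookup`, and `$project` stages. The collection names (`triple`, `chunk`) and field names (`chunks`, `content`, `document`) are assumptions for illustration rather than the platform's confirmed layout.

```python
def chunks_for_triple_pipeline(triple_id):
    """Build an aggregation pipeline that joins one triple to its evidence chunks."""
    return [
        # 1. Select the triple of interest.
        {"$match": {"_id": triple_id}},
        # 2. Join against the chunk collection. When localField is an array,
        #    $lookup matches each of its elements against foreignField.
        {
            "$lookup": {
                "from": "chunk",
                "localField": "chunks",
                "foreignField": "_id",
                "as": "chunk_docs",
            }
        },
        # 3. Return the fact plus only the chunk fields needed for provenance.
        {
            "$project": {
                "head_node": 1,
                "type": 1,
                "tail_node": 1,
                "chunk_docs.content": 1,
                "chunk_docs.document": 1,
            }
        },
    ]


pipeline = chunks_for_triple_pipeline("triple-1")
print([next(iter(stage)) for stage in pipeline])
```

With PyMongo, the pipeline would be run as `db["triple"].aggregate(pipeline)`. A further `$lookup` from `chunk_docs.document` into a document collection could extend the chain to the original source files.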
Querying knowledge graphs
With the representation established, users can run both structured and unstructured queries through the WhyHow.AI querying system. Structured queries select specific entity types and relationships, while unstructured queries take natural language questions and return related nodes, triples, and linked vector chunks. Rather than relying on traditional Text2Cypher methods, WhyHow.AI’s retrieval engine embeds triples and returns them together with the chunks tied to them, combining the strengths of structured and unstructured retrieval. And, with MongoDB’s built-in vector search, users can store and query vectorized text chunks alongside their graph and application data in a single, unified location.
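As a hedged illustration of retrieving embedded triples together with their chunks, the sketch below builds an aggregation pipeline around MongoDB's `$vectorSearch` stage. The index name `triple_vector_index`, the `embedding` and `graph` fields, and the `chunk` collection are hypothetical, and this is not WhyHow.AI's actual query engine. The `filter` clause assumes that `graph` was declared as a filter field in the vector index definition.

```python
def triple_vector_search_pipeline(query_vector, graph_id, limit=5):
    """Build a pipeline that finds the triples nearest to a query embedding."""
    return [
        {
            "$vectorSearch": {
                "index": "triple_vector_index",   # hypothetical index name
                "path": "embedding",              # field holding triple vectors
                "queryVector": list(query_vector),
                # Consider more candidates than are returned to improve recall.
                "numCandidates": min(10000, limit * 20),
                "limit": limit,
                "filter": {"graph": graph_id},    # restrict search to one graph
            }
        },
        # Pull in the text chunks that support each matching triple.
        {
            "$lookup": {
                "from": "chunk",
                "localField": "chunks",
                "foreignField": "_id",
                "as": "chunk_docs",
            }
        },
        # Keep the fact, its evidence text, and the similarity score.
        {
            "$project": {
                "head_node": 1,
                "type": 1,
                "tail_node": 1,
                "chunk_docs.content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = triple_vector_search_pipeline([0.1, 0.7, 0.4], "graph-1", limit=3)
print(pipeline[0]["$vectorSearch"]["numCandidates"])
```

The query vector would come from embedding the user's natural language question with the same model used to embed the triples; running the pipeline requires an Atlas Vector Search index on the queried collection.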
Enabling scalability, portability, and aggregations
MongoDB’s horizontal scalability allows knowledge graphs to grow alongside expanding datasets. Users can also use WhyHow.AI’s platform to create modular multiagent and multigraph workflows. They can deploy MongoDB Atlas on their preferred cloud provider or maintain control by running it in their own environments, gaining flexibility and reliability. As graph complexity increases, MongoDB’s aggregation framework supports diverse queries that extract insights from multiple datasets.
Providing familiarity and ease of use
MongoDB’s familiarity enables developers to apply their existing expertise without the need to learn new technologies or workflows. With WhyHow.AI and MongoDB, developers can build graphs with JSON data and Python-native APIs, which are perfect for LLM-driven workflows. The same database trusted for years in application development can now manage knowledge graphs, streamlining onboarding and accelerating development timelines.
Taking the next steps
WhyHow.AI’s knowledge graphs overcome the limitations of traditional RAG systems by structuring data into meaningful entities, relationships, and contexts. This enhances retrieval accuracy and decision-making in complex fields. Integrated with MongoDB, these capabilities are amplified through a flexible, scalable foundation featuring modular architecture, vector search, and powerful aggregation. Together, WhyHow.AI and MongoDB help organizations unlock their data’s potential, driving insights and enabling innovative knowledge management solutions.
No matter where you are in your AI journey, MongoDB can help! You can get started with your AI-powered apps by registering for MongoDB Atlas and exploring the tutorials available in our AI Learning Hub. Otherwise, head over to our quick-start guide to get started with MongoDB Atlas Vector Search today.
Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO, Rick Houlihan.
If your company is interested in being featured in a story like this, we’d love to hear from you. Reach out to us at ai_adopters@mongodb.com.
February 13, 2025