Retrieval-Augmented Generation With MongoDB and Spring AI: Bringing AI to Your Java Applications

Tim Kelly • 6 min read • Published Sep 23, 2024 • Updated Sep 23, 2024

AI this, AI that. Well, what can AI actually do for me? In this tutorial, we're going to discuss how we can leverage our own data to get the most out of generative AI.

And that's where retrieval-augmented generation (RAG) comes in. It uses AI where it belongs: retrieving the right information and generating smart, context-aware answers. In this tutorial, we're going to build a RAG app using Spring Boot, MongoDB Atlas, and OpenAI. The full code is available on GitHub.

What is retrieval-augmented generation?

RAG lets you populate your prompt with data that was not available when the AI model was trained, and then use that data to supplement the response of the large language model (LLM).

LLMs are a type of artificial intelligence (AI) that can generate and understand data. They are trained on massive datasets and can be used to answer your questions in an informative way.

While LLMs are very powerful, they have some limitations. One limitation is that they are not always accurate or up to date. This is because LLMs are trained on data that has since become outdated or is incomplete, or because they lack proprietary knowledge about a specific use case or domain.

If you have data that needs to stay internal for data security reasons, or even just questions about more up-to-date data, RAG can help you.

RAG consists of three main components:

Your pre-trained LLM: This is what will generate the response (OpenAI, in our case).
Vector search (semantic search): This is how we retrieve relevant documents from our MongoDB database.
Vector embeddings: A numerical representation of our data that captures its semantic meaning.
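
To make the idea of "meaning as numbers" a little more concrete, here is a minimal sketch of how two embeddings can be compared. It assumes cosine similarity as the metric and uses made-up three-dimensional vectors, whereas real embeddings have hundreds or thousands of dimensions. The class EmbeddingSimilarityDemo and its method are hypothetical and not part of Spring AI or MongoDB.

```java
// Hypothetical demo class: not part of Spring AI or the MongoDB driver.
class EmbeddingSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Values closer to 1 mean
    // the two vectors point in a more similar direction.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors invented for illustration only.
        double[] query = {0.9, 0.1, 0.0};
        double[] timeSeriesArticle = {0.8, 0.2, 0.1};
        double[] unrelatedArticle = {0.0, 0.1, 0.9};

        System.out.printf("time-series article: %.3f%n", cosineSimilarity(query, timeSeriesArticle));
        System.out.printf("unrelated article:   %.3f%n", cosineSimilarity(query, unrelatedArticle));
    }
}
```

A vector search does essentially this comparison at scale: it embeds the question, then returns the stored documents whose vectors score highest against it.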

Prerequisites

Before starting this tutorial, make sure you have the following installed and configured:

Java 21 or higher.
Maven or Gradle (for managing dependencies): We use Maven for this tutorial.
MongoDB Atlas: You'll need a MongoDB Atlas cluster.
- An M10+ cluster is required to use the Spring AI MongoDB vector store, because it creates the search index on our database programmatically.
OpenAI API key: Sign up for OpenAI and obtain an API key.
- Other models are available, but this tutorial uses OpenAI.

Setting up your project

Spring Initializr

To initialize the project:

Go to Spring Initializr.
Configure the project metadata:
- Group: com.mongodb
- Artifact: RagApp
- Dependencies:
  - Spring Web
  - MongoDB Atlas Vector Database
  - OpenAI
Download the project and open it in your preferred IDE.

Configuration

Before we do anything, let's go to our pom.xml file and check the Spring AI version is <spring-ai.version>1.0.0-SNAPSHOT</spring-ai.version>. We may need to change it to this, depending on what version of Spring we are using.

The configuration for this project involves setting up two main components:

The EmbeddingModel, using OpenAI to generate embeddings for documents.
A MongoDBAtlasVectorStore to store and manage document vectors for similarity searches.

We’ll need to configure our project to connect to OpenAI and MongoDB Atlas by adding several properties to the application.properties file, along with the necessary credentials.

spring.application.name=RagApp

spring.ai.openai.api-key=<Your-API-Key>
spring.ai.openai.chat.options.model=gpt-4o

spring.ai.vectorstore.mongodb.initialize-schema=true

spring.data.mongodb.uri=<Your-Connection-URI>
spring.data.mongodb.database=rag

You'll see here we have initialize-schema set to true. This creates the index on our collection automatically, using Spring AI. If you are running a free cluster, this is not available. A workaround is to create the index manually, which you can learn to do in the MongoDB documentation.

Create a config package and add a Config.java to work in. Here’s how the configuration is set up in the Config class:

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.openai.OpenAiEmbeddingModel;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.ai.vectorstore.MongoDBAtlasVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.MongoTemplate;

@Configuration
public class Config {

    // Injected from spring.ai.openai.api-key in application.properties
    @Value("${spring.ai.openai.api-key}")
    private String openAiKey;

    // Embedding model that calls the OpenAI API to turn text into vectors
    @Bean
    public EmbeddingModel embeddingModel() {
        return new OpenAiEmbeddingModel(new OpenAiApi(openAiKey));
    }

    // Vector store backed by MongoDB Atlas, using the default store configuration;
    // the trailing true enables schema initialization
    @Bean
    public VectorStore mongodbVectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel) {
        return new MongoDBAtlasVectorStore(mongoTemplate, embeddingModel,
                MongoDBAtlasVectorStore.MongoDBVectorStoreConfig.builder().build(), true);
    }

}

This class initializes the connection to the OpenAI API and configures the MongoDB-based vector store for storing document embeddings.

Embedding the data

For this tutorial, we are using the MongoDB/devcenter-articles dataset, available on Hugging Face. This dataset consists of articles from the MongoDB Developer Center. In our resources, create a directory called docs and add our file to read in.

To embed and store data in the vector store, we’ll use a service that reads documents from a JSON file, converts them into embeddings, and stores them in the MongoDB Atlas vector store. This is done using the DocsLoaderService.java that we will create in a service package:

package com.mongodb.RagApp.service;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.ClassPathResource;
import org.springframework.stereotype.Service;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

@Service
public class DocsLoaderService {

    private static final int MAX_TOKENS_PER_CHUNK = 2000;
    private final VectorStore vectorStore;
    private final ObjectMapper objectMapper;

    @Autowired
    public DocsLoaderService(VectorStore vectorStore, ObjectMapper objectMapper) {
        this.vectorStore = vectorStore;
        this.objectMapper = objectMapper;
    }

    public String loadDocs() {
        // Read the file as UTF-8; each line is parsed as its own JSON document
        try (InputStream inputStream = new ClassPathResource("docs/devcenter-content-snapshot.2024-05-20.json").getInputStream();
             BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {

            List<Document> documents = new ArrayList<>();
            String line;

            while ((line = reader.readLine()) != null) {
                if (line.isBlank()) {
                    continue; // Skip empty lines rather than failing to parse them
                }
                Map<String, Object> jsonDoc = objectMapper.readValue(line, Map.class);
                String content = (String) jsonDoc.get("body");
                if (content == null || content.isBlank()) {
                    continue; // Nothing to embed for this entry
                }

                // Split the content into smaller chunks if it exceeds the token limit
                List<String> chunks = splitIntoChunks(content, MAX_TOKENS_PER_CHUNK);

                // Create a Document for each chunk and add it to the list
                for (String chunk : chunks) {
                    Document document = createDocument(jsonDoc, chunk);
                    documents.add(document);
                }
                // Add documents in batches to avoid memory overload
                if (documents.size() >= 100) {
                    vectorStore.add(documents);
                    documents.clear();
                }
            }
            // Flush whatever is left over from the last partial batch
            if (!documents.isEmpty()) {
                vectorStore.add(documents);
            }

            return "All documents added successfully!";
        } catch (Exception e) {
            return "An error occurred while adding documents: " + e.getMessage();
        }
    }

    private Document createDocument(Map<String, Object> jsonMap, String content) {
        // Copy the nested metadata, if present, so a missing "metadata" field
        // does not cause a NullPointerException
        Map<String, Object> metadata = new HashMap<>();
        if (jsonMap.get("metadata") instanceof Map<?, ?> nested) {
            nested.forEach((key, value) -> metadata.put(String.valueOf(key), value));
        }

        // Carry selected top-level fields over into the metadata
        metadata.putIfAbsent("sourceName", jsonMap.get("sourceName"));
        metadata.putIfAbsent("url", jsonMap.get("url"));
        metadata.putIfAbsent("action", jsonMap.get("action"));
        metadata.putIfAbsent("format", jsonMap.get("format"));
        metadata.putIfAbsent("updated", jsonMap.get("updated"));

        return new Document(content, metadata);
    }

    private List<String> splitIntoChunks(String content, int maxTokens) {
        List<String> chunks = new ArrayList<>();
        String[] words = content.split("\\s+");
        StringBuilder chunk = new StringBuilder();
        int tokenCount = 0;

        for (String word : words) {
            // Rough estimate: 1 token = ~4 characters, counting at least 1 token
            // per word so that short words are not counted as zero
            int wordTokens = Math.max(1, word.length() / 4);
            if (tokenCount + wordTokens > maxTokens && chunk.length() > 0) {
                chunks.add(chunk.toString());
                chunk.setLength(0); // Clear the buffer
                tokenCount = 0;
            }
            chunk.append(word).append(" ");
            tokenCount += wordTokens;
        }
        if (chunk.length() > 0) {
            chunks.add(chunk.toString());
        }
        return chunks;
    }
}

This service reads a JSON file, processes each document, and stores it in MongoDB along with an embedded vector of our content.

Now, this is a very simplistic approach to chunking (breaking large documents into smaller pieces that stay within the token limit and processing them separately). It is needed because OpenAI's embedding model has a token limit, so some of our documents are too large to embed in one go. This is fine for testing, but if you are moving to production, do your research and decide on the best way to handle these large documents.
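
One common refinement is to let consecutive chunks overlap, so that a sentence cut at a chunk boundary still appears whole in one of them. The following is a minimal sketch of that idea, not the approach used in this project. It counts words rather than tokens for simplicity, and the class OverlappingChunker, its method, and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the tutorial's code base.
class OverlappingChunker {

    // Splits text into chunks of at most maxWords words. Each chunk after the
    // first starts with the last overlapWords words of the previous chunk.
    static List<String> split(String text, int maxWords, int overlapWords) {
        if (maxWords <= 0 || overlapWords < 0 || overlapWords >= maxWords) {
            throw new IllegalArgumentException("Require 0 <= overlapWords < maxWords");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = maxWords - overlapWords; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // this window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "one two three four five six seven eight nine ten";
        // Windows of 4 words, each sharing 1 word with its predecessor
        split(text, 4, 1).forEach(System.out::println);
    }
}
```

The overlap costs some extra storage and embedding calls, so the window and overlap sizes are worth tuning against your own documents.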

Call this method however you wish, but I created a simple DocsLoaderController in my controller package for testing.

import com.mongodb.RagApp.service.DocsLoaderService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/docs")
public class DocsLoaderController {

    private final DocsLoaderService docsLoaderService;

    public DocsLoaderController(DocsLoaderService docsLoaderService) {
        this.docsLoaderService = docsLoaderService;
    }

    // GET /api/docs/load triggers the embedding and storage of the dataset
    @GetMapping("/load")
    public String loadDocuments() {
        return docsLoaderService.loadDocs();
    }

}

Retrieving and augmenting the aforementioned generation

Once the data is embedded and stored, we can retrieve it through an API that uses a vector search to return relevant results. The RagController class is responsible for this:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RagController {

    private final ChatClient chatClient;

    public RagController(ChatClient.Builder builder, VectorStore vectorStore) {
        // The advisor searches the vector store and adds the results to each prompt
        this.chatClient = builder
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()))
                .build();
    }

    @GetMapping("/question")
    public String question(@RequestParam(value = "message", defaultValue = "How to analyze time-series data with Python and MongoDB?") String message) {
        return chatClient.prompt()
                .user(message)   // the user's question
                .call()          // send the prompt to the model
                .content();      // return the response text
    }
}

There's a little bit going on here. Let's look at the ChatClient. It offers an API for communicating with our AI model.

The AI model processes two types of messages:

User messages, which are direct inputs from the user.
System messages, which are generated by the system to guide the conversation.

For the system message, we are using the default from the QuestionAnswerAdvisor:

private static final String DEFAULT_USER_TEXT_ADVISE = """
        Context information is below.
        ---------------------
        {question_answer_context}
        ---------------------
        Given the context and provided history information and not prior knowledge,
        reply to the user comment. If the answer is not in the context, inform
        the user that you can't answer the question.
        """;

But we could edit this message and tailor it to our needs. There are also prompt options that can be specified, such as the temperature setting that controls the randomness or creativity of the generated output. You can find out more from the Spring documentation.

The /question endpoint lets users ask questions. It searches the vector store semantically for embedded documents relevant to the question and sends them to the LLM as context, together with the question.
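
Conceptually, the augmentation step comes down to placing the retrieved text into a prompt template before the model sees the question. Here is a rough, library-free sketch of that idea. It is not how QuestionAnswerAdvisor is implemented internally, and the class PromptAugmentationSketch, its shortened template, and its sample documents are made up for illustration.

```java
import java.util.List;

// Hypothetical illustration of prompt augmentation; not Spring AI code.
class PromptAugmentationSketch {

    // Shortened template loosely modeled on the advisor text shown earlier.
    static final String TEMPLATE = """
            Context information is below.
            ---------------------
            {question_answer_context}
            ---------------------
            Given the context and not prior knowledge, reply to the user comment.
            """;

    // Joins the retrieved documents and substitutes them into the template,
    // then places the result after the user's question.
    static String augment(String userMessage, List<String> retrievedDocs) {
        String context = String.join("\n", retrievedDocs);
        return userMessage + "\n\n" + TEMPLATE.replace("{question_answer_context}", context);
    }

    public static void main(String[] args) {
        // Sample snippets standing in for vector search results
        List<String> docs = List.of(
                "Time series collections store measurements over time.",
                "PyMongo is the Python driver for MongoDB.");
        System.out.println(augment("How do I analyze time-series data?", docs));
    }
}
```

The model then answers from the supplied context rather than from its training data alone, which is what lets it use documents it has never seen.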

Testing the implementation

To test our implementation:

Start the Spring Boot application.
Navigate to http://localhost:8080/api/docs/load to load documents into the vector store.
Use http://localhost:8080/question?message=Your question here to test the question-answer functionality.

For example, try asking:
http://localhost:8080/question?message=How to analyze time-series data with Python and MongoDB?Explain the steps

We should receive a relevant response from the RAG app, formed from the embedded document data and the LLM.
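
Because the question contains spaces and a question mark, it helps to URL-encode it when calling the endpoint from code rather than from a browser. Below is a small sketch using only the JDK. The class QuestionUrlBuilder is hypothetical, and the host and port assume the local Spring Boot setup used in the URLs above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the tutorial's code base.
class QuestionUrlBuilder {

    // Encodes the message as a query parameter value and appends it to /question.
    static URI buildQuestionUri(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/question?message=" + encoded);
    }

    public static void main(String[] args) {
        URI uri = buildQuestionUri("http://localhost:8080",
                "How to analyze time-series data with Python and MongoDB? Explain the steps");
        System.out.println(uri);
    }
}
```

The resulting URI can be handed to any HTTP client, such as java.net.http.HttpClient, to send the GET request.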

Conclusion

In this project, we integrated a retrieval-augmented generation (RAG) system using MongoDB, OpenAI embeddings, and Spring Boot. The system can embed large amounts of document data and answer questions by leveraging vector similarity searches against a MongoDB Atlas vector store.

Next, learn more about what you can do with Java and MongoDB. You might enjoy Seamless Media Storage: Integrating Azure Blob Storage and MongoDB With Spring Boot. Or head over to the community forums and see what other people are doing with MongoDB.
