Integrate Atlas Vector Search with LangChain

You can integrate Atlas Vector Search with LangChain to build generative AI and RAG applications. This page provides an overview of the MongoDB LangChain Python integration and the components that you can use in your applications.

Note

For a full list of components and methods, see API reference.

For the JavaScript integration, see Get Started with the LangChain JS/TS Integration.

Installation and Setup

To use Atlas Vector Search with LangChain, you must first install the langchain-mongodb package:

pip install langchain-mongodb

Some components also require the following LangChain base packages:

pip install langchain langchain_community

Vector Store

MongoDBAtlasVectorSearch is a vector store that lets you store vector embeddings in a collection in Atlas and retrieve them by using Atlas Vector Search.

This component requires an Atlas Vector Search index.
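As a rough illustration of what such an index can look like, the sketch below builds an Atlas Vector Search index definition as a plain dictionary. The helper name `vector_index_definition`, the field path `embedding`, and the dimension count 1536 are assumptions for illustration; match them to your own collection and embedding model.

```python
import json


def vector_index_definition(path: str, num_dimensions: int, similarity: str) -> dict:
    """Build an Atlas Vector Search index definition (hypothetical helper)."""
    allowed = {"cosine", "euclidean", "dotProduct"}
    if similarity not in allowed:
        raise ValueError(f"similarity must be one of {sorted(allowed)}")
    return {
        "fields": [
            {
                "type": "vector",                 # Marks the field as a vector field
                "path": path,                     # Document field that holds the embedding
                "numDimensions": num_dimensions,  # Must match the embedding model's output size
                "similarity": similarity,         # Similarity function used for the search
            }
        ]
    }


# Example values are assumptions, not requirements of the integration.
definition = vector_index_definition("embedding", 1536, "cosine")
print(json.dumps(definition, indent=2))
```

The printed JSON is only the index definition. You would still need to create the index on your collection under the name that you later pass as `index_name`.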

Usage

from langchain_mongodb.vectorstores import MongoDBAtlasVectorSearch
from pymongo import MongoClient
# Use an embedding model to generate embeddings. FakeEmbeddings is a
# placeholder for testing; replace it with a real embedding model.
from langchain_core.embeddings import FakeEmbeddings
# Connect to your Atlas cluster
client = MongoClient("<connection-string>")
collection = client["<database-name>"]["<collection-name>"]
# Instantiate the vector store
vector_store = MongoDBAtlasVectorSearch(
   collection = collection,                 # Collection to store embeddings
   embedding = FakeEmbeddings(size = 1536), # Embedding model to use (1536 is an example dimension)
   index_name = "vector_index",    # Name of the vector search index
   relevance_score_fn = "cosine"   # Similarity score function, can also be "euclidean" or "dotProduct"
)
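To make the three `relevance_score_fn` options more concrete, here is a small pure-Python sketch of the underlying measures. It shows only the raw math; the function names are illustrative, and the scores that Atlas returns may be normalized versions of these raw values.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two orthogonal unit vectors as a toy example.
query = [1.0, 0.0]
doc = [0.0, 1.0]
print(dot_product(query, doc))         # 0.0
print(cosine_similarity(query, doc))   # 0.0
print(euclidean_distance(query, doc))  # 1.4142135623730951
```

Cosine ignores vector length, dot product rewards both alignment and magnitude, and Euclidean distance measures how far apart the vectors are, so the right choice depends on how your embedding model was trained.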

Retrievers

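LangChain retrievers are components that you use to get relevant documents from your vector stores. You can use LangChain's built-in retrievers or the following MongoDB retrievers to query and retrieve data from Atlas.

Full-Text Retriever

MongoDBAtlasFullTextSearchRetriever is a retriever that performs full-text search by using Atlas Search. Specifically, it uses Lucene's standard BM25 algorithm.

This retriever requires an Atlas Search index.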
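For intuition about how BM25 ranks documents, the following is a minimal sketch of the textbook BM25 formula over pre-tokenized text. It is not Lucene's exact implementation; the function name `bm25_scores`, the sample documents, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with textbook BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query:
            f = term_freq[term]
            if f == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = f + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * f * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "atlas vector search stores embeddings".split(),
    "full text search uses bm25 ranking".split(),
    "chat history is stored per session".split(),
]
scores = bm25_scores("bm25 search".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best)  # 1: the only document that contains both query terms
```

The second document wins because it matches both query terms, and the rarer term ("bm25") contributes more than the common one ("search").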

Usage

from langchain_mongodb.retrievers.full_text_search import MongoDBAtlasFullTextSearchRetriever
from pymongo import MongoClient
# Connect to your Atlas cluster
client = MongoClient("<connection-string>")
collection = client["<database-name>"]["<collection-name>"]
# Initialize the retriever
retriever = MongoDBAtlasFullTextSearchRetriever(
   collection = collection,           # MongoDB Collection in Atlas
   search_field = "<field-name>",     # Name of the field to search
   search_index_name = "<index-name>" # Name of the search index
)
# Define your query
query = "some search query"
# Print results
documents = retriever.invoke(query)
for doc in documents:
   print(doc)

See also

API reference

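Hybrid Search Retriever

MongoDBAtlasHybridSearchRetriever is a retriever that combines vector search and full-text search results by using the Reciprocal Rank Fusion (RRF) algorithm. To learn more, see How to Perform Hybrid Search.

This retriever requires an existing vector store, an Atlas Vector Search index, and an Atlas Search index.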
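The idea behind Reciprocal Rank Fusion can be sketched in a few lines. This is the generic form of the algorithm, not necessarily the retriever's exact scoring; the function name `reciprocal_rank_fusion` and the document IDs are hypothetical, and the penalty of 60.0 is just an example value.

```python
def reciprocal_rank_fusion(rankings, penalties):
    """Fuse ranked lists of document IDs; rank 1 is the best match."""
    fused = {}
    for ranking, penalty in zip(rankings, penalties):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (penalty + rank) to a document's score.
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (penalty + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


vector_hits = ["doc_a", "doc_b", "doc_c"]    # Ranked results from vector search
fulltext_hits = ["doc_b", "doc_d", "doc_a"]  # Ranked results from full-text search
fused_order = reciprocal_rank_fusion([vector_hits, fulltext_hits], [60.0, 60.0])
print(fused_order)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that appear in both lists rise to the top. In this sketch, a larger penalty for one list shrinks that list's contribution, which is one way to weight vector and full-text results differently.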

Usage

from langchain_mongodb.retrievers.hybrid_search import MongoDBAtlasHybridSearchRetriever
# Initialize the retriever
retriever = MongoDBAtlasHybridSearchRetriever(
   vectorstore = <vector-store>,        # Vector store instance
   search_index_name = "<index-name>",  # Name of the Atlas Search index
   top_k = 5,                           # Number of documents to return
   fulltext_penalty = 60.0,             # Penalty for full-text search
   vector_penalty = 60.0                # Penalty for vector search
)
# Define your query
query = "some search query"
# Print results
documents = retriever.invoke(query)
for doc in documents:
   print(doc)

LLM Caches

Caches optimize LLM performance by storing responses to similar or repeated queries so that they do not have to be recomputed. MongoDB provides the following caches for LangChain applications.

MongoDB Cache

MongoDBCache lets you store a basic cache in Atlas.
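Conceptually, a basic cache returns a stored response when the exact same prompt is sent to the same model again. The sketch below uses an in-memory dictionary in place of an Atlas collection; `DictLLMCache`, `fake_llm`, and `cached_generate` are hypothetical names, not part of the integration.

```python
class DictLLMCache:
    """Toy exact-match cache keyed on the prompt and a model identifier."""

    def __init__(self):
        self._store = {}

    def lookup(self, prompt, llm_string):
        # Returns None on a cache miss.
        return self._store.get((prompt, llm_string))

    def update(self, prompt, llm_string, response):
        self._store[(prompt, llm_string)] = response


calls = []  # Records every "real" model call so cache hits are visible


def fake_llm(prompt):
    """Stand-in for an expensive model call (hypothetical)."""
    calls.append(prompt)
    return prompt.upper()


def cached_generate(cache, prompt, llm_string="fake-model"):
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    response = fake_llm(prompt)
    cache.update(prompt, llm_string, response)
    return response


basic_cache = DictLLMCache()
first = cached_generate(basic_cache, "hello")
second = cached_generate(basic_cache, "hello")
print(first, second, len(calls))  # HELLO HELLO 1
```

The second call is served from the cache, so the model stand-in runs only once.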

Usage

from langchain_mongodb import MongoDBCache
from langchain_core.globals import set_llm_cache
set_llm_cache(MongoDBCache(
   connection_string = "<connection-string>", # Atlas connection string
   database_name = "<database-name>",         # Database to store the cache
   collection_name = "<collection-name>"      # Collection to store the cache
))

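Semantic Cache

Semantic caching is a more advanced form of caching that retrieves cached prompts based on the semantic similarity between the user input and the cached results.

MongoDBAtlasSemanticCache is a semantic cache that uses Atlas Vector Search to retrieve the cached prompts. This component requires an Atlas Vector Search index.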
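The difference from an exact-match cache is that a lookup succeeds when the new prompt is close enough to a cached one in embedding space. The following sketch illustrates that idea with a deliberately crude keyword-count "embedding"; `toy_embed`, `ToySemanticCache`, and the 0.95 threshold are all assumptions for illustration, not the component's actual behavior.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def toy_embed(text):
    """Hypothetical embedding: smoothed counts of a few keywords. Not a real model."""
    words = text.lower().split()
    return [words.count(w) + 0.01 for w in ("mongodb", "vector", "weather")]


class ToySemanticCache:
    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # List of (embedding, response) pairs

    def update(self, prompt, response):
        self.entries.append((toy_embed(prompt), response))

    def lookup(self, prompt):
        query = toy_embed(prompt)
        # Find the cached entry most similar to the new prompt.
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None  # Nothing similar enough: treat as a cache miss


semantic_cache = ToySemanticCache()
semantic_cache.update("what is mongodb vector search", "cached answer")
similar = semantic_cache.lookup("explain mongodb vector search")
unrelated = semantic_cache.lookup("what is the weather today")
print(similar, unrelated)  # cached answer None
```

A differently worded prompt about the same topic hits the cache, while an unrelated prompt falls below the threshold and misses.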

Usage

from langchain_mongodb import MongoDBAtlasSemanticCache
from langchain_core.globals import set_llm_cache
# Use an embedding model to generate embeddings. FakeEmbeddings is a
# placeholder for testing; replace it with a real embedding model.
from langchain_core.embeddings import FakeEmbeddings
set_llm_cache(MongoDBAtlasSemanticCache(
   embedding = FakeEmbeddings(size = 1536),   # Embedding model to use (1536 is an example dimension)
   connection_string = "<connection-string>", # Atlas connection string
   database_name = "<database-name>",         # Database to store the cache
   collection_name = "<collection-name>"      # Collection to store the cache
))

Document Loaders

Document loaders are tools that help you load data into your LangChain applications.

MongodbLoader is a document loader that returns a list of documents from a MongoDB database.

Usage

from langchain_community.document_loaders.mongodb import MongodbLoader
loader = MongodbLoader(
   connection_string = "<connection-string>",  # Atlas cluster or local MongoDB instance URI
   db_name = "<database-name>",                # Database that contains the collection
   collection_name = "<collection-name>",      # Collection to load documents from
   filter_criteria = { <filter-document> },    # Optional document to specify a filter
   field_names = ["<field-name>", ... ]        # List of fields to return
)
docs = loader.load()

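Chat History

MongoDBChatMessageHistory is a component that lets you store and manage chat message histories in a MongoDB database. It can save both user and AI-generated messages associated with a unique session identifier. This is useful for applications that need to track interactions over time, such as chatbots.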
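The role of the session identifier can be shown with a small in-memory stand-in. This is only a sketch of the concept: `InMemoryChatHistory` is a hypothetical class, and a shared dictionary plays the part that a MongoDB collection would play in a real deployment.

```python
from collections import defaultdict


class InMemoryChatHistory:
    """Toy stand-in: one ordered message list per session ID."""

    # Shared store for all instances, playing the role of a collection.
    _sessions = defaultdict(list)

    def __init__(self, session_id):
        self.session_id = session_id

    def add_user_message(self, text):
        self._sessions[self.session_id].append(("human", text))

    def add_ai_message(self, text):
        self._sessions[self.session_id].append(("ai", text))

    @property
    def messages(self):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._sessions[self.session_id])


alice = InMemoryChatHistory("session-alice")
bob = InMemoryChatHistory("session-bob")
alice.add_user_message("Hello")
alice.add_ai_message("Hi")
bob.add_user_message("Good morning")

# A new object with the same session ID sees the same history.
print(InMemoryChatHistory("session-alice").messages)  # [('human', 'Hello'), ('ai', 'Hi')]
print(bob.messages)                                   # [('human', 'Good morning')]
```

Because messages are keyed by session, separate conversations stay isolated while any process that knows the session ID can resume the same conversation.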

Usage

from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory
chat_message_history = MongoDBChatMessageHistory(
   session_id = "<session-id>",               # Unique session identifier
   connection_string = "<connection-string>", # Atlas cluster or local MongoDB instance URI
   database_name = "<database-name>",         # Database to store the chat history
   collection_name = "<collection-name>"      # Collection to store the chat history
)
chat_message_history.add_user_message("Hello")
chat_message_history.add_ai_message("Hi")

chat_message_history.messages

[HumanMessage(content='Hello'), AIMessage(content='Hi')]

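Binary Storage

MongoDBByteStore is a custom datastore that uses MongoDB to store and manage binary data, specifically data represented as bytes. You can perform CRUD operations with key-value pairs where the keys are strings and the values are byte sequences.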
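The key-value contract described above can be sketched with an in-memory dictionary. `InMemoryByteStore` is a hypothetical class for illustration; its method names are chosen to mirror the batch-style operations on string keys and byte values, and returning `None` for a missing key is an assumption of this sketch.

```python
class InMemoryByteStore:
    """Toy key-value store: string keys, bytes values, batch operations."""

    def __init__(self):
        self._data = {}

    def mset(self, pairs):
        """Create or update several keys at once."""
        for key, value in pairs:
            if not isinstance(value, bytes):
                raise TypeError("values must be bytes")
            self._data[key] = value

    def mget(self, keys):
        """Read several keys at once; missing keys yield None."""
        return [self._data.get(key) for key in keys]

    def mdelete(self, keys):
        """Delete several keys at once; unknown keys are ignored."""
        for key in keys:
            self._data.pop(key, None)

    def yield_keys(self, prefix=None):
        """Iterate over stored keys, optionally filtered by prefix."""
        for key in self._data:
            if prefix is None or key.startswith(prefix):
                yield key


store = InMemoryByteStore()
store.mset([("key1", b"hello"), ("key2", b"world")])
print(store.mget(["key1", "missing"]))  # [b'hello', None]
store.mdelete(["key1"])
print(list(store.yield_keys()))         # ['key2']
```

Restricting values to bytes keeps the store agnostic about what it holds, so callers decide how to serialize documents, embeddings, or other objects.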

Usage

from langchain_community.storage import MongoDBByteStore
# Instantiate the MongoDBByteStore
mongodb_store = MongoDBByteStore(
   connection_string = "<connection-string>",  # Atlas cluster or local MongoDB instance URI
   db_name = "<database-name>",                # Name of the database
   collection_name = "<collection-name>"       # Name of the collection
)
# Set values for keys
mongodb_store.mset([("key1", b"hello"), ("key2", b"world")])
# Get values for keys
values = mongodb_store.mget(["key1", "key2"])
print(values)  # Output: [b'hello', b'world']
# Iterate over keys
for key in mongodb_store.yield_keys():
   print(key)  # Output: key1, key2
# Delete keys
mongodb_store.mdelete(["key1", "key2"])

See also

API reference

Additional Resources

MongoDB also provides the following developer resources.
