This notebook covers how to use MongoDB Vector Search with LangChain. It also showcases the MongoDB Atlas Embedding and Reranking API for accessing Voyage AI’s state-of-the-art embedding models and rerankers.
MongoDB Atlas is a fully managed cloud database available on AWS, Azure, and GCP. It supports native Vector Search, full-text search (BM25), and hybrid search on your MongoDB document data.
MongoDB Vector Search lets you store your embeddings in MongoDB documents, create a vector search index, and perform k-nearest-neighbor (KNN) search using an approximate nearest neighbor algorithm (Hierarchical Navigable Small Worlds). It uses the $vectorSearch MQL stage.
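As a rough illustration of what a $vectorSearch stage looks like, the sketch below builds an aggregation pipeline as plain Python dictionaries. The helper name build_vector_search_pipeline, the index name "vector_index", the field name "embedding", and the 10x candidate multiplier are assumptions made for this example and are not taken from this notebook.

```python
from typing import Any, Optional


def build_vector_search_pipeline(
    query_vector: list[float],
    index_name: str = "vector_index",  # assumed index name
    path: str = "embedding",           # assumed field holding the embeddings
    k: int = 3,
    num_candidates: Optional[int] = None,
    pre_filter: Optional[dict[str, Any]] = None,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline whose first stage is $vectorSearch."""
    stage: dict[str, Any] = {
        "index": index_name,
        "path": path,
        "queryVector": query_vector,
        # The ANN search considers more candidates than it returns;
        # 10x the limit is an illustrative choice, not a requirement.
        "numCandidates": num_candidates or k * 10,
        "limit": k,
    }
    if pre_filter:
        stage["filter"] = pre_filter
    return [
        {"$vectorSearch": stage},
        # Expose the similarity score on each returned document.
        {"$set": {"score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], k=3)
```

With a live cluster, a pipeline like this could be passed to collection.aggregate(pipeline); the LangChain vector store builds an equivalent stage for you.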
To use MongoDB Atlas, you must first deploy a cluster. To get started, sign up for free to Atlas.
To use Voyage AI embedding and reranking models, you will need to create a model API key. Generate your API key to get access to 200 million free tokens on the latest models.
First, install the following libraries to follow this notebook.
# Create the vector search index
vector_store.create_vector_search_index(
    dimensions=1024,
    wait_until_complete=45,  # wait up to 45 seconds for the index to be ready
)
[OPTIONAL] As an alternative to the vector_store.create_vector_search_index call above, you can create the vector search index in the Atlas UI with the following index definition:
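The index definition itself is not reproduced here. As a hedged sketch, the snippet below prints a definition in the general shape Atlas expects for a vector search index. The field path "embedding" and the "cosine" similarity are assumptions for this example; adjust them to match your collection and embedding model. The helper vector_index_definition is hypothetical.

```python
import json


def vector_index_definition(dimensions, path="embedding", similarity="cosine", filters=None):
    """Return an Atlas vector search index definition as a dict.

    `path` and `similarity` are assumed defaults for illustration only.
    """
    fields = [
        {
            "type": "vector",
            "path": path,
            "numDimensions": dimensions,
            "similarity": similarity,
        }
    ]
    # Each metadata field used for pre-filtering needs its own "filter" entry.
    for filter_path in filters or []:
        fields.append({"type": "filter", "path": filter_path})
    return {"fields": fields}


definition = vector_index_definition(1024)
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the JSON editor when creating a Vector Search index in the Atlas UI.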
First, let’s update the vector search index by providing the field to filter on.
# Update the vector search index so that "source" can be used as a filter field
vector_store.create_vector_search_index(
    dimensions=1024,
    wait_until_complete=60,
    update=True,
    filters=["source"],
)
Narrow down results using metadata filters. Note that Atlas Vector Search requires explicit operators like $eq.
# Use a source known to exist in our loaded docs
sample_source = "https://en.wikipedia.org/wiki/MongoDB"
filtered_results = vector_store.similarity_search(
    query,
    k=3,
    pre_filter={"source": {"$eq": sample_source}},
)
print(f"Retrieved {len(filtered_results)} documents matching source: {sample_source}")
Hybrid search combines Vector Search with full-text (keyword) search and merges the two result lists using Reciprocal Rank Fusion (RRF).
from langchain_mongodb.index import create_fulltext_search_index

# Use the helper function to create the full-text search index
create_fulltext_search_index(
    collection=atlas_collection,
    wait_until_complete=60,
    field="text",
    index_name="search_index",
)
from langchain_mongodb.retrievers import MongoDBAtlasHybridSearchRetriever

# Requires the full-text search index named "search_index" created above
retriever = MongoDBAtlasHybridSearchRetriever(
    vectorstore=vector_store,
    search_index_name="search_index",
    fulltext_penalty=50,
    vector_penalty=50,
    top_k=5,
)
hybrid_docs = retriever.invoke("database-as-a-service")
for doc in hybrid_docs:
    print("Title: " + doc.metadata["title"])
    print("Plot: " + doc.page_content)
    print("Search score: {}".format(doc.metadata["fulltext_score"]))
    print("Vector Search score: {}".format(doc.metadata["vector_score"]))
    print("Total score: {}\n".format(doc.metadata["fulltext_score"] + doc.metadata["vector_score"]))
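To make the role of the two penalty values more concrete, here is a minimal, self-contained sketch of Reciprocal Rank Fusion in plain Python. It uses the common formulation score = 1 / (rank + penalty) with 1-based ranks; the retriever's internal pipeline may differ in details such as rank offsets, so treat this as an illustration of the idea and not as the library's implementation. The function name and sample document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, penalties, top_k=5):
    """Fuse several ranked lists of document IDs into one scored list.

    rankings:  mapping of source name -> list of doc IDs, best first
    penalties: mapping of source name -> penalty constant for that source
    """
    scores = defaultdict(float)
    for source, ranked_ids in rankings.items():
        penalty = penalties[source]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # A larger penalty flattens the gap between adjacent ranks.
            scores[doc_id] += 1.0 / (rank + penalty)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top_k]


fused = reciprocal_rank_fusion(
    rankings={"vector": ["a", "b", "c"], "fulltext": ["b", "d", "a"]},
    penalties={"vector": 50, "fulltext": 50},
)
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Documents that appear in both lists ("a" and "b" here) accumulate score from each source, which is why they rise above documents found by only one search.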