Facebook AI Similarity Search (FAISS) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also includes supporting code for evaluation and parameter tuning. See The FAISS Library paper.
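To make "similarity search over dense vectors" concrete, here is a minimal pure-Python sketch of exact nearest-neighbor search by Euclidean (L2) distance. It is not how FAISS is implemented; it only shows the brute-force baseline that such a library is designed to speed up. The helper names `l2_distance` and `brute_force_search` and the toy vectors are assumptions made for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(query, vectors, k):
    """Return the k (index, distance) pairs closest to the query, nearest first."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


# Hypothetical 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
hits = brute_force_search([0.9, 0.1], vectors, k=2)
for index, distance in hits:
    print(f"vector {index} at distance {distance:.4f}")
```

This scan compares the query against every stored vector, so its cost grows linearly with the collection size, which is the cost that an indexed search aims to reduce.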
from uuid import uuid4

from langchain_core.documents import Document

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    metadata={"source": "news"},
)

document_3 = Document(
    page_content="Building an exciting new project with LangChain - come check it out!",
    metadata={"source": "tweet"},
)

document_4 = Document(
    page_content="Robbers broke into the city bank and stole $1 million in cash.",
    metadata={"source": "news"},
)

document_5 = Document(
    page_content="Wow! That was an amazing movie. I can't wait to see it again.",
    metadata={"source": "tweet"},
)

document_6 = Document(
    page_content="Is the new iPhone worth the price? Read this review to find out.",
    metadata={"source": "website"},
)

document_7 = Document(
    page_content="The top 10 soccer players in the world right now.",
    metadata={"source": "website"},
)

document_8 = Document(
    page_content="LangGraph is the best framework for building stateful, agentic applications!",
    metadata={"source": "tweet"},
)

document_9 = Document(
    page_content="The stock market is down 500 points today due to fears of a recession.",
    metadata={"source": "news"},
)

document_10 = Document(
    page_content="I have a bad feeling I am going to get deleted :(",
    metadata={"source": "tweet"},
)

documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]
uuids = [str(uuid4()) for _ in range(len(documents))]

vector_store.add_documents(documents=documents, ids=uuids)
results = vector_store.similarity_search(
    "LangChain provides abstractions to make working with LLMs easy",
    k=2,
    filter={"source": "tweet"},
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")

* Building an exciting new project with LangChain - come check it out! [{'source': 'tweet'}]
* LangGraph is the best framework for building stateful, agentic applications! [{'source': 'tweet'}]
The same similarity search can also be written with advanced metadata filtering, using an operator such as $eq:
results = vector_store.similarity_search(
    "LangChain provides abstractions to make working with LLMs easy",
    k=2,
    filter={"source": {"$eq": "tweet"}},
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")

* Building an exciting new project with LangChain - come check it out! [{'source': 'tweet'}]
* LangGraph is the best framework for building stateful, agentic applications! [{'source': 'tweet'}]
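The two filter forms above return the same documents because a plain value and an explicit $eq condition express the same equality test. The sketch below illustrates that equivalence in plain Python. It is not the library's filter implementation: the `matches` helper is hypothetical, and it deliberately handles only the $eq operator shown above.

```python
def matches(metadata, flt):
    """Return True when every condition in flt holds for the metadata dict.

    Hypothetical helper for illustration; supports plain values and "$eq" only.
    """
    for key, condition in flt.items():
        value = metadata.get(key)
        if isinstance(condition, dict):
            # Operator form, e.g. {"source": {"$eq": "tweet"}}
            for op, operand in condition.items():
                if op != "$eq":
                    raise ValueError(f"unsupported operator in this sketch: {op}")
                if value != operand:
                    return False
        elif value != condition:
            # Shorthand form, e.g. {"source": "tweet"}
            return False
    return True


records = [{"source": "tweet"}, {"source": "news"}, {"source": "website"}]

simple = [m for m in records if matches(m, {"source": "tweet"})]
explicit = [m for m in records if matches(m, {"source": {"$eq": "tweet"}})]
print(simple == explicit)
```

The operator form becomes useful once a condition is something other than equality; which other operators are available is not covered by this sketch.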
results = vector_store.similarity_search_with_score(
    "Will it be hot tomorrow?", k=1, filter={"source": "news"}
)
for res, score in results:
    print(f"* [SIM={score:3f}] {res.page_content} [{res.metadata}]")
* [SIM=0.893688] The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]
retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 1})
retriever.invoke("Stealing from the bank is a crime", filter={"source": "news"})
[Document(metadata={'source': 'news'}, page_content='Robbers broke into the city bank and stole $1 million in cash.')]
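The retriever above uses search_type="mmr", short for maximal marginal relevance: results are picked one at a time, trading relevance to the query against similarity to results already chosen. The following is a minimal pure-Python sketch of that selection rule, not the library's implementation. The function names, the toy vectors, and the `lambda_mult` weighting parameter are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mmr_select(query, candidates, k, lambda_mult=0.5):
    """Greedily pick k candidate indices, balancing relevance and diversity.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            relevance = cosine(query, candidates[i])
            # Highest similarity to anything already selected (0.0 if none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Candidates 0 and 1 are near-duplicates; candidate 2 points elsewhere.
candidates = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
query = [1.0, 0.2]

diverse = mmr_select(query, candidates, k=2, lambda_mult=0.3)
relevant_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
print(diverse, relevant_only)
```

With the diversity-leaning weight, the second pick skips the near-duplicate in favor of the dissimilar vector; with lambda_mult=1.0 the selection reduces to a plain relevance ranking.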