LangChain supports three ways to use Hugging Face embedding models:
- Local inference via HuggingFaceEmbeddings: downloads the model and runs it in-process with Sentence Transformers.
- Inference Providers and dedicated Inference Endpoints via HuggingFaceEndpointEmbeddings: serverless or dedicated hosted inference through Hugging Face.
- Self-hosted at scale via Text Embeddings Inference (TEI): Hugging Face's production inference server, pointed at by HuggingFaceEndpointEmbeddings.
All three options implement the same Embeddings interface, so you can start local and graduate to a hosted or self-hosted deployment without changing the rest of your application.
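Because each option exposes the same methods, application code can depend on the interface rather than on a concrete class. The sketch below illustrates that idea under stated assumptions: it relies only on the embed_documents and embed_query methods of the Embeddings interface, and the names EmbeddingsLike, cosine, most_similar, and FakeEmbeddings are hypothetical helpers introduced here, not part of LangChain.

```python
import math
from typing import List, Protocol, Sequence


class EmbeddingsLike(Protocol):
    """Hypothetical stand-in for the two methods of the Embeddings interface."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...

    def embed_query(self, text: str) -> List[float]: ...


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(embedder: EmbeddingsLike, query: str, docs: List[str]) -> str:
    """Return the document whose embedding is closest to the query embedding."""
    query_vec = embedder.embed_query(query)
    doc_vecs = embedder.embed_documents(docs)
    best_index = max(range(len(docs)), key=lambda i: cosine(query_vec, doc_vecs[i]))
    return docs[best_index]


class FakeEmbeddings:
    """Toy embedder for illustration only: counts vowels and consonants."""

    def embed_query(self, text: str) -> List[float]:
        vowels = sum(ch in "aeiou" for ch in text.lower())
        consonants = sum(ch.isalpha() for ch in text) - vowels
        return [float(vowels), float(consonants)]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]
```

Since most_similar only calls the two interface methods, a HuggingFaceEmbeddings or HuggingFaceEndpointEmbeddings instance could be passed in place of FakeEmbeddings without touching the function.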
Setup
Local embeddings
Generate embeddings locally via sentence-transformers. This downloads the model weights the first time you run it.
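A minimal sketch of local inference, assuming the langchain-huggingface and sentence-transformers packages are installed. The model name sentence-transformers/all-MiniLM-L6-v2 is only an illustrative choice, and embed_locally is a hypothetical helper name.

```python
from typing import List


def embed_locally(
    texts: List[str],
    model_name: str = "sentence-transformers/all-MiniLM-L6-v2",  # illustrative choice
) -> List[List[float]]:
    """Embed texts in-process; the weights are downloaded on first use."""
    # Imported inside the function so this module loads without the package installed.
    from langchain_huggingface import HuggingFaceEmbeddings

    embeddings = HuggingFaceEmbeddings(model_name=model_name)
    return embeddings.embed_documents(texts)


# Example call (triggers the model download on the first run):
# vectors = embed_locally(["LangChain works with Hugging Face models."])
# print(len(vectors), len(vectors[0]))
```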
Hugging Face Inference Endpoints and Providers
If you prefer not to download models locally, you can access embedding models through Hugging Face Inference Providers or a dedicated Inference Endpoint. Both expose open-source embedding models on Hugging Face's scalable serverless infrastructure. First, get a token from your Hugging Face settings. Then use HuggingFaceEndpointEmbeddings.
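A hedged sketch of hosted inference, assuming the langchain-huggingface package is installed. Reading the token from a HUGGINGFACEHUB_API_TOKEN environment variable is an assumption about how the token is stored, the model name is illustrative, and embed_via_endpoint is a hypothetical helper name.

```python
import os
from typing import List


def embed_via_endpoint(
    texts: List[str],
    model: str = "sentence-transformers/all-MiniLM-L6-v2",  # illustrative choice
) -> List[List[float]]:
    """Embed texts through Hugging Face hosted inference instead of a local model."""
    # Assumption: the token is stored in the HUGGINGFACEHUB_API_TOKEN variable.
    token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise RuntimeError("Set HUGGINGFACEHUB_API_TOKEN to your Hugging Face token.")

    # Imported inside the function so this module loads without the package installed.
    from langchain_huggingface import HuggingFaceEndpointEmbeddings

    embeddings = HuggingFaceEndpointEmbeddings(
        model=model,
        huggingfacehub_api_token=token,
    )
    return embeddings.embed_documents(texts)
```

Failing early when the token is missing keeps the error message close to its cause, rather than surfacing later as an authentication failure from the remote service.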
To use a specific provider (for example hf-inference, sambanova, or together), pass the provider= argument.
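The following sketch shows one way the provider argument might be supplied, assuming an installed langchain-huggingface version that accepts provider on HuggingFaceEndpointEmbeddings. The helper name make_provider_embeddings, the hf-inference default, and the HUGGINGFACEHUB_API_TOKEN variable name are assumptions made for illustration.

```python
import os
from typing import Optional


def make_provider_embeddings(
    model: str,
    provider: str = "hf-inference",  # illustrative default; any supported provider name
    token: Optional[str] = None,
):
    """Create an endpoint embedder that sends requests through the chosen provider."""
    # Imported inside the function so this module loads without the package installed.
    from langchain_huggingface import HuggingFaceEndpointEmbeddings

    return HuggingFaceEndpointEmbeddings(
        model=model,
        provider=provider,
        # Assumption: fall back to a token stored in the environment.
        huggingfacehub_api_token=token or os.environ.get("HUGGINGFACEHUB_API_TOKEN"),
    )


# Example call (requires network access and a valid token):
# embeddings = make_provider_embeddings("sentence-transformers/all-MiniLM-L6-v2")
# vector = embeddings.embed_query("What is an embedding?")
```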
Self-hosted with Text Embeddings Inference
For production-scale serving of Sentence Transformers models on your own infrastructure, use Text Embeddings Inference (TEI). TEI handles batching and GPU acceleration, and exposes an OpenAI-compatible API. See the TEI integration guide for a walkthrough.
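As a rough sketch of pointing HuggingFaceEndpointEmbeddings at a TEI server: the example assumes a TEI instance is already running and reachable, and the address http://localhost:8080 and the helper name embed_with_tei are placeholders chosen for illustration.

```python
from typing import List


def embed_with_tei(
    texts: List[str],
    base_url: str = "http://localhost:8080",  # assumed address of a running TEI server
) -> List[List[float]]:
    """Embed texts with a self-hosted TEI server reachable at base_url."""
    # Imported inside the function so this module loads without the package installed.
    from langchain_huggingface import HuggingFaceEndpointEmbeddings

    # Passing the server URL as `model` points the client at TEI instead of the Hub.
    embeddings = HuggingFaceEndpointEmbeddings(model=base_url)
    return embeddings.embed_documents(texts)


# Example call (requires the TEI server to be up):
# vectors = embed_with_tei(["Self-hosted embeddings with TEI."])
```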

