This guide will help you get started with Google Generative AI embedding models using LangChain. For detailed documentation on GoogleGenerativeAIEmbeddings features and configuration options, see the API reference.
Overview
gemini-embedding-2-preview natively supports text, image, video, audio, and PDF inputs through the Google GenAI SDK’s embed_content() API. However, the LangChain Embeddings interface (embed_query / embed_documents) currently accepts only text inputs. Multimodal embedding support in LangChain is planned for a future release. For multimodal use cases today, use the Google GenAI SDK directly.
Integration details
Setup
To access Google Gemini embedding models, you need to create a Google Cloud project, enable the Generative Language API, get an API key, and install the langchain-google-genai integration package.
Credentials
Head to Google AI Studio to sign up and generate an API key. See the Gemini API keys documentation for more details. Once you have a key, set the GOOGLE_API_KEY environment variable:
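A minimal sketch of setting the key from Python, assuming you would rather be prompted interactively than export the variable in your shell. The helper name `ensure_google_api_key` and its injectable `env` and `prompt` parameters are illustrative and not part of LangChain.

```python
import getpass
import os


def ensure_google_api_key(env=None, prompt=getpass.getpass):
    """Return GOOGLE_API_KEY, asking for it once if it is not set yet.

    `env` and `prompt` are injectable (hypothetical parameters) so the
    helper can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    if not env.get("GOOGLE_API_KEY"):
        # getpass hides the typed key so it does not end up in terminal history.
        env["GOOGLE_API_KEY"] = prompt("Enter your Google API key: ")
    return env["GOOGLE_API_KEY"]
```

Calling `ensure_google_api_key()` once, before creating the embeddings object, leaves the key in `os.environ` for the rest of the process.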
Installation
The LangChain Google Generative AI integration lives in the langchain-google-genai package:
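The package is typically installed with pip, for example `pip install -U langchain-google-genai`. As a small sketch, the check below reports whether the module is importable before you continue. It assumes the package exposes the import name `langchain_google_genai`; the helper name `missing_modules` is illustrative.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


missing = missing_modules(["langchain_google_genai"])
if missing:
    # Only a reminder is printed; nothing is installed automatically.
    print("Install first: pip install -U langchain-google-genai")
```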
Instantiation
Now we can instantiate our model object and generate embeddings with embed_query or embed_documents.
Reduced dimensionality
gemini-embedding-2-preview supports flexible output dimensions via Matryoshka Representation Learning (MRL). You can reduce dimensionality to optimize storage and latency:
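The instantiation and dimensionality reduction described above can be sketched as follows, assuming GOOGLE_API_KEY is set and langchain-google-genai is installed. The helper names (`build_embeddings`, `embed_text`, `embedding_storage_bytes`) are illustrative. The storage estimate assumes float32 values and ignores any index overhead.

```python
def build_embeddings(output_dimensionality=None):
    """Create the embeddings client; the API key is read from GOOGLE_API_KEY."""
    # Deferred import so the storage helper below works without the package.
    from langchain_google_genai import GoogleGenerativeAIEmbeddings

    kwargs = {"model": "gemini-embedding-2-preview"}
    if output_dimensionality is not None:
        kwargs["output_dimensionality"] = output_dimensionality
    return GoogleGenerativeAIEmbeddings(**kwargs)


def embed_text(text, output_dimensionality=None):
    """Embed a single string and return the vector as a list of floats."""
    return build_embeddings(output_dimensionality).embed_query(text)


def embedding_storage_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Rough storage estimate for dense vectors (float32 by default)."""
    return num_vectors * dimensions * bytes_per_value
```

With a valid key, `len(embed_text("hello, world!", 256))` is expected to be 256. Under the float32 assumption, `embedding_storage_bytes(1_000_000, 256)` gives about 1.02 GB for one million such vectors, which is the kind of saving the smaller dimension is meant to buy.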
Batch
You can also embed multiple strings at once with embed_documents for a processing speedup.
Indexing and retrieval
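Embedding models are often used in retrieval-augmented generation (RAG) flows, both for indexing data and for retrieving it later. For more detailed instructions, see our RAG tutorials. Below, see how to index and retrieve data using the embeddings object. In this example, we will index and retrieve a sample document in the InMemoryVectorStore.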
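A minimal sketch of that flow, assuming langchain-core is installed. The function name `index_and_retrieve` and its parameters are illustrative. It accepts any LangChain embeddings object, so it works with a GoogleGenerativeAIEmbeddings instance or with a test double.

```python
def index_and_retrieve(embeddings, texts, query, k=1):
    """Index `texts` in an in-memory vector store and return the top-k matches."""
    # Deferred import: langchain-core is only needed when the function is called.
    from langchain_core.vectorstores import InMemoryVectorStore

    # Each text is embedded and stored alongside its vector.
    vectorstore = InMemoryVectorStore.from_texts(texts, embedding=embeddings)

    # The retriever embeds the query and returns the most similar documents.
    retriever = vectorstore.as_retriever(search_kwargs={"k": k})
    return [doc.page_content for doc in retriever.invoke(query)]
```

For example, passing `GoogleGenerativeAIEmbeddings(model="gemini-embedding-2-preview")` as `embeddings`, a single sample text, and the query "What is LangChain?" returns that text as the only match.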
Task type
GoogleGenerativeAIEmbeddings optionally supports a task_type, which currently must be one of:
SEMANTIC_SIMILARITY: Used to generate embeddings that are optimized to assess text similarity.
CLASSIFICATION: Used to generate embeddings that are optimized to classify texts according to preset labels.
CLUSTERING: Used to generate embeddings that are optimized to cluster texts based on their similarities.
RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, QUESTION_ANSWERING, and FACT_VERIFICATION: Used to generate embeddings that are optimized for document search or information retrieval.
CODE_RETRIEVAL_QUERY: Used to retrieve a code block based on a natural language query, such as "sort an array" or "reverse a linked list". Embeddings of the code blocks are computed using RETRIEVAL_DOCUMENT.
By default, we use RETRIEVAL_DOCUMENT in the embed_documents method and RETRIEVAL_QUERY in the embed_query method. If you provide a task type, we will use that for all methods.
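As a sketch of how task types might be used for retrieval, the example below creates one embedder per task type, embeds the documents in a single batched embed_documents call, and ranks them against the query by cosine similarity. It assumes GOOGLE_API_KEY is set. The helper names are illustrative, and the two explicit task types simply mirror the defaults described above, so substitute another value from the list (for example SEMANTIC_SIMILARITY) as needed.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_by_similarity(query_vector, document_vectors):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vector, v) for v in document_vectors]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def search(query, documents, model="gemini-embedding-2-preview"):
    """Embed with task-specific clients and return documents, best match first."""
    # Deferred import so the pure helpers above work without the package.
    from langchain_google_genai import GoogleGenerativeAIEmbeddings

    query_embedder = GoogleGenerativeAIEmbeddings(model=model, task_type="RETRIEVAL_QUERY")
    doc_embedder = GoogleGenerativeAIEmbeddings(model=model, task_type="RETRIEVAL_DOCUMENT")

    query_vector = query_embedder.embed_query(query)
    document_vectors = doc_embedder.embed_documents(documents)  # one batched call
    return [documents[i] for i in rank_by_similarity(query_vector, document_vectors)]
```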
Additional configuration
You can pass the following parameters to GoogleGenerativeAIEmbeddings to customize the SDK’s behavior:
base_url: Custom base URL for the API client (e.g., a custom endpoint)
output_dimensionality: Reduce the dimensionality of returned embeddings (e.g., output_dimensionality=256)
request_options: Request options dict (e.g., {"timeout": 10})
additional_headers: Additional HTTP headers to include in API requests
client_args: Additional arguments to pass to the underlying HTTP client
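The parameters above can be combined in one constructor call. The sketch below assembles only the options you actually set and passes them through. The helper names `embedding_kwargs` and `build_configured_embeddings` are illustrative, and any header name or URL you supply is your own choice rather than something the library requires.

```python
def embedding_kwargs(
    model="gemini-embedding-2-preview",
    base_url=None,
    output_dimensionality=None,
    request_options=None,
    additional_headers=None,
    client_args=None,
):
    """Collect constructor arguments, dropping options that were left unset."""
    options = {
        "base_url": base_url,
        "output_dimensionality": output_dimensionality,
        "request_options": request_options,
        "additional_headers": additional_headers,
        "client_args": client_args,
    }
    return {"model": model, **{k: v for k, v in options.items() if v is not None}}


def build_configured_embeddings(**options):
    """Create a GoogleGenerativeAIEmbeddings client from the selected options."""
    # Deferred import so embedding_kwargs can be used without the package.
    from langchain_google_genai import GoogleGenerativeAIEmbeddings

    return GoogleGenerativeAIEmbeddings(**embedding_kwargs(**options))
```

For instance, `build_configured_embeddings(output_dimensionality=256, request_options={"timeout": 10})` requests 256-dimensional vectors with the timeout option from the list above.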
API reference
For detailed documentation on GoogleGenerativeAIEmbeddings features and configuration options, see the API reference.

