ChatHuggingFace integration - Docs by LangChain

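This guide will help you get started with langchain_huggingface chat models. For detailed documentation of all ChatHuggingFace features and configurations, head to the API reference. For a list of models supported by Hugging Face, check out this page.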

Overview

Integration details

Class	Package	Serializable	JS support	Downloads	Version
`ChatHuggingFace`	`langchain-huggingface`	beta	❌

Model features

Tool calling	Structured output	Image input	Audio input	Video input	Token-level streaming	Native async	Token usage	Logprobs
✅	✅	✅	✅	✅	❌	✅	✅	❌

Setup

To access Hugging Face models, you'll need to create a Hugging Face account, get an API key, and install the langchain-huggingface integration package.

Credentials

Generate a Hugging Face Access Token and store it as an environment variable: HUGGINGFACEHUB_API_TOKEN.

import getpass
import os

if not os.getenv("HUGGINGFACEHUB_API_TOKEN"):
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass.getpass("Enter your token: ")
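
In non-interactive settings such as CI jobs or scheduled scripts, `getpass` cannot prompt for input. A minimal sketch of a fail-fast check follows, assuming a hypothetical helper named `require_hf_token` (it is not part of langchain-huggingface):

```python
import os


def require_hf_token(env=None):
    """Return the Hugging Face token from the environment, or raise if missing.

    `require_hf_token` is a hypothetical helper written for illustration only.
    """
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACEHUB_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HUGGINGFACEHUB_API_TOKEN is not set")
    return token
```

Calling `require_hf_token()` at startup surfaces a missing token immediately instead of at the first model request.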

Installation

pip install -qU langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2 bitsandbytes accelerate
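
After installing, it can help to confirm that the main packages are importable before loading a model. The sketch below uses only the standard library. The helper `missing_modules` is hypothetical, and the import names in `required` are assumed to match a few of the packages installed above:

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names assumed for a few of the packages installed above.
required = ["langchain_huggingface", "transformers", "accelerate", "bitsandbytes"]
print("Missing modules:", missing_modules(required))
```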

Instantiation

You can instantiate a ChatHuggingFace model in two different ways, either from a HuggingFaceEndpoint or from a HuggingFacePipeline.

`HuggingFaceEndpoint`

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
    provider="auto",  # let Hugging Face choose the best provider for you
)

chat_model = ChatHuggingFace(llm=llm)

Now let's take advantage of Inference Providers to run the model on a specific third-party provider:

llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    provider="hyperbolic",  # set your provider here
    # provider="nebius",
    # provider="together",
)

chat_model = ChatHuggingFace(llm=llm)
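
Rather than hard-coding the provider, you may prefer to read it from configuration. The following is a minimal sketch, assuming a hypothetical environment variable `HF_INFERENCE_PROVIDER` and a hypothetical helper `endpoint_kwargs`. Neither is defined by langchain-huggingface, and the instantiation itself is left commented out so the snippet runs offline:

```python
import os


def endpoint_kwargs(repo_id, env=None):
    """Build keyword arguments for HuggingFaceEndpoint.

    HF_INFERENCE_PROVIDER is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    return {
        "repo_id": repo_id,
        "task": "text-generation",
        # Fall back to "auto", which lets Hugging Face choose the provider.
        "provider": env.get("HF_INFERENCE_PROVIDER", "auto"),
    }


kwargs = endpoint_kwargs("deepseek-ai/DeepSeek-R1-0528")
# llm = HuggingFaceEndpoint(**kwargs)
# chat_model = ChatHuggingFace(llm=llm)
```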

`HuggingFacePipeline`

from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
    ),
)

chat_model = ChatHuggingFace(llm=llm)

Instantiating with quantization

To run a quantized version of your model, you can specify a bitsandbytes quantization config as follows:

from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=True,
)

and pass it to the HuggingFacePipeline as a part of its model_kwargs:

llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
        return_full_text=False,
    ),
    model_kwargs={"quantization_config": quantization_config},
)

chat_model = ChatHuggingFace(llm=llm)
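
To see why 4-bit loading helps, a rough estimate of weight memory can be computed from the parameter count. This sketch assumes about 7 billion parameters for the model above and counts the weights only. It ignores quantization constants, activations, and the KV cache, so real usage will be higher:

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 7e9  # assumed parameter count for a 7B model
print(f"float16: {weight_memory_gb(params, 16):.1f} GB")
print(f"4-bit:   {weight_memory_gb(params, 4):.1f} GB")
```

Under these assumptions the weights shrink from roughly 14 GB to roughly 3.5 GB.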

Invocation

from langchain.messages import (
    HumanMessage,
    SystemMessage,
)

messages = [
    SystemMessage(content="You're a helpful assistant"),
    HumanMessage(
        content="What happens when an unstoppable force meets an immovable object?"
    ),
]

ai_msg = chat_model.invoke(messages)

print(ai_msg.content)

According to the popular phrase and hypothetical scenario, when an unstoppable force meets an immovable object, a paradoxical situation arises as both forces are seemingly contradictory. On one hand, an unstoppable force is an entity that cannot be stopped or prevented from moving forward, while on the other hand, an immovable object is something that cannot be moved or displaced from its position.

In this scenario, it is un
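
LangChain chat models also accept messages as `(role, content)` tuples, which can be convenient in small helper functions. The sketch below wraps invocation in a hypothetical `ask` helper. A stub class, `EchoModel`, stands in for `chat_model` so the example runs without downloading a model or calling an endpoint:

```python
from types import SimpleNamespace


def ask(chat_model, question, system="You're a helpful assistant"):
    """Send one system/human message pair and return the reply text."""
    messages = [("system", system), ("human", question)]
    return chat_model.invoke(messages).content


class EchoModel:
    """Stand-in for a chat model; it echoes the content of the last message."""

    def invoke(self, messages):
        return SimpleNamespace(content=f"echo: {messages[-1][1]}")


print(ask(EchoModel(), "Hello"))  # swap EchoModel() for chat_model in real use
```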

API reference

For detailed documentation of all ChatHuggingFace features and configurations, head to the API reference.
