This guide will help you get started with AWS Bedrock chat models.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don’t have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.
AWS Bedrock maintains a Converse API that provides a unified conversational interface for Bedrock models. This API does not yet support custom models. You can see a list of all models that are supported here.
We recommend the Converse API for users who do not need to use custom models. It can be accessed using ChatBedrockConverse.
Anthropic models on Bedrock: For Anthropic models specifically, you can use ChatAnthropicBedrock, which extends ChatAnthropic and provides the same API while running on AWS Bedrock. See the ChatAnthropicBedrock section below for details.
For detailed documentation of all Bedrock features and configurations, head to the API reference.

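Overview

Integration details

Class | Package | Serializable | JS support | Downloads | Version
ChatBedrock | langchain-aws | beta | | PyPI - Downloads | PyPI - Version
ChatBedrockConverse | langchain-aws | beta | | PyPI - Downloads | PyPI - Version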

Model features

The feature table below applies to both ChatBedrock and ChatBedrockConverse.
Tool calling | Structured output | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs

Setup

To access Bedrock models, you'll need to create an AWS account, set up the Bedrock API service, get an access key ID and secret key, and install the langchain-aws integration package.

Credentials

Head to the AWS docs to sign up for AWS and set up your credentials. Alternatively, ChatBedrockConverse will read from the following environment variables by default:
# os.environ["AWS_ACCESS_KEY_ID"] = "..."
# os.environ["AWS_SECRET_ACCESS_KEY"] = "..."

# Not required unless using temporary credentials.
# os.environ["AWS_SESSION_TOKEN"] = "..."
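Before instantiating a model, it can help to check which of these variables are actually set. The helper below is a minimal sketch: `missing_aws_env` is a hypothetical name, not part of langchain-aws, and it only inspects environment variables. A non-empty result is not necessarily an error, since credentials may also be supplied another way (for example, passed directly to the model constructor).

```python
import os

# Variables named in this guide. AWS_SESSION_TOKEN is left out because it is
# only required for temporary credentials.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env(env=None):
    """Return the required AWS variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"AWS_ACCESS_KEY_ID": "example-key-id", "AWS_SECRET_ACCESS_KEY": ""}
print(missing_aws_env(example_env))  # ['AWS_SECRET_ACCESS_KEY']
```

Calling `missing_aws_env()` with no argument checks the current process environment.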
You’ll also need to turn on model access for your account, which you can do by following these instructions.
To enable automated tracing of your model calls, set your LangSmith API key:
import getpass
import os
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain Bedrock integration lives in the langchain-aws package:
pip install -qU langchain-aws

Instantiation

Now we can instantiate our model object and generate chat completions:
from langchain_aws import ChatBedrockConverse

llm = ChatBedrockConverse(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    # region_name=...,
    # aws_access_key_id=...,
    # aws_secret_access_key=...,
    # aws_session_token=...,
    # temperature=...,
    # max_tokens=...,
    # other params...
)

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", additional_kwargs={}, response_metadata={'ResponseMetadata': {'RequestId': 'b07d1630-06f2-44b1-82bf-e82538dd2215', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Wed, 16 Apr 2025 19:35:34 GMT', 'content-type': 'application/json', 'content-length': '206', 'connection': 'keep-alive', 'x-amzn-requestid': 'b07d1630-06f2-44b1-82bf-e82538dd2215'}, 'RetryAttempts': 0}, 'stopReason': 'end_turn', 'metrics': {'latencyMs': [488]}, 'model_name': 'anthropic.claude-3-5-sonnet-20240620-v1:0'}, id='run-d09ed928-146a-4336-b1fd-b63c9e623494-0', usage_metadata={'input_tokens': 29, 'output_tokens': 11, 'total_tokens': 40, 'input_token_details': {'cache_creation': 0, 'cache_read': 0}})
print(ai_msg.content)
J'adore la programmation.

Streaming

Note that ChatBedrockConverse emits content blocks while streaming:
for chunk in llm.stream(messages):
    print(chunk)
content=[] additional_kwargs={} response_metadata={} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd'
content=[{'type': 'text', 'text': 'J', 'index': 0}] additional_kwargs={} response_metadata={} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd'
content=[{'type': 'text', 'text': "'adore la", 'index': 0}] additional_kwargs={} response_metadata={} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd'
content=[{'type': 'text', 'text': ' programmation.', 'index': 0}] additional_kwargs={} response_metadata={} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd'
content=[{'index': 0}] additional_kwargs={} response_metadata={} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd'
content=[] additional_kwargs={} response_metadata={'stopReason': 'end_turn'} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd'
content=[] additional_kwargs={} response_metadata={'metrics': {'latencyMs': 600}, 'model_name': 'anthropic.claude-3-5-sonnet-20240620-v1:0'} id='run-d0e0836e-7146-4c3d-97c7-ad23dac6febd' usage_metadata={'input_tokens': 29, 'output_tokens': 11, 'total_tokens': 40, 'input_token_details': {'cache_creation': 0, 'cache_read': 0}}
You can filter to text using the text property on the output:
for chunk in llm.stream(messages):
    print(chunk.text, end="|")
|J|'adore la| programmation.||||
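The same filtering can be done by hand when you want to work with the raw content blocks. The sketch below uses plain dictionaries shaped like the chunk content printed above rather than real chunk objects, and `text_from_blocks` is a hypothetical helper, not a langchain-aws function.

```python
def text_from_blocks(content):
    """Concatenate the 'text' fields of text-type content blocks."""
    if isinstance(content, str):  # tolerate content that is already a string
        return content
    return "".join(
        block.get("text", "")
        for block in content
        if isinstance(block, dict) and block.get("type") == "text"
    )


# Content lists copied from the streamed chunks shown above.
streamed_contents = [
    [],
    [{"type": "text", "text": "J", "index": 0}],
    [{"type": "text", "text": "'adore la", "index": 0}],
    [{"type": "text", "text": " programmation.", "index": 0}],
    [{"index": 0}],  # block without text is skipped
    [],
]

full_text = "".join(text_from_blocks(c) for c in streamed_contents)
print(full_text)  # J'adore la programmation.
```

With a live stream, the same helper would be applied to `chunk.content` inside the loop.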

Streaming tool calls and structured output

When using tool calling or structured output with Anthropic models, tool call arguments stream as partial JSON chunks by default. To reduce latency and get more evenly distributed chunks, you can enable Anthropic’s fine-grained tool streaming beta:
from langchain_aws import ChatBedrockConverse

llm = ChatBedrockConverse(
    model_id="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    additional_model_request_fields={
        "anthropic_beta": ["fine-grained-tool-streaming-2025-05-14"]
    }
)
Fine-grained tool streaming is supported on Claude 4.5+ models. See the Claude documentation for more details.
When using fine-grained tool streaming, you may receive invalid or partial JSON inputs. Make sure to account for these edge cases in your code.
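One way to account for partial or invalid JSON is to buffer the argument text and only act on it once it parses to an object. The following is a minimal sketch under that assumption: `parse_tool_args` is a hypothetical helper, and the fragments are made-up stand-ins for argument text arriving across chunks.

```python
import json


def parse_tool_args(raw_json):
    """Parse accumulated tool-call argument text.

    Returns (args, True) on success, or ({}, False) if the text is not yet
    valid JSON or does not decode to a JSON object.
    """
    try:
        parsed = json.loads(raw_json)
    except json.JSONDecodeError:
        return {}, False
    if not isinstance(parsed, dict):
        return {}, False
    return parsed, True


# Simulated argument fragments arriving over several chunks.
fragments = ['{"location": "Par', 'is", "unit": ', '"celsius"}']

buffer = ""
for fragment in fragments:
    buffer += fragment
    args, complete = parse_tool_args(buffer)
    print(complete, args)
```

Only the final iteration reports `True`; if the stream ends while the flag is still `False`, the caller can retry the request or surface an error instead of invoking the tool with incomplete arguments.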

Extended thinking

This guide focuses on implementing Extended Thinking using AWS Bedrock with LangChain’s ChatBedrockConverse integration.

Supported models

Extended Thinking is available for the following Claude models on AWS Bedrock:
Model | Model ID
Claude Opus 4 | anthropic.claude-opus-4-20250514-v1:0
Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0
Claude 3.7 Sonnet | us.anthropic.claude-3-7-sonnet-20250219-v1:0
from langchain_aws import ChatBedrockConverse

llm = ChatBedrockConverse(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    region_name="us-west-2",
    max_tokens=4096,
    additional_model_request_fields={
        "thinking": {"type": "enabled", "budget_tokens": 1024},
    },
)

ai_msg = llm.invoke(messages)
ai_msg.content_blocks
[{'type': 'reasoning',
  'reasoning': 'The user wants me to translate "I love programming" from English to French.\n\n"I love" translates to "J\'aime" in French.\n"Programming" translates to "la programmation" in French.\n\nSo the full translation would be "J\'aime la programmation."',
  'extras': {'signature': 'EpkDCkgIBxABGAIqQGI0KGz8LoVaFwqSAYPN7N+FecI1ZGtb0zpfPr5F8Sb1yxtQHQlmbKUS8JByenWCFGpRKigNaQh1+rLZ59GEX/sSDB+6gxZAT24DJrq4pxoMySVhzwALI6FEC+1UIjDcozOIznjRTYlDWPcYUNYvpt8rwF9IHE38Ha2uqVY8ROJa1tjOMk3OEnbSoV13Pa8q/gETsz+1UwxNX5tgxOa+38jLEryhdFyyAk2JDLrmluZBM6TMrtyzALQvVbZqjpkKAXdtcVCrsz8zUo/LZT1B/92Ukux2dE0O1ZOdcW3tORK+NFLSBaWuqigcFUTDH9XNQoHd2WpQNhl+ypnCItbL2wDRscN/tEBkgGMQugvPmL0LAuLKBmsRKStKRi/RMYGJb3Ft2yEDsRnYNJBJ6TtgxXFvjDwqc/UaI9cIcTxdoVVlsPFsYccpVwirzwAOiz6CSQ1oOQTYJVT90eQ71QW74n1ubbFIZAvDBKk0KG8jK1FGx4FpuuZyFhBpXtfrgOCdrlVSAO/EE9fKCbP9FlhPbRgB'}},
 {'type': 'text', 'text': "J'aime la programmation."}]

How extended thinking works

When extended thinking is turned on, Claude creates thinking content blocks where it outputs its internal reasoning. Claude incorporates insights from this reasoning before crafting a final response. The API response will include thinking content blocks, followed by text content blocks.
next_messages = messages + [("ai", ai_msg.content), ("human", "I love AI")]

ai_msg = llm.invoke(next_messages)
ai_msg.content_blocks
[{'type': 'reasoning',
  'reasoning': 'The user wants me to translate "I love AI" from English to French. \n\n"I love" translates to "J\'aime" in French.\n"AI" stands for "Artificial Intelligence" which in French is "Intelligence Artificielle" or "IA" (the French abbreviation).\n\nSo the translation would be "J\'aime l\'IA" or "J\'aime l\'intelligence artificielle".\n\nI think using the abbreviation "IA" would be more natural and concise, similar to how the user used "AI" in English.',
  'extras': {'signature': 'EuAECkgIBxABGAIqQLWbkzJ8RzfxhVN1BhfRj5+On8/M9Utt0yH9kvj9P2zlQkO5xloq6I/AiEeArwwdJeqJVcLRjqLtinh6HIBbSDwSDFwt0GL409TqjSZNBhoMPQtJdZmx/uiPrLHUIjCJXyyjgSK3vzbcSEnsvo7pdpoo+waUFrAPDCGL/CIN5u7c8ueLCuCn8W0qGGc+BNgqxQO6UbV11RnMdnUyFmVgTPJErfzBr6U6KyUHd5dJmFWIUVpbbxT2C9vawpbKMPThaRW3BhItEafWGUpPqztzFhqJpSegXtXehIn5iY4yHzTUZ5FPdkNIuAmTsFNNGxiKr9H/gqknvQ2B7I4ushRHLg+drU4cH18EGZlAo5Tu1O9yH5GbweIEew4Uv7oWje+R8TIku0OFVhrbnQqqqukBicMV2JRifUYuz6dYM1UDYS8SfxQ1MmcVY5t1L9LDpoL4F/CtpL8/6YDsB/FosU37Qc1qm+D+pKEPTYnyxaP5tRXqTBfqUIiNJGqr9Egl17Akoy6NIv234rPfuf8HjTcu5scZoPGhOreG5rWxJ7AbTCIXgGWqpcf2TqDtniOac3jW4OtnlID9fsloKNq6Y5twgXHDR47c4Jh6vWmucZiIlL6hkklQzt5To6vOnqcTOGUtuCis8Y2wRzlNGeR2d8A+ocYm7mBvR/Y5DvDgstJwB/vCLoQlIL+jm6+h8k6EX/24GqOsh5hxsS5IsNIob/p8tr4TBbc9noCoUSYkMhbQPi2xpRrNML9GUIo7Skbh1ni67uqeShj1xuUrFG+cN6x4yzDaRb59LCAYAQ=='}},
 {'type': 'text', 'text': "J'aime l'IA."}]
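Because reasoning blocks come before text blocks in the response, applications often want to separate the two, for example to log the reasoning and show only the answer. The sketch below works on plain dictionaries in the shape printed above; `split_reasoning` is a hypothetical helper, and the signature value is shortened for readability.

```python
def split_reasoning(content_blocks):
    """Return (reasoning_text, answer_text) from a list of content blocks."""
    reasoning = [
        block.get("reasoning", "")
        for block in content_blocks
        if block.get("type") == "reasoning"
    ]
    answer = [
        block.get("text", "")
        for block in content_blocks
        if block.get("type") == "text"
    ]
    return "\n".join(reasoning), "".join(answer)


# Abbreviated blocks in the shape shown above.
blocks = [
    {
        "type": "reasoning",
        "reasoning": 'The translation would be "J\'aime l\'IA".',
        "extras": {"signature": "..."},
    },
    {"type": "text", "text": "J'aime l'IA."},
]

reasoning, answer = split_reasoning(blocks)
print(answer)  # J'aime l'IA.
```

With a real response, the input would be `ai_msg.content_blocks`.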

Prompt caching

Bedrock supports caching of elements of your prompts, including messages and tools. This allows you to re-use large documents, instructions, few-shot documents, and other data to reduce latency and costs.
Not all models support prompt caching. See Bedrock prompt caching supported models.
To enable caching on an element of a prompt, mark its associated content block using the cachePoint key. See example below:
import requests
from langchain_aws import ChatBedrockConverse

llm = ChatBedrockConverse(model="us.anthropic.claude-sonnet-4-6")

# Pull LangChain readme
get_response = requests.get(
    "https://raw.githubusercontent.com/langchain-ai/langchain/b476fdb54aa6e6f5f0b24a68c2f4a94e43b369f9/README.md"
)
readme = get_response.text

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's LangChain, according to its README?",
            },
            {
                "type": "text",
                "text": f"{readme}",
            },
            {
                "cachePoint": {"type": "default"},
            },
        ],
    },
]

response_1 = llm.invoke(messages)
response_2 = llm.invoke(messages)

usage_1 = response_1.usage_metadata["input_token_details"]
usage_2 = response_2.usage_metadata["input_token_details"]

print(f"First invocation:\n{usage_1}")
print(f"\nSecond:\n{usage_2}")
First invocation:
{'cache_creation': 1528, 'cache_read': 0}

Second:
{'cache_creation': 0, 'cache_read': 1528}
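The pattern above can be wrapped in two small helpers: one that appends a cache point to a list of content blocks, and one that classifies a call from its token details. Both `with_cache_point` and `cache_status` are hypothetical names introduced for illustration; they only manipulate dictionaries in the shapes shown in this section.

```python
def with_cache_point(content_blocks):
    """Return a copy of the content list with a default cache point appended."""
    return [*content_blocks, {"cachePoint": {"type": "default"}}]


def cache_status(input_token_details):
    """Classify one call as 'read', 'write', or 'none' from its token details.

    A call that both reads and writes cache entries is reported as 'read'.
    """
    if input_token_details.get("cache_read", 0) > 0:
        return "read"
    if input_token_details.get("cache_creation", 0) > 0:
        return "write"
    return "none"


content = with_cache_point([{"type": "text", "text": "A long, reusable document..."}])
print(content[-1])  # {'cachePoint': {'type': 'default'}}

# Token details copied from the two invocations above.
print(cache_status({"cache_creation": 1528, "cache_read": 0}))  # write
print(cache_status({"cache_creation": 0, "cache_read": 1528}))  # read
```

Seeing `none` on a repeated call with an unchanged prefix suggests that no cache entry was created or read for that request.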

Citations

Citations can be generated if they are enabled on input documents. Documents can be specified in Bedrock’s native format or LangChain’s standard types:
from langchain_aws import ChatBedrockConverse

llm = ChatBedrockConverse(model="us.anthropic.claude-sonnet-4-20250514-v1:0")

pdf_path = "path/to/your/file.pdf"

with open(pdf_path, "rb") as f:
    pdf_bytes = f.read()

document = {
    "document": {
        "format": "pdf",
        "source": {"bytes": pdf_bytes},
        "name": "my-pdf",
        "citations": {"enabled": True},
    },
}

response = llm.invoke(
    [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this document."},
                document,
            ]
        },
    ]
)
response.content_blocks
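When several files are sent with citations enabled, the document block above can be produced by a small builder. This is a sketch only: `pdf_document_block` and `user_message` are hypothetical helpers that reproduce the dictionary shapes from the example above, and the bytes used here are a placeholder rather than a real PDF.

```python
def pdf_document_block(pdf_bytes, name, citations=True):
    """Build a Bedrock-native document content block for a PDF."""
    return {
        "document": {
            "format": "pdf",
            "source": {"bytes": pdf_bytes},
            "name": name,
            "citations": {"enabled": citations},
        },
    }


def user_message(question, *documents):
    """Build one user message: a text question followed by document blocks."""
    return {
        "role": "user",
        "content": [{"type": "text", "text": question}, *documents],
    }


placeholder_bytes = b"%PDF-1.4 placeholder"  # not a real PDF; read your file instead
message = user_message(
    "Describe this document.",
    pdf_document_block(placeholder_bytes, "my-pdf"),
)
print(message["content"][1]["document"]["name"])  # my-pdf
```

The resulting message can be passed to the model as `llm.invoke([message])`, mirroring the call shown above.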

ChatAnthropicBedrock

For AWS Bedrock users specifically interested in Anthropic models, langchain-aws provides ChatAnthropicBedrock. This class extends ChatAnthropic and provides the same interface while running on AWS Bedrock infrastructure. This takes advantage of the Anthropic SDK’s Bedrock clients.

Installation

Install langchain-aws with the anthropic extra to get the required dependencies:
pip install --upgrade "langchain-aws[anthropic]"

Usage

ChatAnthropicBedrock supports the same features and parameters as ChatAnthropic. You can initialize it with AWS-specific parameters:
from langchain_aws import ChatAnthropicBedrock

model = ChatAnthropicBedrock(
    model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
    region_name="us-west-2",
    aws_access_key_id="...",
    aws_secret_access_key="...",
    aws_session_token="...",
)
AWS credentials can also be read from environment variables or discovered automatically by boto3:
# Set environment variables
# os.environ["AWS_ACCESS_KEY_ID"] = "..."
# os.environ["AWS_SECRET_ACCESS_KEY"] = "..."
# os.environ["AWS_REGION"] = "..."

from langchain_aws import ChatAnthropicBedrock

model = ChatAnthropicBedrock(model="us.anthropic.claude-haiku-4-5-20251001-v1:0")
For detailed documentation on available parameters and features, refer to the ChatAnthropic integration page.

API reference

For detailed documentation of all ChatBedrock, ChatBedrockConverse, and ChatAnthropicBedrock features and configurations, head to the API reference.