使用 LangGraph 构建自定义 RAG 智能体

概述

在本教程中，我们将使用 LangGraph 构建一个检索智能体。 LangChain 提供了内置的智能体实现，使用 LangGraph 原语实现。如果需要更深层次的自定义，可以直接在 LangGraph 中实现智能体。本指南演示了一个检索智能体的示例实现。检索智能体在你希望大语言模型(LLM)决定是从向量存储中检索上下文还是直接回复用户时非常有用。在本教程结束时，我们将完成以下工作：

获取和预处理用于检索的文档。
为语义搜索索引这些文档并为智能体创建检索工具。
构建一个能够决定何时使用检索工具的智能 RAG 系统。

概念

我们将涵盖以下概念：

使用文档加载器、文本分割器、嵌入和向量存储进行检索
LangGraph 图 API，包括状态、节点、边和条件边。

设置

让我们下载所需的包并设置 API 密钥：

npm install @langchain/langgraph @langchain/openai @langchain/community @langchain/textsplitters

1. 预处理文档

获取用于 RAG 系统的文档。我们将使用 Lilian Weng 的优秀博客中最近的三个页面。我们首先使用 CheerioWebBaseLoader 获取页面内容：

import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";

const urls = [
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
  "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
];

const docs = await Promise.all(
  urls.map((url) => new CheerioWebBaseLoader(url).load()),
);

将获取的文档分割成较小的块以索引到向量存储中：

import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const docsList = docs.flat();

const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 500,
  chunkOverlap: 50,
});
const docSplits = await textSplitter.splitDocuments(docsList);

2. 创建检索工具

现在我们有了分割后的文档，可以将它们索引到向量存储中用于语义搜索。

使用内存向量存储和 OpenAI 嵌入：

import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const vectorStore = await MemoryVectorStore.fromDocuments(
  docSplits,
  new OpenAIEmbeddings(),
);

const retriever = vectorStore.asRetriever();

使用 LangChain 预构建的 createRetrieverTool 创建检索工具：

import { createRetrieverTool } from "@langchain/classic/tools/retriever";

const tool = createRetrieverTool(
  retriever,
  {
    name: "retrieve_blog_posts",
    description:
      "Search and return information about Lilian Weng blog posts on LLM agents, prompt engineering, and adversarial attacks on LLMs.",
  },
);
const tools = [tool];

3. 生成查询

现在我们将开始构建智能 RAG 图的组件（节点和边）。

构建 generateQueryOrRespond 节点。它将调用大语言模型(LLM)根据当前图状态（消息列表）生成响应。根据输入消息，它将决定使用检索工具进行检索，还是直接回复用户。注意我们通过 .bindTools 让聊天模型访问之前创建的 tools：

import { ChatOpenAI } from "@langchain/openai";
import { GraphNode } from "@langchain/langgraph";

const generateQueryOrRespond: GraphNode<typeof State> = async (state) => {
  const model = new ChatOpenAI({
    model: "gpt-5.4",
    temperature: 0,
  }).bindTools(tools);

  const response = await model.invoke(state.messages);
  return {
    messages: [response],
  };
}

在随机输入上试试：

import { HumanMessage } from "@langchain/core/messages";

const input = { messages: [new HumanMessage("hello!")] };
const result = await generateQueryOrRespond(input);
console.log(result.messages[0]);

输出：

AIMessage {
  content: "Hello! How can I help you today?",
  tool_calls: []
}

提一个需要语义搜索的问题：

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?")
  ]
};
const result = await generateQueryOrRespond(input);
console.log(result.messages[0]);

输出：

AIMessage {
  content: "",
  tool_calls: [
    {
      name: "retrieve_blog_posts",
      args: { query: "types of reward hacking" },
      id: "call_...",
      type: "tool_call"
    }
  ]
}

4. 评估文档

添加一个节点——gradeDocuments——来判断检索到的文档是否与问题相关。我们将使用带有 Zod 结构化输出的模型进行文档评分。我们还将添加一个条件边——checkRelevance——检查评分结果并返回下一个要前往的节点名称（generate 或 rewrite）：

import * as z from "zod";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { GraphNode } from "@langchain/langgraph";
import { AIMessage } from "@langchain/core/messages";

const prompt = ChatPromptTemplate.fromTemplate(
  `You are a grader assessing relevance of retrieved docs to a user question.
  Here are the retrieved docs:
  \n ------- \n
  {context}
  \n ------- \n
  Here is the user question: {question}
  If the content of the docs are relevant to the users question, score them as relevant.
  Give a binary score 'yes' or 'no' score to indicate whether the docs are relevant to the question.
  Yes: The docs are relevant to the question.
  No: The docs are not relevant to the question.`,
);

const gradeDocumentsSchema = z.object({
  binaryScore: z.string().describe("Relevance score 'yes' or 'no'"),
})

const gradeDocuments: GraphNode<typeof State> = async (state) => {
  const model = new ChatOpenAI({
    model: "gpt-5.4",
    temperature: 0,
  }).withStructuredOutput(gradeDocumentsSchema);

  const score = await prompt.pipe(model).invoke({
    question: state.messages.at(0)?.content,
    context: state.messages.at(-1)?.content,
  });

  if (score.binaryScore === "yes") {
    return "generate";
  }
  return "rewrite";
}

使用不相关的文档在工具响应中运行：

import { ToolMessage } from "@langchain/core/messages";

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      tool_calls: [
        {
          type: "tool_call",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          id: "1",
        }
      ]
    }),
    new ToolMessage({
      content: "meow",
      tool_call_id: "1",
    })
  ]
}
const result = await gradeDocuments(input);

确认相关文档被正确分类：

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      tool_calls: [
        {
          type: "tool_call",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          id: "1",
        }
      ]
    }),
    new ToolMessage({
      content: "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
      tool_call_id: "1",
    })
  ]
}
const result = await gradeDocuments(input);

5. 重写问题

构建 rewrite 节点。检索工具可能会返回不相关的文档，这表明需要改进原始用户问题。为此，我们将调用 rewrite 节点：

import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { GraphNode } from "@langchain/langgraph";

const rewritePrompt = ChatPromptTemplate.fromTemplate(
  `Look at the input and try to reason about the underlying semantic intent / meaning. \n
  Here is the initial question:
  \n ------- \n
  {question}
  \n ------- \n
  Formulate an improved question:`,
);

const rewrite: GraphNode<typeof State> = async (state) => {
  const question = state.messages.at(0)?.content;

  const model = new ChatOpenAI({
    model: "gpt-5.4",
    temperature: 0,
  });

  const response = await rewritePrompt.pipe(model).invoke({ question });
  return {
    messages: [response],
  };
}

试试看：

import { HumanMessage, AIMessage, ToolMessage } from "@langchain/core/messages";

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      content: "",
      tool_calls: [
        {
          id: "1",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          type: "tool_call"
        }
      ]
    }),
    new ToolMessage({ content: "meow", tool_call_id: "1" })
  ]
};

const response = await rewrite(input);
console.log(response.messages[0].content);

输出：

What are the different types of reward hacking described by Lilian Weng, and how does she explain them?

6. 生成答案

构建 generate 节点：如果通过了评分检查，我们可以根据原始问题和检索到的上下文生成最终答案：

import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { GraphNode } from "@langchain/langgraph";

const generate: GraphNode<typeof State> = async (state) => {
  const question = state.messages.at(0)?.content;
  const context = state.messages.at(-1)?.content;

  const prompt = ChatPromptTemplate.fromTemplate(
  `You are an assistant for question-answering tasks.
      Use the following pieces of retrieved context to answer the question.
      If you don't know the answer, just say that you don't know.
      Use three sentences maximum and keep the answer concise.
      Question: {question}
      Context: {context}`
  );

  const llm = new ChatOpenAI({
    model: "gpt-5.4",
    temperature: 0,
  });

  const ragChain = prompt.pipe(llm);

  const response = await ragChain.invoke({
    context,
    question,
  });

  return {
    messages: [response],
  };
}

试试看：

import { HumanMessage, AIMessage, ToolMessage } from "@langchain/core/messages";

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      content: "",
      tool_calls: [
        {
          id: "1",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          type: "tool_call"
        }
      ]
    }),
    new ToolMessage({
      content: "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
      tool_call_id: "1"
    })
  ]
};

const response = await generate(input);
console.log(response.messages[0].content);

输出：

Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.

7. 组装图

现在我们将所有节点和边组装成一个完整的图：

从 generateQueryOrRespond 开始，确定是否需要调用检索工具
使用条件边路由到下一步：
- 如果 generateQueryOrRespond 返回了 tool_calls，调用检索工具检索上下文
- 否则，直接回复用户
评估检索到的文档内容与问题的相关性（gradeDocuments）并路由到下一步：
- 如果不相关，使用 rewrite 重写问题，然后再次调用 generateQueryOrRespond
- 如果相关，继续到 generate，使用包含检索文档上下文的 ToolMessage 生成最终响应

import { StateGraph, START, END, ConditionalEdgeRouter } from "@langchain/langgraph";
import { ToolNode } from "@langchain/langgraph/prebuilt";
import { AIMessage } from "langchain";

// 为检索器创建 ToolNode
const toolNode = new ToolNode(tools);

// 辅助函数，确定是否应该检索
const shouldRetrieve: ConditionalEdgeRouter<typeof State, "retrieve"> = (state) => {
  const lastMessage = state.messages.at(-1);
  if (AIMessage.isInstance(lastMessage) && lastMessage.tool_calls.length) {
    return "retrieve";
  }
  return END;
}

// 定义图
const builder = new StateGraph(GraphState)
  .addNode("generateQueryOrRespond", generateQueryOrRespond)
  .addNode("retrieve", toolNode)
  .addNode("gradeDocuments", gradeDocuments)
  .addNode("rewrite", rewrite)
  .addNode("generate", generate)
  // 添加边
  .addEdge(START, "generateQueryOrRespond")
  // 决定是否检索
  .addConditionalEdges("generateQueryOrRespond", shouldRetrieve)
  .addEdge("retrieve", "gradeDocuments")
  // 文档评分后的边
  .addConditionalEdges(
    "gradeDocuments",
    // 基于评分决策路由
    (state) => {
      // gradeDocuments 函数返回 "generate" 或 "rewrite"
      const lastMessage = state.messages.at(-1);
      return lastMessage.content === "generate" ? "generate" : "rewrite";
    }
  )
  .addEdge("generate", END)
  .addEdge("rewrite", "generateQueryOrRespond");

// 编译
const graph = builder.compile();

8. 运行智能 RAG

现在让我们用一个问题测试完整的图：

import { HumanMessage } from "@langchain/core/messages";

const inputs = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?")
  ]
};

for await (const output of await graph.stream(inputs)) {
  for (const [key, value] of Object.entries(output)) {
    const lastMsg = output[key].messages[output[key].messages.length - 1];
    console.log(`节点输出: '${key}'`);
    console.log({
      type: lastMsg._getType(),
      content: lastMsg.content,
      tool_calls: lastMsg.tool_calls,
    });
    console.log("---\n");
  }
}

输出：

节点输出: 'generateQueryOrRespond'
{
  type: 'ai',
  content: '',
  tool_calls: [
    {
      name: 'retrieve_blog_posts',
      args: { query: 'types of reward hacking' },
      id: 'call_...',
      type: 'tool_call'
    }
  ]
}
---

节点输出: 'retrieve'
{
  type: 'tool',
  content: '(Note: Some work defines reward tampering as a distinct category...\n' +
    'At a high level, reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering.\n' +
    '...',
  tool_calls: undefined
}
---

节点输出: 'generate'
{
  type: 'ai',
  content: 'Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.',
  tool_calls: []
}
---

将这些文档连接到 Claude、VSCode 等，通过 MCP 获取实时答案。

在 GitHub 上编辑此页面或提交问题。

Documentation Index

​概述

​概念

​设置

​1. 预处理文档

​2. 创建检索工具

​3. 生成查询

​4. 评估文档

​5. 重写问题

​6. 生成答案

​7. 组装图

​8. 运行智能 RAG

概述

概念

设置

1. 预处理文档

2. 创建检索工具

3. 生成查询

4. 评估文档

5. 重写问题

6. 生成答案

7. 组装图

8. 运行智能 RAG