Sandboxes - Docs by LangChain

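Agents generate code, interact with filesystems, and run shell commands. Because we cannot predict what an agent might do, it is important to isolate its environment so that it cannot access credentials, files, or the network. Sandboxes provide this isolation by creating a boundary between the agent's execution environment and your host system. In Deep Agents, sandboxes are backends that define the environment where the agent operates. Unlike other backends (State, Filesystem, Store) which only expose file operations, sandbox backends also give the agent an execute tool for running shell commands. When you configure a sandbox backend, the agent gets: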

All standard filesystem tools (ls, read_file, write_file, edit_file, glob, grep)
The execute tool for running arbitrary shell commands in the sandbox
A secure boundary that protects your host system

Why use sandboxes?

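Sandboxes exist for security. They let agents execute arbitrary code, access files, and use the network without compromising your credentials, local files, or host system. This isolation is essential when agents run autonomously. Sandboxes are especially useful for: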

Coding agents: Agents that run autonomously can use shell, git, clone repositories (many providers offer native git APIs, e.g., Daytona’s git operations), and run Docker-in-Docker for build and test pipelines
Data analysis agents: Agents can load files, install data analysis libraries (pandas, numpy, etc.), run statistical calculations, and create outputs like PowerPoint presentations in a safe, isolated environment

Using the Deep Agents CLI? The CLI has built-in sandbox support via the --sandbox flag. See Use remote sandboxes for CLI-specific setup, flags (--sandbox-id, --sandbox-setup), and examples.

Basic usage

These examples assume you have already created a sandbox/devbox using the provider’s SDK and have credentials set up. For signup, authentication, and provider-specific lifecycle details, see Available providers.

Modal
Runloop
Daytona
LangSmith

pip install langchain-modal

import modal
from deepagents import create_deep_agent
from langchain_anthropic import ChatAnthropic
from langchain_modal import ModalSandbox

app = modal.App.lookup("your-app")
modal_sandbox = modal.Sandbox.create(app=app)
backend = ModalSandbox(sandbox=modal_sandbox)

agent = create_deep_agent(
    model=ChatAnthropic(model="claude-sonnet-4-6"),
    system_prompt="You are a Python coding assistant with sandbox access.",
    backend=backend,
)
try:
    result = agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Create a small Python package and run pytest",
                }
            ]
        }
    )
finally:
    modal_sandbox.terminate()

pip install langchain-runloop

import os

from deepagents import create_deep_agent
from langchain_anthropic import ChatAnthropic
from langchain_runloop import RunloopSandbox
from runloop_api_client import RunloopSDK

client = RunloopSDK(bearer_token=os.environ["RUNLOOP_API_KEY"])

devbox = client.devbox.create()
backend = RunloopSandbox(devbox=devbox)

agent = create_deep_agent(
    model=ChatAnthropic(model="claude-sonnet-4-6"),
    system_prompt="You are a Python coding assistant with sandbox access.",
    backend=backend,
)

try:
    result = agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Create a small Python package and run pytest",
                }
            ]
        }
    )
finally:
    devbox.shutdown()

pip install langchain-daytona

from daytona import Daytona
from deepagents import create_deep_agent
from langchain_anthropic import ChatAnthropic
from langchain_daytona import DaytonaSandbox

sandbox = Daytona().create()
backend = DaytonaSandbox(sandbox=sandbox)

agent = create_deep_agent(
    model=ChatAnthropic(model="claude-sonnet-4-6"),
    system_prompt="You are a Python coding assistant with sandbox access.",
    backend=backend,
)

try:
    result = agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Create a small Python package and run pytest",
                }
            ]
        }
    )
finally:
    sandbox.stop()

LangSmith sandboxes are currently in private beta.

pip install "langsmith[sandbox]"

from deepagents import create_deep_agent
from deepagents.backends import LangSmithSandbox
from langchain_anthropic import ChatAnthropic
from langsmith.sandbox import SandboxClient

client = SandboxClient()
ls_sandbox = client.create_sandbox(template_name="my-template")
backend = LangSmithSandbox(sandbox=ls_sandbox)

agent = create_deep_agent(
    model=ChatAnthropic(model="claude-sonnet-4-6"),
    system_prompt="You are a Python coding assistant with sandbox access.",
    backend=backend,
)
try:
    result = agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Create a small Python package and run pytest",
                }
            ]
        }
    )
finally:
    client.delete_sandbox(ls_sandbox.name)

Available providers

For provider-specific setup, authentication, and lifecycle details, see sandbox integrations. Don’t see your provider? You can implement your own sandbox backend. See Contributing a sandbox integration.

Lifecycle and scoping

Most applications choose either one sandbox per thread (thread-scoped) or one shared sandbox for every thread on the same assistant (assistant-scoped). Sandboxes consume resources and cost money until they are shut down. Make sure you shut sandboxes down once they are no longer in use. For the full lifecycle table, async graph factory notes, TTL behavior, LangGraph Deployment wiring, and client-side examples, see Sandbox lifecycle in Going to production.

Thread-scoped (default)

Each conversation gets its own sandbox. The first run creates it; follow-up turns on the same thread reuse it. When the thread ends or the sandbox TTL expires, the environment goes away. Store the mapping with provider labels or metadata as in the following example so each run resolves to the same sandbox.

When users can return after idle time, configure a TTL on the sandbox so the provider deletes or archives idle environments automatically.

Python
TypeScript

agent.py

from daytona import CreateSandboxFromSnapshotParams, Daytona
from deepagents import create_deep_agent
from langchain_core.runnables import RunnableConfig
from langchain_daytona import DaytonaSandbox

client = Daytona()


async def agent(config: RunnableConfig):
    thread_id = config["configurable"]["thread_id"]
    # Daytona() is the synchronous client, so its methods are called without await.
    try:
        # Reuse the sandbox already labeled with this thread, if one exists.
        sandbox = client.find_one(labels={"thread_id": thread_id})
    except Exception:
        # No match: create a sandbox labeled for this thread.
        sandbox = client.create(
            CreateSandboxFromSnapshotParams(
                labels={"thread_id": thread_id},
                auto_delete_interval=3600,  # TTL: clean up when idle
            )
        )
    return create_deep_agent(
        model="google_genai:gemini-3.1-pro-preview",
        backend=DaytonaSandbox(sandbox=sandbox),
    )

src/agent.ts

import { Daytona } from "@daytonaio/sdk";
import { DaytonaSandbox } from "@langchain/daytona";
import { createDeepAgent } from "deepagents";
import type { LangGraphRunnableConfig } from "@langchain/langgraph";

const client = new Daytona();

export async function agent(config: LangGraphRunnableConfig) {
  const threadId = config.configurable?.thread_id as string;
  let sandbox;
  try {
    sandbox = await client.findOne({ labels: { thread_id: threadId } });
  } catch {
    sandbox = await client.create({
      labels: { thread_id: threadId },
      autoDeleteInterval: 3600, // TTL: clean up when idle
    });
  }
  return createDeepAgent({
    model: "google_genai:gemini-3.1-pro-preview",
    backend: await DaytonaSandbox.fromId(sandbox.id),
  });
}

Assistant-scoped

Every thread on the same assistant reuses one sandbox. Files, installed packages, and cloned repositories persist across conversations.

Assistant-scoped sandboxes accumulate in-sandbox state over time. Configure a TTL with your sandbox provider, use snapshots to reset periodically, or implement cleanup logic so disk and memory do not grow without bound.
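The "implement cleanup logic" option can be sketched as a small registry that maps a scope key (a thread ID or an assistant ID) to a sandbox and shuts down entries that have been idle longer than a TTL. This is a minimal sketch, not part of Deep Agents or any provider SDK: SandboxRegistry, its create and destroy callbacks, and the one-hour default are all hypothetical, and a provider-side TTL is usually the simpler choice when it is available.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class SandboxRegistry:
    """Hypothetical helper: one sandbox per scope key, with idle cleanup."""

    def __init__(
        self,
        create: Callable[[str], Any],     # provisions a sandbox for a key
        destroy: Callable[[Any], None],   # shuts a sandbox down
        ttl_seconds: float = 3600.0,      # assumed default, not a provider value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._create = create
        self._destroy = destroy
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (sandbox, time of last use)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Any:
        """Return the sandbox for key, creating it on first use."""
        if key in self._entries:
            sandbox, _ = self._entries[key]
        else:
            sandbox = self._create(key)
        self._entries[key] = (sandbox, self._clock())
        return sandbox

    def sweep(self) -> List[str]:
        """Destroy sandboxes idle for at least the TTL; return their keys."""
        now = self._clock()
        expired = [
            key
            for key, (_, last_used) in self._entries.items()
            if now - last_used >= self._ttl
        ]
        for key in expired:
            sandbox, _ = self._entries.pop(key)
            self._destroy(sandbox)
        return expired
```

Calling sweep() from a periodic job keeps the number of live sandboxes bounded. Passing the clock in makes the idle logic testable without waiting.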

Python
TypeScript

agent.py

from daytona import CreateSandboxFromSnapshotParams, Daytona
from deepagents import create_deep_agent
from langchain_core.runnables import RunnableConfig
from langchain_daytona import DaytonaSandbox

client = Daytona()


async def agent(config: RunnableConfig):
    assistant_id = config["configurable"]["assistant_id"]
    # Daytona() is the synchronous client, so its methods are called without await.
    try:
        # Reuse the sandbox shared by every thread on this assistant.
        sandbox = client.find_one(labels={"assistant_id": assistant_id})
    except Exception:
        # No match: create the shared sandbox for this assistant.
        sandbox = client.create(
            CreateSandboxFromSnapshotParams(labels={"assistant_id": assistant_id})
        )
    return create_deep_agent(
        model="google_genai:gemini-3.1-pro-preview",
        backend=DaytonaSandbox(sandbox=sandbox),
    )

src/agent.ts

import { Daytona } from "@daytonaio/sdk";
import { DaytonaSandbox } from "@langchain/daytona";
import { createDeepAgent } from "deepagents";
import type { LangGraphRunnableConfig } from "@langchain/langgraph";

const client = new Daytona();

export async function agent(config: LangGraphRunnableConfig) {
  const assistantId = config.configurable?.assistant_id as string;
  let sandbox;
  try {
    sandbox = await client.findOne({ labels: { assistant_id: assistantId } });
  } catch {
    sandbox = await client.create({ labels: { assistant_id: assistantId } });
  }
  return createDeepAgent({
    model: "google_genai:gemini-3.1-pro-preview",
    backend: await DaytonaSandbox.fromId(sandbox.id),
  });
}

For manual create, execute, and teardown outside a graph factory, see Basic usage and sandbox integrations for provider-specific APIs.

Integration patterns

There are two architecture patterns for integrating agents with sandboxes, based on where the agent runs.

Agent in sandbox pattern

The agent runs inside the sandbox and you communicate with it over the network. You build a Docker or VM image with your agent framework pre-installed, run it inside the sandbox, and connect from outside to send messages. Benefits:

✅ Mirrors local development closely.
✅ Tight coupling between agent and environment.

Trade-offs:

🔴 API keys must live inside the sandbox (security risk).
🔴 Updates require rebuilding images.
🔴 Requires infrastructure for communication (WebSocket or HTTP layer).

To run an agent in a sandbox, build an image and install deepagents on it.

FROM python:3.11
RUN pip install deepagents-cli

Then run the agent inside the sandbox. To use the agent inside the sandbox you have to add additional infrastructure to handle communication between your application and the agent inside the sandbox.
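As a rough illustration of the communication layer this pattern needs, the sketch below wraps an agent call in a small HTTP endpoint using only the Python standard library. Everything here is an assumption for illustration: run_agent is a placeholder for the real agent invocation, and the JSON shape ({"message": ...} in, {"reply": ...} out), the serve helper, and port 8000 are hypothetical. A production setup would also need authentication and streaming.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def run_agent(message: str) -> str:
    """Placeholder for the real agent call inside the sandbox."""
    return f"echo: {message}"


def handle_request(body: bytes) -> Tuple[int, bytes]:
    """Turn a raw request body into an HTTP status and a JSON response."""
    try:
        message = json.loads(body)["message"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'message' field"}
        return 400, json.dumps(error).encode()
    if not isinstance(message, str):
        return 400, json.dumps({"error": "'message' must be a string"}).encode()
    return 200, json.dumps({"reply": run_agent(message)}).encode()


class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Blocks forever; call this as the sandbox's entrypoint."""
    HTTPServer(("0.0.0.0", port), AgentHandler).serve_forever()
```

Keeping handle_request separate from the HTTP handler lets the request logic be tested without opening a socket.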

Sandbox as tool pattern

The agent runs on your machine or server. When it needs to execute code, it calls sandbox tools (such as execute, read_file, or write_file) which invoke the provider’s APIs to run operations in a remote sandbox. Benefits:

✅ Update agent code instantly without rebuilding images.
✅ Cleaner separation between agent state and execution.
- API keys stay outside the sandbox.
- Sandbox failures don’t lose agent state.
- Option to run tasks in multiple sandboxes in parallel.
✅ Pay only for execution time.

Trade-offs:

🔴 Network latency on each execution call.

Example

from daytona import Daytona
from deepagents import create_deep_agent
from dotenv import load_dotenv
from langchain_daytona import DaytonaSandbox


load_dotenv()

# Can also do this with AgentCore, E2B, Runloop, Modal
sandbox = Daytona().create()
backend = DaytonaSandbox(sandbox=sandbox)

agent = create_deep_agent(
    model="google_genai:gemini-3.1-pro-preview",
    backend=backend,
    system_prompt="You are a coding assistant with sandbox access. You can create and run code in the sandbox.",
)

try:
    result = agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Create a hello world Python script and run it",
                }
            ]
        }
    )
    print(result["messages"][-1].content)
except Exception:
    # Optional: delete the sandbox proactively on an exception
    sandbox.stop()
    raise

The examples in this doc use the sandbox as a tool pattern. Choose the agent in sandbox pattern when your provider’s SDK handles the communication layer and you want production to mirror local development. Choose the sandbox as tool pattern when you need to iterate quickly on agent logic, keep API keys outside the sandbox, or prefer cleaner separation of concerns.

How sandboxes work

Isolation boundaries

All sandbox providers protect your host system from the agent’s filesystem and shell operations. The agent cannot read your local files, access environment variables on your machine, or interfere with other processes. However, sandboxes alone do not protect against:

Context injection: An attacker who controls part of the agent’s input can instruct it to run arbitrary commands inside the sandbox. The sandbox is isolated, but the agent has full control within it.
Network exfiltration: Unless network access is blocked, a context-injected agent can send data out of the sandbox over HTTP or DNS. Some providers support blocking network access (e.g., block_network=True on Modal).

See security considerations for how to handle secrets and mitigate these risks.

The `execute` method

Sandbox backends have a simple architecture: the only method a provider must implement is execute(), which runs a shell command and returns its output. Every other filesystem operation (read, write, edit, ls, glob, grep) is built on top of execute() by the BaseSandbox base class, which constructs scripts and runs them inside the sandbox via execute(). This design means:

Adding a new provider is straightforward. Implement execute()—the base class handles everything else.
The execute tool is conditionally available. On every model call, the harness checks whether the backend implements SandboxBackendProtocol. If not, the tool is filtered out and the agent never sees it.
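The layering described above can be sketched in a few lines. This is not the BaseSandbox implementation: ToySandbox, LocalShellSandbox, and ExecResult are hypothetical names, and the local subprocess only stands in for a provider API (it offers no isolation at all). The point is that a subclass supplies execute() and inherits the file helpers.

```python
import shlex
import subprocess
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class ExecResult:
    output: str     # combined stdout and stderr
    exit_code: int


class ToySandbox(ABC):
    """Hypothetical base class: file helpers are built on execute()."""

    @abstractmethod
    def execute(self, command: str) -> ExecResult:
        """Run a shell command and return its output and exit code."""

    def write(self, path: str, content: str) -> ExecResult:
        # printf with a fixed %s format writes the content verbatim.
        return self.execute(
            f"printf %s {shlex.quote(content)} > {shlex.quote(path)}"
        )

    def read(self, path: str) -> str:
        return self.execute(f"cat {shlex.quote(path)}").output

    def ls(self, path: str) -> List[str]:
        return self.execute(f"ls -1 {shlex.quote(path)}").output.splitlines()


class LocalShellSandbox(ToySandbox):
    """Runs commands locally. NOT isolated; for illustration only."""

    def __init__(self, workdir: str) -> None:
        self.workdir = workdir

    def execute(self, command: str) -> ExecResult:
        proc = subprocess.run(
            command,
            shell=True,
            cwd=self.workdir,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
        )
        return ExecResult(output=proc.stdout, exit_code=proc.returncode)


sandbox = LocalShellSandbox(tempfile.mkdtemp())
sandbox.write("hello.txt", "hi")
print(sandbox.read("hello.txt"))
```

Swapping LocalShellSandbox for a class whose execute() calls a remote provider would leave write, read, and ls unchanged.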

When the agent calls the execute tool, it provides a command string and gets back the combined stdout/stderr, exit code, and a truncation notice if the output was too large. You can also call the backend execute() method directly in your application code.

Daytona
Modal
Runloop
AgentCore
LangSmith

pip install langchain-daytona

from daytona import Daytona

from langchain_daytona import DaytonaSandbox

sandbox = Daytona().create()
backend = DaytonaSandbox(sandbox=sandbox)

result = backend.execute("python --version")
print(result.output)

import modal

from langchain_modal import ModalSandbox

app = modal.App.lookup("your-app")
modal_sandbox = modal.Sandbox.create(app=app)
backend = ModalSandbox(sandbox=modal_sandbox)

result = backend.execute("python --version")
print(result.output)

pip install langchain-runloop

from runloop_api_client import RunloopSDK

from langchain_runloop import RunloopSandbox

api_key = "..."
client = RunloopSDK(bearer_token=api_key)

devbox = client.devbox.create()
backend = RunloopSandbox(devbox=devbox)

try:
    result = backend.execute("python --version")
    print(result.output)
finally:
    devbox.shutdown()

pip install langchain-agentcore-codeinterpreter

from bedrock_agentcore.tools.code_interpreter_client import CodeInterpreter

from langchain_agentcore_codeinterpreter import AgentCoreSandbox

interpreter = CodeInterpreter(region="us-west-2")
interpreter.start()

backend = AgentCoreSandbox(interpreter=interpreter)

try:
    result = backend.execute("python3 --version")
    print(result.output)
finally:
    interpreter.stop()

from langsmith.sandbox import SandboxClient

from deepagents.backends.langsmith import LangSmithSandbox

client = SandboxClient()
ls_sandbox = client.create_sandbox(template_name="deepagents-deploy")
backend = LangSmithSandbox(sandbox=ls_sandbox)

result = backend.execute("python --version")
print(result.output)

For example:

4
[Command succeeded with exit code 0]

bash: foobar: command not found
[Command failed with exit code 127]

If a command produces very large output, the result is automatically saved to a file and the agent is instructed to use read_file to access it incrementally. This prevents context window overflow.
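The result format shown above (output followed by a bracketed status line) can be reproduced with a small formatter. This is a sketch rather than the harness's code: format_execute_result is a hypothetical name, the 10,000-character limit and the wording of the truncation notice are assumptions, and the file-offloading behavior described above is not modeled.

```python
def format_execute_result(
    output: str, exit_code: int, max_chars: int = 10_000
) -> str:
    """Render command output followed by a bracketed status line."""
    truncated = len(output) > max_chars
    body = output[:max_chars].rstrip("\n")
    status = "succeeded" if exit_code == 0 else "failed"

    lines = [body] if body else []
    if truncated:
        # Assumed wording; the real notice may differ.
        lines.append(f"[Output truncated to {max_chars} characters]")
    lines.append(f"[Command {status} with exit code {exit_code}]")
    return "\n".join(lines)


print(format_execute_result("4\n", 0))
print(format_execute_result("bash: foobar: command not found\n", 127))
```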

Two planes of file access

There are two distinct ways files move in and out of a sandbox, and it’s important to understand when to use each:

Agent filesystem tools: read_file, write_file, edit_file, ls, glob, grep, and execute are the tools the LLM calls during its execution. These go through execute() inside the sandbox. The agent uses them to read code, write files, and run commands as part of its task.

File transfer APIs: the upload_files() and download_files() methods that your application code calls. These use the provider’s native file transfer APIs (not shell commands) and are designed for moving files between your host environment and the sandbox. Use these to:

Seed the sandbox with source code, configuration, or data before the agent runs
Retrieve artifacts (generated code, build outputs, reports) after the agent finishes
Pre-populate dependencies that the agent will need

Working with files

The deepagents sandbox backends support file transfer APIs for moving files between your application and the sandbox.

Seeding the sandbox

Use upload_files() to populate the sandbox before the agent runs. Paths must be absolute and contents are bytes:

Daytona
Modal
Runloop
AgentCore
LangSmith

pip install langchain-daytona

from daytona import Daytona

from langchain_daytona import DaytonaSandbox

sandbox = Daytona().create()
backend = DaytonaSandbox(sandbox=sandbox)

backend.upload_files(
    [
        ("/src/index.py", b"print('Hello')\n"),
        ("/pyproject.toml", b"[project]\nname = 'my-app'\n"),
    ]
)

import modal

from langchain_modal import ModalSandbox

app = modal.App.lookup("your-app")
modal_sandbox = modal.Sandbox.create(app=app)
backend = ModalSandbox(sandbox=modal_sandbox)

backend.upload_files(
    [
        ("/src/index.py", b"print('Hello')\n"),
        ("/pyproject.toml", b"[project]\nname = 'my-app'\n"),
    ]
)

pip install langchain-runloop

from runloop_api_client import RunloopSDK

from langchain_runloop import RunloopSandbox

api_key = "..."
client = RunloopSDK(bearer_token=api_key)

devbox = client.devbox.create()
backend = RunloopSandbox(devbox=devbox)

backend.upload_files(
    [
        ("/src/index.py", b"print('Hello')\n"),
        ("/pyproject.toml", b"[project]\nname = 'my-app'\n"),
    ]
)

pip install langchain-agentcore-codeinterpreter

from bedrock_agentcore.tools.code_interpreter_client import CodeInterpreter

from langchain_agentcore_codeinterpreter import AgentCoreSandbox

interpreter = CodeInterpreter(region="us-west-2")
interpreter.start()

backend = AgentCoreSandbox(interpreter=interpreter)

backend.upload_files(
    [
        ("hello.py", b"print('Hello')\n"),
        ("data.csv", b"name,value\na,1\nb,2\n"),
    ]
)

from langsmith.sandbox import SandboxClient

from deepagents.backends.langsmith import LangSmithSandbox

client = SandboxClient()
ls_sandbox = client.create_sandbox(template_name="deepagents-deploy")
backend = LangSmithSandbox(sandbox=ls_sandbox)

backend.upload_files(
    [
        ("/src/index.py", b"print('Hello')\n"),
        ("/pyproject.toml", b"[project]\nname = 'my-app'\n"),
    ]
)
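When the seed files live in a local directory, the list of (path, bytes) tuples used in the examples above can be built with a small helper. This is a sketch under assumptions: collect_uploads is a hypothetical function, and the /workspace default for the remote root is illustrative rather than a provider convention.

```python
from pathlib import Path, PurePosixPath
from typing import List, Tuple


def collect_uploads(
    local_dir: str, remote_root: str = "/workspace"
) -> List[Tuple[str, bytes]]:
    """Map every file under local_dir to an absolute sandbox path and its bytes."""
    root = Path(local_dir)
    remote = PurePosixPath(remote_root)
    if not remote.is_absolute():
        raise ValueError("remote_root must be an absolute path")

    uploads: List[Tuple[str, bytes]] = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Keep the directory layout, using POSIX separators remotely.
            relative = path.relative_to(root).as_posix()
            uploads.append((str(remote / relative), path.read_bytes()))
    return uploads
```

The returned list has the same shape as the literal lists passed to upload_files() above, so it could be handed to a backend directly.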

Retrieving artifacts

Use download_files() to retrieve files from the sandbox after the agent finishes:

Daytona
Modal
Runloop
AgentCore
LangSmith

pip install langchain-daytona

from daytona import Daytona

from langchain_daytona import DaytonaSandbox

sandbox = Daytona().create()
backend = DaytonaSandbox(sandbox=sandbox)

results = backend.download_files(["/src/index.py", "/output.txt"])
for result in results:
    if result.content is not None:
        print(f"{result.path}: {result.content.decode()}")
    else:
        print(f"Failed to download {result.path}: {result.error}")

import modal

from langchain_modal import ModalSandbox

app = modal.App.lookup("your-app")
modal_sandbox = modal.Sandbox.create(app=app)
backend = ModalSandbox(sandbox=modal_sandbox)

results = backend.download_files(["/src/index.py", "/output.txt"])
for result in results:
    if result.content is not None:
        print(f"{result.path}: {result.content.decode()}")
    else:
        print(f"Failed to download {result.path}: {result.error}")

pip install langchain-runloop

from runloop_api_client import RunloopSDK

from langchain_runloop import RunloopSandbox

api_key = "..."
client = RunloopSDK(bearer_token=api_key)

devbox = client.devbox.create()
backend = RunloopSandbox(devbox=devbox)

results = backend.download_files(["/src/index.py", "/output.txt"])
for result in results:
    if result.content is not None:
        print(f"{result.path}: {result.content.decode()}")
    else:
        print(f"Failed to download {result.path}: {result.error}")

pip install langchain-agentcore-codeinterpreter

from bedrock_agentcore.tools.code_interpreter_client import CodeInterpreter

from langchain_agentcore_codeinterpreter import AgentCoreSandbox

interpreter = CodeInterpreter(region="us-west-2")
interpreter.start()

backend = AgentCoreSandbox(interpreter=interpreter)

results = backend.download_files(["hello.py"])
for result in results:
    if result.content is not None:
        print(f"{result.path}: {result.content.decode()}")
    else:
        print(f"Failed to download {result.path}: {result.error}")

interpreter.stop()

from langsmith.sandbox import SandboxClient

from deepagents.backends.langsmith import LangSmithSandbox

client = SandboxClient()
ls_sandbox = client.create_sandbox(template_name="deepagents-deploy")
backend = LangSmithSandbox(sandbox=ls_sandbox)

results = backend.download_files(["/src/index.py", "/output.txt"])
for result in results:
    if result.content is not None:
        print(f"{result.path}: {result.content.decode()}")
    else:
        print(f"Failed to download {result.path}: {result.error}")

Inside the sandbox, the agent uses filesystem tools (read_file, write_file). The upload_files and download_files methods are for your application code to move files across the boundary between your host and the sandbox.
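After downloading, application code typically writes the artifacts to local disk. The sketch below assumes result objects with the path, content, and error fields used in the examples above. DownloadResult here is a hypothetical stand-in for the backend's return type, and save_artifacts is an illustrative helper. Because sandbox output is untrusted, the helper refuses paths that would resolve outside the destination directory.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, List, Optional


@dataclass
class DownloadResult:
    """Hypothetical stand-in with the fields used in the examples above."""

    path: str
    content: Optional[bytes] = None
    error: Optional[str] = None


def save_artifacts(results: Iterable[DownloadResult], dest_dir: str) -> List[str]:
    """Write successful downloads under dest_dir; return the paths that were skipped."""
    dest = Path(dest_dir).resolve()
    skipped: List[str] = []
    for result in results:
        if result.content is None:
            skipped.append(result.path)  # download failed
            continue
        target = (dest / result.path.lstrip("/")).resolve()
        if dest not in target.parents:
            skipped.append(result.path)  # path would escape dest_dir
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(result.content)
    return skipped
```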

Security considerations

Sandboxes isolate code execution from your host system, but they don’t protect against context injection. An attacker who controls part of the agent’s input can instruct it to read files, run commands, or exfiltrate data from within the sandbox. This makes credentials inside the sandbox especially dangerous.

Never put secrets inside a sandbox. API keys, tokens, database credentials, and other secrets injected into a sandbox (via environment variables, mounted files, or the secrets option) can be read and exfiltrated by a context-injected agent. This applies even to short-lived or scoped credentials—if an agent can access them, so can an attacker.

Handling secrets safely

If your agent needs to call authenticated APIs or access protected resources, you have two options:

Keep secrets in tools outside the sandbox. Define tools that run in your host environment (not inside the sandbox) and handle authentication there. The agent calls these tools by name, but never sees the credentials. This is the recommended approach.
Use a network proxy that injects credentials. Some sandbox providers support proxies that intercept outgoing HTTP requests from the sandbox and attach credentials (e.g., Authorization headers) before forwarding them. The agent never sees the secret—it just makes plain requests to a URL. This approach is not yet widely available across providers.
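The first option can be sketched as a function that runs in the host process, reads the credential there, and returns only the API result. The names below are hypothetical: make_host_tool, fake_orders_api, and the ORDERS_API_TOKEN variable are placeholders for illustration, and registering the function as an agent tool is left out.

```python
import os
from typing import Callable, Dict


def fake_orders_api(order_id: str, token: str) -> Dict[str, str]:
    """Stand-in for an authenticated HTTP call made from the host."""
    if not token:
        raise PermissionError("missing credential")
    return {"order_id": order_id, "status": "shipped"}


def make_host_tool(
    api_call: Callable[[str, str], Dict[str, str]], env_var: str
) -> Callable[[str], Dict[str, str]]:
    """Build a tool that attaches the credential on the host side."""

    def lookup_order(order_id: str) -> Dict[str, str]:
        """Look up an order by ID."""
        # The token is read here, in the host process. It is never written
        # to the sandbox and never appears in the tool's arguments or result.
        token = os.environ[env_var]
        return api_call(order_id, token)

    return lookup_order


os.environ["ORDERS_API_TOKEN"] = "demo-token"  # demo only; set this on the host
lookup_order = make_host_tool(fake_orders_api, "ORDERS_API_TOKEN")
print(lookup_order("A-100"))
```

The agent only ever supplies order_id and sees the returned dictionary, so a context-injected prompt has no credential to read or exfiltrate.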

If you must inject secrets into a sandbox (not recommended), take these precautions:

Enable human-in-the-loop approval for all tool calls, not just sensitive ones
Block or restrict network access from the sandbox to limit exfiltration paths
Use the narrowest possible credential scope and shortest possible lifetime
Monitor sandbox network traffic for unexpected outbound requests

Even with these safeguards, this remains an unsafe workaround. A sufficiently creative context injection attack can bypass output filtering and HITL review.

General best practices

Review sandbox outputs before acting on them in your application
Block sandbox network access when not needed
Use middleware to filter or redact sensitive patterns in tool outputs
Treat everything produced inside the sandbox as untrusted input
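The redaction practice above can be sketched as a function applied to tool output before it reaches the model or your logs. The patterns are illustrative assumptions, not an exhaustive list, and redact is a hypothetical helper. Wiring it into an agent's middleware is not shown.

```python
import re

# Illustrative patterns only; extend these for the secrets you actually handle.
_PATTERNS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY_ID]"),
    # HTTP bearer tokens.
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer [REDACTED]"),
    # key=value or key: value assignments for common secret names.
    (
        re.compile(r"(?i)\b(api[_-]?key|token|password)(\s*[=:]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
]


def redact(text: str) -> str:
    """Replace substrings that look like credentials with placeholders."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Pattern-based redaction reduces accidental leaks but is easy to evade (for example, by encoding the secret), so it complements the other practices rather than replacing them.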
