UnstructuredXMLLoader integration - Docs by LangChain

This guide provides a quick overview for getting started with the UnstructuredXMLLoader document loader, which loads .xml files. The page content of each loaded document is the text extracted from the XML tags.

Overview

Integration details

Class	Package	Local	Serializable	JS support
`UnstructuredXMLLoader`	`langchain_community`	✅	❌	✅

Loader features

Source	Document Lazy Loading	Native Async Support
`UnstructuredXMLLoader`	✅	❌

Setup

To access the UnstructuredXMLLoader document loader, you'll need to install the langchain-community integration package.

Credentials

No credentials are needed to use the UnstructuredXMLLoader. To enable automated tracing of your model calls, set your LangSmith API key:

import getpass
import os
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Installation

Install langchain_community.

pip install -qU langchain_community

Initialization

Now we can instantiate our loader object and load documents:

from langchain_community.document_loaders import UnstructuredXMLLoader

loader = UnstructuredXMLLoader(
    "./example_data/factbook.xml",
)

Load

docs = loader.load()
docs[0]

Document(metadata={'source': './example_data/factbook.xml'}, page_content='United States\n\nWashington, DC\n\nJoe Biden\n\nBaseball\n\nCanada\n\nOttawa\n\nJustin Trudeau\n\nHockey\n\nFrance\n\nParis\n\nEmmanuel Macron\n\nSoccer\n\nTrinidad & Tobado\n\nPort of Spain\n\nKeith Rowley\n\nTrack & Field')

print(docs[0].metadata)

{'source': './example_data/factbook.xml'}
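
To build intuition for the page_content shown above, here is a minimal sketch that approximates "text extracted from the XML tags" using only the Python standard library. It is not the loader's implementation, which delegates parsing to the unstructured package, and its output may differ from the loader's in details. The sample XML and the extract_text helper are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, loosely modeled on the factbook output above.
SAMPLE_XML = """<factbook>
  <country>
    <name>Canada</name>
    <capital>Ottawa</capital>
    <sport>Hockey</sport>
  </country>
  <country>
    <name>France</name>
    <capital>Paris</capital>
    <sport>Soccer</sport>
  </country>
</factbook>"""


def extract_text(xml_string: str) -> str:
    """Join the non-empty text of every element, separated by blank lines."""
    root = ET.fromstring(xml_string)
    # itertext() walks the tree in document order; whitespace-only chunks
    # (indentation between tags) are dropped.
    pieces = [text.strip() for text in root.itertext() if text.strip()]
    return "\n\n".join(pieces)


page_content = extract_text(SAMPLE_XML)
print(repr(page_content))
```

Tag names and attributes are discarded in this sketch; only the text between tags survives, which mirrors the shape of the page_content in the example output.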

Lazy load

page = []
for doc in loader.lazy_load():
    page.append(doc)
    if len(page) >= 10:
        # do some paged operation, e.g.
        # index.upsert(page)

        page = []
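
The loop above only acts on full pages of 10, so any documents left over when the iterator ends are never processed. One way to handle that is a small batching helper, sketched below under the assumption that you want every document to reach the paged operation. The in_pages name is hypothetical, and a plain list stands in for loader.lazy_load() so the example runs without any loader installed.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_pages(items: Iterable[T], page_size: int) -> Iterator[List[T]]:
    """Yield lists of up to page_size items, including the final partial page."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice consumes at most page_size items without loading the rest.
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page


# Stand-in for loader.lazy_load(); any iterable of documents works the same way.
fake_docs = [f"doc-{i}" for i in range(23)]
page_sizes = [len(page) for page in in_pages(fake_docs, 10)]
print(page_sizes)
```

With a real loader, the same helper could wrap loader.lazy_load() directly, for example `for page in in_pages(loader.lazy_load(), 10): ...`, keeping at most one page of documents in memory at a time.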

API reference

For detailed documentation of all UnstructuredXMLLoader features and configurations, head to the API reference.
