AttributeError: 'Document' object has no attribute 'get_doc_id'


My application: load CSV files into a knowledge graph (using KnowledgeGraphIndex) and use an LLM (HuggingFaceH4/zephyr-7b-beta) to retrieve answers from the graph store (SimpleGraphStore).

My problem: I want to pass multiple CSV files into the knowledge graph. I am using CSVLoader, and when I run KnowledgeGraphIndex I get this error: `AttributeError: 'Document' object has no attribute 'get_doc_id'`

This is how I load the CSV:

```python
from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter

csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()

splitter = CharacterTextSplitter(separator="\n",
                                 chunk_size=500,
                                 chunk_overlap=0,
                                 length_function=len)
documents = splitter.split_documents(data)
```
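Since the goal is to feed several CSV files into the graph, here is a minimal, library-free sketch of gathering rows from more than one file. The helper names `load_csv_rows` and `find_csv_files` are hypothetical, and the "column: value" text layout is only an approximation of what a row-per-document loader produces; it is not the CSVLoader implementation.

```python
import csv
from pathlib import Path


def load_csv_rows(paths):
    """Return one record per CSV row, tagged with its source file and row index."""
    records = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row_number, row in enumerate(csv.DictReader(handle)):
                # Render the row as "column: value" lines, one line per column.
                text = "\n".join(f"{column}: {value}" for column, value in row.items())
                records.append(
                    {"text": text, "metadata": {"source": str(path), "row": row_number}}
                )
    return records


def find_csv_files(folder):
    """List the CSV files directly inside `folder`, in sorted order."""
    return sorted(Path(folder).glob("*.csv"))
```

Something like `load_csv_rows(find_csv_files("/content"))` would then collect the rows of every CSV file in that folder into one list, which can be turned into documents in a single pass.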

This is my knowledge graph index:

```python
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    include_embeddings=True,
    max_triplets_per_chunk=2,
    embed_model=embed_model,
)
```
python csv large-language-model knowledge-graph graph-store-protocol
1 Answer

Try using `Document` from `langchain.docstore`, and assign a unique ID to `doc_id`:

```python
from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document

csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()

splitter = CharacterTextSplitter(separator="\n",
                                 chunk_size=500,
                                 chunk_overlap=0,
                                 length_function=len)
documents = splitter.split_documents(data)

# Assign a unique identifier to each document
for i, doc in enumerate(documents):
    doc.metadata["doc_id"] = f"doc_{i}"

index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    include_embeddings=True,
    max_triplets_per_chunk=2,
    embed_model=embed_model,
)
```
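One caveat worth checking: the error message complains about a missing `get_doc_id` attribute, and storing a key in `metadata` does not add a method to the object. The sketch below illustrates this with two hypothetical stand-in classes (`LoaderDocument` and `IndexDocument` are not the real LangChain or LlamaIndex types) and a hypothetical `to_index_documents` adapter. With the real libraries, the analogous step would be converting each chunk to the index library's own document class; LlamaIndex's `Document.from_langchain_format` is, as far as I know, meant for that, but treat that as an assumption to verify against your installed version.

```python
from dataclasses import dataclass, field


@dataclass
class LoaderDocument:
    """Hypothetical stand-in for a loader-side document: text plus metadata only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


@dataclass
class IndexDocument:
    """Hypothetical stand-in for the document type an index expects."""
    text: str
    doc_id: str
    metadata: dict = field(default_factory=dict)

    def get_doc_id(self) -> str:
        return self.doc_id


def to_index_documents(loader_docs, prefix="doc"):
    """Convert loader-side documents, giving each one a unique ID."""
    return [
        IndexDocument(text=doc.page_content, doc_id=f"{prefix}_{i}", metadata=dict(doc.metadata))
        for i, doc in enumerate(loader_docs)
    ]


chunk = LoaderDocument("city: Paris")
chunk.metadata["doc_id"] = "doc_0"
# A metadata key is not a method, so the attribute is still missing.
print(hasattr(chunk, "get_doc_id"))  # prints False

converted = to_index_documents([chunk])
print(converted[0].get_doc_id())  # prints doc_0
```

If the error persists after setting the metadata key, a conversion step along these lines is probably what is missing.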