I get an error when using YoutubeLoader.from_youtube_url

Question  Votes: 0  Answers: 2

Here is my code snippet:

import os,openai

from langchain.document_loaders import YoutubeLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import ChatVectorDBChain,ConversationalRetrievalChain

from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate
)
os.environ["OPENAI_API_KEY"] = "apikey"
loader = YoutubeLoader.from_youtube_url(youtube_url="https://www.youtube.com/watch?v=7OPg-ksxZ4Y",add_video_info=True)
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 300,
    chunk_overlap = 20
)

documents = text_splitter.split_documents(documents)
#print(documents)

embeddings = OpenAIEmbeddings()
vector_store = Chroma.from_documents(documents=documents,embedding=embeddings)
retriever = vector_store.as_retriever()

system_template = """
Use the following context to answer the user's question.
If you don't know the answer, say you don't, don't try to make it up. And answer in Chinese.
-----------
{context}
-----------
{chat_history}
"""

messages  =[
    SystemMessagePromptTemplate.from_template(system_template),
    HumanMessagePromptTemplate.from_template('{question}')
]

prompt = ChatPromptTemplate.from_messages(messages)

qa = ConversationalRetrievalChain.from_llm(ChatOpenAI(temperature=0.1,max_tokens=2048),retriever,qa_prompt=prompt)
chat_history = []
while True:
    question = input('Question: ')
    result = qa({'question':question,'chat_history':chat_history})
    chat_history.append((question,result['answer']))
    print(result['answer'])

And here are the error details:

PS C:\Users\12875\Desktop\新建文件夹> & E:/Program/python/python.exe c:/Users/12875/Desktop/新建文件夹/分析youtube视频.py
Using embedded DuckDB without persistence: data will be transient
Traceback (most recent call last):
  File "c:\Users\12875\Desktop\新建文件夹\分析youtube视频.py", line 30, in <module>
    vector_store = Chroma.from_documents(documents=documents,embedding=embeddings)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Program\python\Lib\site-packages\langchain\vectorstores\chroma.py", line 412, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "E:\Program\python\Lib\site-packages\langchain\vectorstores\chroma.py", line 380, in from_texts
    chroma_collection.add_texts(texts=texts, metadatas=metadatas, ids=ids)
  File "E:\Program\python\Lib\site-packages\langchain\vectorstores\chroma.py", line 159, in add_texts 
    self._collection.add(
  File "E:\Program\python\Lib\site-packages\chromadb\api\models\Collection.py", line 84, in add       
    metadatas = validate_metadatas(maybe_cast_one_to_many(metadatas)) if metadatas else None
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Program\python\Lib\site-packages\chromadb\api\types.py", line 107, in validate_metadatas   
    validate_metadata(metadata)
  File "E:\Program\python\Lib\site-packages\chromadb\api\types.py", line 98, in validate_metadata     
    raise ValueError(f"Expected metadata value to be a str, int, or float, got {value}")
ValueError: Expected metadata value to be a str, int, or float, got None

The YoutubeLoader class recently renamed one of its methods from from_youtube_channel to from_youtube_url. But when I use from_youtube_url, I get the error "ValueError: Expected metadata value to be a str, int, or float, got None". What should I do to fix this? Thanks!

python youtube-api openai-api gpt-3 langchain
2 Answers
0
votes

Change:

loader = YoutubeLoader.from_youtube_url(youtube_url="https://www.youtube.com/watch?v=7OPg-ksxZ4Y",add_video_info=True)

to:

loader = YoutubeLoader.from_youtube_url(youtube_url="https://www.youtube.com/watch?v=7OPg-ksxZ4Y",add_video_info=False)
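
The traceback shows Chroma rejecting a metadata value of None. One plausible cause, which this thread does not confirm, is that add_video_info=True adds video fields that can come back empty for some videos. If you would rather keep the video info, a possible workaround is to drop None values from each document's metadata before calling Chroma.from_documents. The sketch below is only an illustration: drop_none_metadata is a hypothetical helper, and Doc is a stand-in class so the example runs without LangChain. Real LangChain Document objects also expose a .metadata dict, so the same loop should apply to them.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (hypothetical, for illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def drop_none_metadata(docs):
    """Remove metadata entries whose value is None, in place, and return docs."""
    for doc in docs:
        doc.metadata = {k: v for k, v in doc.metadata.items() if v is not None}
    return docs


# Made-up sample data: one field is None, as in the reported error.
docs = [Doc("hello", {"source": "7OPg-ksxZ4Y", "title": "demo", "publish_date": None})]
cleaned = drop_none_metadata(docs)
print(cleaned[0].metadata)
```

In the question's script, this would be called on documents right after text_splitter.split_documents(documents) and before Chroma.from_documents(...).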

0
votes

I tried changing "add_video_info=True" to "add_video_info=False", but it did not work. So I set the option back to True, and it worked once.

But then it stopped working again. Interesting!
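
Since the behaviour described here is intermittent, it may help to check which metadata keys actually hold unsupported values before building the vector store. This is a minimal sketch, assuming the only accepted types are the ones named in the error message (str, int, float). find_bad_metadata is a hypothetical helper, and plain dicts with made-up values stand in for each document's metadata.

```python
# Types named in the error message; treating this as the full list is an assumption.
ALLOWED_TYPES = (str, int, float)


def find_bad_metadata(metadatas):
    """Return (index, key, value) for every value of an unsupported type."""
    bad = []
    for i, metadata in enumerate(metadatas):
        for key, value in metadata.items():
            if not isinstance(value, ALLOWED_TYPES):
                bad.append((i, key, value))
    return bad


# Made-up sample metadata for two chunks of the same video.
sample = [
    {"source": "7OPg-ksxZ4Y", "view_count": 123},
    {"source": "7OPg-ksxZ4Y", "publish_date": None},
]
problems = find_bad_metadata(sample)
for index, key, value in problems:
    print(f"document {index}: {key!r} has unsupported value {value!r}")
```

With the question's script, the input would be [d.metadata for d in documents]; an empty result would suggest the metadata is not what Chroma is complaining about on that run.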
