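I want to extract key terms from lease agreements.
To do this, I want to send the PDF of the contract to an AI service that must return some key terms in JSON format.
Which libraries and companies can do this? So far, I have explored the OpenAI API, but it is not as straightforward as I had imagined.
The ChatGPT interface handles this quite well, so I assumed that using the API would be equally simple.
It seems that I need to read the PDF text first and then send that text to the OpenAI API.
Any other ideas for achieving this would be appreciated.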
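What you want to use is the Assistants API.
As of today, there are 3 tools available: Code Interpreter, Knowledge Retrieval, and Function calling.
You need to use the Knowledge Retrieval tool. As stated in the official OpenAI documentation: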
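"Retrieval augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. Once a file is uploaded and passed to the Assistant, OpenAI will automatically chunk your documents, index and store the embeddings, and implement vector search to retrieve relevant content to answer user queries."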
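To make the chunk, index, and search steps in that quote more concrete, here is a toy sketch in plain Python. It is not how OpenAI implements retrieval (those internals are not described here); the bag-of-words "embedding" is a stand-in for a real embedding model, and the names `chunk_text`, `embed`, `cosine_similarity`, and `retrieve` are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Index the chunks, then return the top_k chunks most similar to the query."""
    index = [(chunk, embed(chunk)) for chunk in chunks]
    query_vec = embed(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


lease_clauses = [
    "The monthly rent is 1200 euros.",
    "The security deposit equals two months of rent.",
    "Pets are not allowed.",
]
print(retrieve("How much is the security deposit?", lease_clauses))
```

With the Retrieval tool, all of this happens on OpenAI's side; you only upload the file and ask questions.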
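I have built a customer support chatbot in the past. Take it as an example. In your case, you want the assistant to use your PDF file (I used the knowledge.txt file). Take a look at my GitHub and YouTube.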
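Since the question asks for JSON output, the lease version would need instructions that request JSON and a step that parses the assistant's reply. A minimal sketch, assuming hypothetical field names (`landlord`, `tenant`, `monthly_rent`, `start_date`, `end_date`) and a hypothetical helper `parse_lease_terms`. Because a model may surround the JSON with extra prose, the helper takes the span from the first `{` to the last `}` before parsing.

```python
import json

# Hypothetical field names, chosen only for illustration.
LEASE_FIELDS = ["landlord", "tenant", "monthly_rent", "start_date", "end_date"]

LEASE_INSTRUCTIONS = (
    "You extract key terms from the attached lease agreement. "
    "Reply with a single JSON object containing exactly these keys: "
    + ", ".join(LEASE_FIELDS)
    + ". Use null for any term the document does not state."
)


def parse_lease_terms(reply_text: str) -> dict:
    """Parse the JSON object embedded in a model reply and check its keys."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both malformed and incomplete replies.
    terms = json.loads(reply_text[start:end + 1])
    missing = [field for field in LEASE_FIELDS if field not in terms]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return terms
```

In the script that follows, `LEASE_INSTRUCTIONS` would take the place of the customer support `instructions` string, and `parse_lease_terms` would be applied to the assistant's reply text.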
customer_support_chatbot.py
import os
import time

from openai import OpenAI

# Note: this targets the Assistants API (beta) as it existed when written.
# Later versions renamed the "retrieval" tool to "file_search" and replaced
# the message-level file_ids parameter with attachments.

# The client also reads OPENAI_API_KEY from the environment by default;
# passing it explicitly makes the dependency visible.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Step 1: Upload a File with an "assistants" purpose
with open("knowledge.txt", "rb") as knowledge_file:
    my_file = client.files.create(
        file=knowledge_file,
        purpose="assistants",
    )
print(f"This is the file object: {my_file} \n")

# Step 2: Create an Assistant
my_assistant = client.beta.assistants.create(
    model="gpt-3.5-turbo-1106",
    instructions="You are a customer support chatbot. Use your knowledge base to best respond to customer queries.",
    name="Customer Support Chatbot",
    tools=[{"type": "retrieval"}],
)
print(f"This is the assistant object: {my_assistant} \n")

# Step 3: Create a Thread
my_thread = client.beta.threads.create()
print(f"This is the thread object: {my_thread} \n")

# Step 4: Add a Message to a Thread
my_thread_message = client.beta.threads.messages.create(
    thread_id=my_thread.id,
    role="user",
    content="What can I buy in your online store?",
    file_ids=[my_file.id],
)
print(f"This is the message object: {my_thread_message} \n")

# Step 5: Run the Assistant
my_run = client.beta.threads.runs.create(
    thread_id=my_thread.id,
    assistant_id=my_assistant.id,
    instructions="Please address the user as Rok Benko.",
)
print(f"This is the run object: {my_run} \n")

# Step 6: Periodically retrieve the Run to check whether it has completed
while True:
    keep_retrieving_run = client.beta.threads.runs.retrieve(
        thread_id=my_thread.id,
        run_id=my_run.id,
    )
    print(f"Run status: {keep_retrieving_run.status}")

    if keep_retrieving_run.status == "completed":
        print("\n")

        # Step 7: Retrieve the Messages added by the Assistant to the Thread
        all_messages = client.beta.threads.messages.list(
            thread_id=my_thread.id
        )

        print("------------------------------------------------------------ \n")
        print(f"User: {my_thread_message.content[0].text.value}")
        print(f"Assistant: {all_messages.data[0].content[0].text.value}")
        break
    elif keep_retrieving_run.status in ("queued", "in_progress"):
        time.sleep(1)  # wait briefly before polling again
    else:
        # Any other status (for example failed or expired) ends the loop
        break
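The polling in Step 6 can also be factored into a small reusable helper with a timeout, so a run that never finishes does not block the script forever. This is only a sketch: `wait_for_run`, `fetch_status`, and `PENDING_STATUSES` are hypothetical names, and the pending states are simply the two checked in the script above.

```python
import time
from typing import Callable

# The two states the script above treats as "still working".
PENDING_STATUSES = ("queued", "in_progress")


def wait_for_run(
    fetch_status: Callable[[], str],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it leaves the pending states or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in PENDING_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout} seconds")
        sleep(poll_interval)


# Possible use with the objects from the script above (not executed here):
# final_status = wait_for_run(
#     lambda: client.beta.threads.runs.retrieve(
#         thread_id=my_thread.id,
#         run_id=my_run.id,
#     ).status
# )
```

Passing the status lookup in as a function keeps the helper independent of the OpenAI client, which also makes it easy to try out with a fake status source.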