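I'm not sure how to estimate the total cost of using OpenAI in the RAG pipeline I'm building. I'd like to track token usage and the associated cost up front. Here are some code snippets: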
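# Import paths assume a recent split-package LangChain install;
# older releases expose the same names under langchain.*
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.callbacks import get_openai_callback
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

model_name = 'text-embedding-ada-002'
embeddings = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=openai_api_key
)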
def create_and_load_faiss_index(chunks, embeddings, index_path):
    try:
        # Create a FAISS index from documents (the chunks are embedded here)
        db = FAISS.from_documents(chunks, embeddings)
        # Save the FAISS index locally
        db.save_local(index_path)
        # Load the FAISS index from the saved location
        # (recent LangChain versions may also require allow_dangerous_deserialization=True)
        db = FAISS.load_local(index_path, embeddings)
        return db
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return None

db = create_and_load_faiss_index(chunks, embeddings, index_path)
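The embedding requests made while building the index are billed per input token, so one way to estimate that part up front is to count the tokens in the chunks before calling `FAISS.from_documents`. The following is a minimal sketch, not part of the original pipeline: `estimate_embedding_cost`, `make_tiktoken_counter`, `rough_counter`, and `EXAMPLE_RATE` are hypothetical names, the rate is a placeholder rather than a real price, and the tiktoken-backed counter assumes the `tiktoken` package is installed.

```python
from typing import Callable, Iterable, Tuple


def estimate_embedding_cost(
    texts: Iterable[str],
    count_tokens: Callable[[str], int],
    rate_per_1k_tokens: float,
) -> Tuple[int, float]:
    """Return (total_tokens, estimated_usd) for embedding every text once."""
    total_tokens = sum(count_tokens(text) for text in texts)
    return total_tokens, total_tokens * rate_per_1k_tokens / 1000


def make_tiktoken_counter(model_name: str) -> Callable[[str], int]:
    """Build an exact token counter backed by tiktoken (imported lazily)."""
    import tiktoken  # assumption: tiktoken is installed

    encoding = tiktoken.encoding_for_model(model_name)
    return lambda text: len(encoding.encode(text))


def rough_counter(text: str) -> int:
    """Crude fallback: assume about 4 characters per token (an approximation)."""
    return max(1, len(text) // 4)


# Placeholder rate in USD per 1K tokens; look up the current published price.
EXAMPLE_RATE = 0.0001

sample_texts = ["first chunk of text", "second, slightly longer chunk of text"]
tokens, usd = estimate_embedding_cost(sample_texts, rough_counter, EXAMPLE_RATE)
print(tokens, usd)
```

With the pipeline above, the texts would probably be `(c.page_content for c in chunks)` and the counter `make_tiktoken_counter(model_name)`. Passing the counter in as a function keeps the arithmetic testable without network access or extra dependencies.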
retriever = db.as_retriever()
template = """…"""
prompt_template = ChatPromptTemplate.from_template(template=template)
print(prompt_template)
llm = ChatOpenAI(model_name="gpt-4", temperature=0)
rag_chain = (
{"context": retriever, "question": RunnablePassthrough()}
| prompt_template
| llm
| StrOutputParser()
)
query = f"…"
# Invoke the chain only inside the callback context: a call made outside it
# is still billed but is not counted in cb
with get_openai_callback() as cb:
    openai_output = rag_chain.invoke(query)
print(cb)
Tokens Used: 37
Prompt Tokens: 4
Completion Tokens: 33
Successful Requests: 1
Total Cost (USD): $7.2e-05
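The total in that output is plain per-token arithmetic: prompt tokens times the prompt rate plus completion tokens times the completion rate. A small sketch of the calculation follows. `chat_cost_usd` and the two rate constants are hypothetical; the rates are placeholders picked only because they reproduce the total printed above for 4 prompt and 33 completion tokens, and they should be replaced with the current published prices for whichever model is actually called.

```python
def chat_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_rate_per_1k: float,
    completion_rate_per_1k: float,
) -> float:
    """Return the USD cost of one chat call given per-1K-token rates."""
    return (
        prompt_tokens * prompt_rate_per_1k
        + completion_tokens * completion_rate_per_1k
    ) / 1000


# Placeholder rates in USD per 1K tokens (assumptions, not authoritative prices).
EXAMPLE_PROMPT_RATE = 0.0015
EXAMPLE_COMPLETION_RATE = 0.002

cost = chat_cost_usd(4, 33, EXAMPLE_PROMPT_RATE, EXAMPLE_COMPLETION_RATE)
print(f"{cost:.6f}")  # prints 0.000072
```

For an estimate before any request is sent, the same function can be fed a token count of the fully rendered prompt (retrieved context included) and an upper bound on the completion length.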
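In my case, the results were different.
I couldn't reduce the prompt tokens (on the contrary, the prompt tokens increased slightly, and the response time went up too).
But the embedded prompt returned better answers.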
Non-embedded prompt
"prompt_tokens": 3295,
"completion_tokens": 347,
"openai_process_time": 4.253575,
Embedded prompt (better answer)
"prompt_tokens": 3602,
"completion_tokens": 686,
"openai_process_time": 8.553565,
Non-embedded prompt
"prompt_tokens": 3355,
"completion_tokens": 347,
"openai_process_time": 4.67733,
Embedded prompt (better answer)
"prompt_tokens": 3669,
"completion_tokens": 583,
"openai_process_time": 7.52354,