Creating a vector store for multiple labels using an existing graph


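I'm trying to create a vector store on top of my existing knowledge graph using from_existing_graph (following the Neo4j blog posts by Tomaz and Saurav Joshi). This method lets me create embeddings and a vector index for a single label only, and I assume that is why I'm not getting the results I want when asking natural-language questions.

The code below can answer Oliver's age and location, but not which films he directed. I believe this is because from_existing_graph accepts only a single label and its corresponding properties as options for generating the embeddings and the vector index. Any thoughts on how to achieve this?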

import os

from langchain_community.vectorstores import Neo4jVector
from langchain_community.graphs import Neo4jGraph
from langchain_openai import OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = "sk-xx"
url = "neo4j+s://xxxx.databases.neo4j.io"
username = "neo4j"
password = "mypassword"
existing_graph = Neo4jVector.from_existing_graph(
    embedding=OpenAIEmbeddings(),
    url=url,
    username=username,
    password=password,
    index_name="person",
    node_label="Person",
    text_node_properties=["name", "age", "location"],
    embedding_node_property="embedding",
)

from langchain_openai import ChatOpenAI
from langchain.chains import GraphCypherQAChain

graph = Neo4jGraph(
    url=url, username=username, password=password
)

chain = GraphCypherQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)

query = "Where does Oliver Stone live?"
#query = "Name some films directed by Oliver Stone?" 

graph_result = chain.invoke(query)

vector_results = existing_graph.similarity_search(query, k=1)
for i, res in enumerate(vector_results):
    print(res.page_content)
    if i != len(vector_results)-1:
        print()
vector_result = vector_results[0].page_content

# Construct prompt for OpenAI
final_prompt = f"""You are a helpful question-answering agent. Your task is to analyze
and synthesize information from two sources: the top result from a similarity search
(unstructured information) and relevant data from a graph database (structured information).
Given the user's query: {query}, provide a meaningful and efficient answer based
on the insights derived from the following data:

Unstructured information: {vector_result}.
Structured information: {graph_result} """


from openai import OpenAI
client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": final_prompt}],
    model="gpt-3.5-turbo",
)

answer = chat_completion.choices[0].message.content.strip()
print(answer)

Any help would be appreciated.

Here is my schema:

Node properties are the following:
Person {name: STRING, embedding: LIST, age: INTEGER, location: STRING}
Actor {name: STRING, embedding: LIST}
Movie {title: STRING}
Director {name: STRING, embedding: LIST, age: INTEGER, location: STRING}
Relationship properties are the following:
ACTED_IN {role: STRING}
The relationships are the following:
(:Person)-[:ACTED_IN]->(:Movie)
(:Person)-[:DIRECTED]->(:Movie)
(:Actor)-[:ACTED_IN]->(:Movie)
(:Director)-[:DIRECTED]->(:Movie)

Cypher used to create the graph:

CREATE (charlie:Person:Actor {name: 'Charlie Sheen'})-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet:Movie {title: 'Wall Street'})<-[:DIRECTED]-(oliver:Person:Director {name: 'Oliver Stone'});
MATCH (n:Person {name: 'Oliver Stone'}) SET n.age = 30, n.location = "New York" RETURN n
neo4j openai-api langchain large-language-model
1 Answer

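Below is the updated code. The movies he directed are not part of the embedding, so the :DIRECTED relationship has to be brought in when querying the person_index index. The retrieval_query does this: it matches the movies he directed and adds them to the metadata of the result node. Then, when building your vector result, you read the title from metadata["movies"].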

If the graph contains more than one movie for him, you may need to collect all of the movie titles.
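Collecting every title rather than only the first could look like the sketch below. It assumes the metadata shape produced by a retrieval query that returns a `movies` list of maps with a `title` key; `describe_person` is a hypothetical helper name, and the second title in the usage line is added for illustration only (it is not in the question's graph).

```python
def describe_person(page_content, metadata):
    """Turn one vector hit into a sentence that lists every directed movie."""
    # `movies` is assumed to be a list of dicts; it may be missing or empty.
    movies = metadata.get("movies") or []
    titles = [m["title"] for m in movies if m.get("title")]

    facts = []
    if metadata.get("location"):
        facts.append(f"lives in {metadata['location']}")
    if titles:
        facts.append("directed " + ", ".join(titles))

    # Fall back to the bare name when no extra facts are available.
    return page_content if not facts else page_content + " " + " and ".join(facts)


# Usage with hand-written metadata (the second title is illustrative only).
print(describe_person(
    "Oliver Stone",
    {"location": "New York", "movies": [{"title": "Wall Street"}, {"title": "Platoon"}]},
))
```

Because a missing or empty `movies` list is handled, this avoids the IndexError that indexing `movies[0]` would raise for a person with no collected movies.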

Reference: https://github.com/tomasonjo/blogs/blob/master/llm/neo4jvector_langchain_deepdive.ipynb
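A retrieval query for Neo4jVector runs after the vector index lookup, with `node` and `score` already bound, and it is expected to return three columns: `text`, `score`, and `metadata`. The sketch below builds such a query for an arbitrary relationship type. `build_retrieval_query` and its parameters are hypothetical names, not part of LangChain. It also uses OPTIONAL MATCH instead of MATCH, which is a deliberate change: with a plain MATCH, a person with no matching relationship (Charlie Sheen has no :DIRECTED edge) would be dropped from the results.

```python
import re

# Labels and relationship types cannot be passed as Cypher parameters,
# so restrict them to plain identifiers before interpolating.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_retrieval_query(rel_type, target_label, key, text_property="name"):
    """Return a retrieval query that collects related nodes into metadata[key]."""
    for ident in (rel_type, target_label, key, text_property):
        if not _IDENTIFIER.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    return (
        f"OPTIONAL MATCH (node)-[:{rel_type}]->(t:{target_label})\n"
        f"WITH node, score, collect(t) AS {key}\n"
        f"RETURN node.{text_property} AS text, score, "
        f"node{{.*, embedding: null, {key}: {key}}} AS metadata"
    )


print(build_retrieval_query("DIRECTED", "Movie", "movies"))
```

The same helper could produce a query for :ACTED_IN, for example `build_retrieval_query("ACTED_IN", "Movie", "acted_in")`. Since `collect` skips nulls, a node without the relationship gets an empty list in its metadata.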

import os

from langchain_community.vectorstores import Neo4jVector
from langchain_openai import OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = "sk-<key>"
url = "bolt://localhost:7687"
username = "neo4j"
password = "awesome_password"

retrieval_query = """
       MATCH (node)-[:DIRECTED]->(m:Movie)
       WITH node, score, collect(m) as movies
       RETURN node.name as text, score, node{.*, embedding: Null, movies: movies} as metadata
       """

existing_index_return = Neo4jVector.from_existing_index(
    embedding=OpenAIEmbeddings(),
    url=url,
    username=username,
    password=password,
    database="neo4j",
    index_name="person_index",  # must name an existing vector index (the question's code called it "person")
    text_node_property="name",
    retrieval_query=retrieval_query,
)

from langchain_openai import ChatOpenAI
from langchain.chains import GraphCypherQAChain
from langchain_community.graphs import Neo4jGraph

graph = Neo4jGraph(
    url=url, username=username, password=password
)

chain = GraphCypherQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)

#query = "Where does Oliver Stone live?"
query = "Name some films directed by Oliver Stone?" 

graph_result = chain.invoke(query)

vector_results = existing_index_return.similarity_search(query, k=1)
top_hit = vector_results[0]
vector_result = (
    f"{top_hit.page_content} lives in {top_hit.metadata['location']} "
    f"and he directed the movie {top_hit.metadata['movies'][0]['title']}"
)

# Construct prompt for OpenAI
final_prompt = f"""You are a helpful question-answering agent. Your task is to analyze
and synthesize information from two sources: the top result from a similarity search
(unstructured information) and relevant data from a graph database (structured information).
Given the user's query: {query}, provide a meaningful and efficient answer based
on the insights derived from the following data:

Unstructured information: {vector_result}.
Structured information: {graph_result} """


from openai import OpenAI
client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": final_prompt}],
    model="gpt-3.5-turbo",
)

answer = chat_completion.choices[0].message.content.strip()
print(answer)

Sample output:

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (d:Director {name: "Oliver Stone"})-[:DIRECTED]->(m:Movie)
RETURN m.title
Full Context:
[{'m.title': 'Wall Street'}]

> Finished chain.
Based on the unstructured information retrieved from the top result of the search, Oliver Stone directed the film "Wall Street." In addition to "Wall Street," some other films directed by Oliver Stone include "Platoon," "JFK," "Born on the Fourth of July," "Natural Born Killers," and "Snowden."
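
On the original question of covering more than one label: since from_existing_graph takes a single node_label, one possible workaround is to create one vector store per label, each with its own index_name, and merge the hits. This is an assumption, not something tested in the answer above. The sketch below shows only the merge step; `search_across_stores` is a hypothetical helper, and it assumes every store uses the same embedding model and similarity function so that scores are comparable. It relies only on `similarity_search_with_score`, which returns (Document, score) pairs.

```python
def search_across_stores(stores, query, k=3):
    """Query several vector stores and return the k best (document, score) pairs."""
    hits = []
    for store in stores:
        # Each store contributes up to k candidates of its own.
        hits.extend(store.similarity_search_with_score(query, k=k))
    # Higher score means more similar, so sort in descending order.
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:k]
```

For the schema in the question, that could mean one store built with node_label="Person" and another with node_label="Movie" and text_node_properties=["title"], then `search_across_stores([person_store, movie_store], query)`; those store names are placeholders.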