Cannot import name 'ExhaustiveKnnAlgorithmConfiguration' from 'azure.search.documents.indexes.models'


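I need some help. I have spent the whole day trying to embed documents (using the text-embedding-3-large model) under an upgraded langchain version, and I cannot get past the error in the title. I have tried everything I found online: downgrading azure-search-documents, downgrading langchain, and so on. I get either that error or this other one:

(InvalidRequestParameter) The request is invalid. Details: definition : The vector field 'content_vector' must have the property 'vectorSearchProfile' set.

Code: InvalidRequestParameter

Has anyone solved this?

Here is my current setup: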

azure-core==1.29.7

azure-search-documents==11.4.0b8

langchain==0.1.8 (0.2.0 also fails)

langchain-core==0.2.1

Here is my code:

def set_vector_fields():
    return [
                SimpleField(name="id",type=SearchFieldDataType.String,key=True,filterable=True,),
                SearchableField(name="content",type=SearchFieldDataType.String,searchable=True,),
                SearchField(name="content_vector",
                    type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                    vector_search_dimensions=dimensionality,
                    searchable=True,
                    vector_search_configuration="hnsw_config"
                    #vector_search_profile_name = "profile_hnsw_config",
                    #vectorSearchProfile="profile_hnsw_config"

                ),
                
                # Additional fields for metadata. Customize as needed based on the structure of the data. See additional footnotes for details
                SearchableField(name="metadata", type=SearchFieldDataType.String, searchable=True,filterable=True,),
                SearchableField(name="id_embedding",type=SearchFieldDataType.String, searchable=True,filterable=True,),
                SimpleField(name="last_update",type=SearchFieldDataType.DateTimeOffset,searchable=True,filterable=True,),
                SearchableField(name="chunk_no",type=SearchFieldDataType.Double,searchable=True,filterable=True,),
                
                # Below lists the additional Metadata Fields added from the CSV file
                SearchableField(name="Filename",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                SearchableField(name="Subject",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                SearchableField(name="Year",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                SearchableField(name="Source",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                
                #Date Fields
                SimpleField(name="Date_File",type=SearchFieldDataType.DateTimeOffset,searchable=True,filterable=True,),
                SimpleField(name="Last_Update_Embedding",type=SearchFieldDataType.DateTimeOffset,searchable=True,filterable=True,)
            ]
#def set_vector_search_config_new():
#    return VectorSearch(algorithms=[HnswAlgorithmConfiguration(
#                                                name="hnsw_config",
#                                                kind=VectorSearchAlgorithmKind.HNSW,
#                                                parameters=HnswParameters(m=8, metric="cosine", ef_construction=400, ef_search=500))],
#                        profiles=VectorSearchProfile(name="profile_hnsw_config",algorithm_configuration_name ="hnsw_config" )
#    )

def set_vector_search_config():
    return VectorSearch(algorithm_configurations=[HnswVectorSearchAlgorithmConfiguration(
                                                name="hnsw_config",
                                                kind="hnsw",
                                                parameters=HnswParameters(m=8, metric="cosine", ef_construction=400, ef_search=500))],
                                                        )

The routine fails when I try to instantiate the vector store (in Azure AI Search):

def set_vectorstore(index_name):
            
    embeddings,embedding_function = set_embedding_function()

    fields= set_vector_fields()

    sc_name,scoring_profile =  define_scoring_profile()
    
    # NOTE: IT FAILS HERE, WHEN IT ATTEMPTS TO BUILD THE VECTOR. 
    # I can create the vector, but I can't upload documents via vectorstore.add_documents

    vectorstore: AzureSearch = AzureSearch(
        azure_search_endpoint=azure_search_endpoint,
        azure_search_key=azure_search_key,
        index_name=index_name,
        embedding_function=embedding_function,
        search_type=search_type_GPT,
        fields=fields,
        scoring_profiles = scoring_profile,
        default_scoring_profile = sc_name,

    )
      
    
    return vectorstore

I would appreciate any help.

Thanks

I have tried every solution I could find online.

langchain azure-ai-search
1 Answer

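You only need to set the vector search profile on the vector field, and create a vector search configuration that defines the profiles, algorithms, and vectorizers.

Use the code below to define the index fields.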

def set_vector_fields():
    return [
                SimpleField(name="id",type=SearchFieldDataType.String,key=True,filterable=True,),
                SearchableField(name="content",type=SearchFieldDataType.String,searchable=True,),
                SearchField(name="content_vector",
                    type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                    vector_search_dimensions=1536, # 1 - 3072
                    vector_search_profile_name="profile_hnsw_config"
                ),
            
                SearchableField(name="metadata", type=SearchFieldDataType.String, searchable=True,filterable=True,),
                SearchableField(name="id_embedding",type=SearchFieldDataType.String, searchable=True,filterable=True,),
                SimpleField(name="last_update",type=SearchFieldDataType.DateTimeOffset,searchable=True,filterable=True,),
                SearchableField(name="chunk_no",type=SearchFieldDataType.Double,searchable=True,filterable=True,),
                
                # Below lists the additional Metadata Fields added from the CSV file
                SearchableField(name="Filename",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                SearchableField(name="Subject",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                SearchableField(name="Year",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                SearchableField(name="Source",type=SearchFieldDataType.String,searchable=True,filterable=True,),
                
                # Date Fields
                SimpleField(name="Date_File",type=SearchFieldDataType.DateTimeOffset,searchable=True,filterable=True,),
                SimpleField(name="Last_Update_Embedding",type=SearchFieldDataType.DateTimeOffset,searchable=True,filterable=True,)
            ]

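Here, the dimension count can be anywhere from 1 to 3072, because you are using text-embedding-3-large. Whatever value you choose must match the length of the vectors your embedding deployment actually returns. See the documentation for more information.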
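Rather than hardcoding the dimension count, one option is to measure it once from the embedding function and reuse that number for vector_search_dimensions. The following is only a minimal sketch: infer_dimensions and fake_embed are hypothetical names introduced here, and fake_embed stands in for a real embedding call so that the example runs offline. It assumes the real embedding function takes a string and returns a list of floats.

```python
from typing import Callable, Sequence


def infer_dimensions(embed: Callable[[str], Sequence[float]],
                     probe: str = "dimension probe") -> int:
    """Embed a short probe string and return the length of the resulting vector."""
    vector = embed(probe)
    if not vector:
        raise ValueError("embedding function returned an empty vector")
    return len(vector)


def fake_embed(text: str) -> list:
    # Hypothetical stand-in for a real embedding call; it returns a
    # fixed-size vector so that this sketch needs no network access.
    return [0.0] * 3072


# In real code, pass the actual embedding function instead of fake_embed.
dimensionality = infer_dimensions(fake_embed)
print(dimensionality)  # 3072 for this stand-in
```

The measured value can then be passed as vector_search_dimensions when the content_vector field is defined, so the index and the embedding model cannot drift apart.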

For the vector search configuration, use the following code, which includes a vectorizer.

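# All of these classes come from azure.search.documents.indexes.models.
from azure.search.documents.indexes.models import (
    AzureOpenAIParameters,
    AzureOpenAIVectorizer,
    HnswAlgorithmConfiguration,
    HnswParameters,
    VectorSearch,
    VectorSearchAlgorithmKind,
    VectorSearchProfile,
)

def set_vector_search_config_new():
    return VectorSearch(
        # The algorithm configuration that profiles refer to by name.
        algorithms=[
            HnswAlgorithmConfiguration(
                name="hnsw_config",
                kind=VectorSearchAlgorithmKind.HNSW,
                parameters=HnswParameters(m=8, metric="cosine", ef_construction=400, ef_search=500),
            )
        ],
        # The profile named by vector_search_profile_name on the vector field.
        profiles=[
            VectorSearchProfile(
                name="profile_hnsw_config",
                algorithm_configuration_name="hnsw_config",
                vectorizer="myOpenAI",
            )
        ],
        # The vectorizer that the profile refers to by name.
        vectorizers=[
            AzureOpenAIVectorizer(
                name="myOpenAI",
                kind="azureOpenAI",
                azure_open_ai_parameters=AzureOpenAIParameters(
                    resource_uri=azure_openai_endpoint,
                    deployment_id=azure_openai_embedding_deployment,
                    api_key=azure_openai_key,
                ),
            )
        ],
    )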
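
The configuration only works if the names line up: the vector field names a profile, the profile names an algorithm configuration and (optionally) a vectorizer. A mismatch or a missing profile is one way to end up with errors such as the 'vectorSearchProfile' message in the question. The sketch below checks those links on plain Python data before anything is sent to the service. check_vector_config is a hypothetical helper introduced here for illustration, not part of the Azure SDK.

```python
def check_vector_config(field_profiles, profiles, algorithms, vectorizers):
    """Return a list of problems; an empty list means every name resolves.

    field_profiles: {field name: profile name or None}
    profiles:       {profile name: (algorithm name, vectorizer name or None)}
    algorithms:     set of algorithm configuration names
    vectorizers:    set of vectorizer names
    """
    problems = []
    for field, profile in field_profiles.items():
        if profile is None:
            problems.append(f"field '{field}' has no vector search profile")
        elif profile not in profiles:
            problems.append(f"field '{field}' references unknown profile '{profile}'")
    for profile, (algorithm, vectorizer) in profiles.items():
        if algorithm not in algorithms:
            problems.append(f"profile '{profile}' references unknown algorithm '{algorithm}'")
        if vectorizer is not None and vectorizer not in vectorizers:
            problems.append(f"profile '{profile}' references unknown vectorizer '{vectorizer}'")
    return problems


# The names used in this answer.
problems = check_vector_config(
    field_profiles={"content_vector": "profile_hnsw_config"},
    profiles={"profile_hnsw_config": ("hnsw_config", "myOpenAI")},
    algorithms={"hnsw_config"},
    vectorizers={"myOpenAI"},
)
print(problems)  # an empty list: every name resolves
```

Passing {"content_vector": None} as field_profiles reproduces the situation in the question, where the field carries no profile at all.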

Then create the index with the code below.

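from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import SearchIndex

index_client = SearchIndexClient(endpoint=service_endpoint, credential=credential)
index = SearchIndex(name=index_name, fields=set_vector_fields(), vector_search=set_vector_search_config_new())
result = index_client.create_or_update_index(index)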


Refer to the documentation for more information.

The error "cannot import name 'ExhaustiveKnnAlgorithmConfiguration' from 'azure.search.documents.indexes.models'" is caused by a package version issue. Try updating the azure-search-documents package to 11.6.0b1.
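
Before and after upgrading, it can help to confirm which version is actually installed in the active environment and whether the class names are importable. The sketch below uses only the standard library. installed_version and missing_names are hypothetical helper names, and the list of required names is an illustrative assumption based on the classes mentioned in this thread.

```python
import importlib
from importlib import metadata

MODELS_MODULE = "azure.search.documents.indexes.models"
# Illustrative list: classes mentioned in this question and answer.
REQUIRED_NAMES = [
    "ExhaustiveKnnAlgorithmConfiguration",
    "HnswAlgorithmConfiguration",
    "VectorSearchProfile",
]


def installed_version(package: str = "azure-search-documents"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def missing_names(module_name: str = MODELS_MODULE, names=REQUIRED_NAMES):
    """Return the names that cannot be found in the given module."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)  # the module itself is unavailable
    return [name for name in names if not hasattr(module, name)]


print("azure-search-documents:", installed_version())
print("missing:", missing_names())
```

If the list of missing names is not empty, the installed release does not expose those classes, and the import error from the title is to be expected until the package is updated.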
