How do I apply a filter to "similarity_search_with_score" in LangChain's MongoDBAtlasVectorSearch?

Question

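I am using MongoDBAtlasVectorSearch and want to retrieve the most similar documents, so I call the similarity_search_with_score function.

However, I can't seem to add a filter to this similarity_search_with_score call.

Here is my code: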

vector_search = MongoDBAtlasVectorSearch(
        collection=client[os.getenv("MONGODB_DB")]["files"],
        embedding=embeddings,
        index_name=os.getenv("ATLAS_VECTOR_SEARCH_INDEX_NAME"),
    )

results = vector_search.similarity_search_with_score(
        query="What are the engagements of the company",
        k=5,
        pre_filter={
            "compound": {
                "filter": [
                    {"equals": {"path": "uploaded_by", "value": chat_owner}},
                    {"in": {"path": "file_name", "values": file_names}},
                ]
            }
        },
    )

Here is my index:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "dimensions": 1536,
        "similarity": "cosine",
        "type": "knnVector"
      },
      "file_name": {
        "normalizer": "lowercase",
        "type": "token"
      },
      "uploaded_by": {
        "normalizer": "lowercase",
        "type": "token"
      }
    }
  }
}

However, this gives me the following error:

pymongo.errors.OperationFailure: "knnBeta.filter.compound.filter[1].in.value" is required, full error: {'ok': 0.0, 'errmsg': '"knnBeta.filter.compound.filter[1].in.value" is required', 'code': 8, 'codeName': 'UnknownError', '$clusterTime': {'clusterTime': Timestamp(1704804627, 1), 'signature': {'hash': b'\xfa\x15s+Q\x1d\xa86]R\xb2!\x9d\xc5b-G\xce\xa6S', 'keyId': 7283272637088792583}}, 'operationTime': Timestamp(1704804627, 1)}

I also tried it this way:

        pre_filter={
            "$and": [
                {"uploaded_by": {"$eq": chat_owner}},
                {"file_name": {"$in": file_names}},
            ]
        },

But then I got this error:

pymongo.errors.OperationFailure: "knnBeta.filter" one of [autocomplete, compound, embeddedDocument, equals, exists, geoShape, geoWithin, in, knnBeta, moreLikeThis, near, phrase, queryString, range, regex, search, span, term, text, wildcard] must be present, full error: {'ok': 0.0, 'errmsg': '"knnBeta.filter" one of [autocomplete, compound, embeddedDocument, equals, exists, geoShape, geoWithin, in, knnBeta, moreLikeThis, near, phrase, queryString, range, regex, search, span, term, text, wildcard] must be present', 'code': 8, 'codeName': 'UnknownError', '$clusterTime': {'clusterTime': Timestamp(1704802325, 9), 'signature': {'hash': b'`\xd27-\x81+\x16\xd0a\x14\xc7\x99\xa8\x05|Sx?\x0e:', 'keyId': 7283272637088792583}}, 'operationTime': Timestamp(1704802325, 9)}
WARNING:  StatReload detected changes in 'src/routes/chats/chats.py'. Reloading...

How do I correctly use a filter with similarity_search_with_score?

Answer 1

Looking at your error message

'"knnBeta.filter.compound.filter[1].in.value" is required'

and based on this answer in the MongoDB forums, it looks like your in clause is using values instead of value. For example:

"in": {
      "path": "fileName",
      "value": model_documents,
}
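
Applied to the filter from the question, the fix might look like the sketch below. It only builds the dictionary and does not contact MongoDB. The helper name build_pre_filter is hypothetical and not part of LangChain or PyMongo; the field names uploaded_by and file_name come from the question. This assumes the knnBeta-based code path that appears in the error message, where pre_filter must use Atlas Search operators rather than MQL operators such as $and or $in.

```python
from typing import Any, Dict, List


def build_pre_filter(owner: str, file_names: List[str]) -> Dict[str, Any]:
    """Build an Atlas Search compound filter (hypothetical helper).

    Both clauses go under compound.filter, so a document must match
    the owner AND have one of the given file names.
    """
    return {
        "compound": {
            "filter": [
                {"equals": {"path": "uploaded_by", "value": owner}},
                # The key is the singular "value", even though it holds a list.
                {"in": {"path": "file_name", "value": list(file_names)}},
            ]
        }
    }
```

The result would then be passed as pre_filter=build_pre_filter(chat_owner, file_names) in the similarity_search_with_score call, leaving query and k unchanged.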
