Azure AI Search cannot read the blob metadata configured in the search index

I have uploaded fewer than 20 HTML documents to a blob container in an Azure storage account. Each file has two tags:

source_url
document_type

I ran "Import and vectorize data" (using the corresponding wizard on the Overview blade), which created the Azure AI Search data source, index, and indexer.

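I then updated the index definition, adding two retrievable and searchable fields whose names exactly match the tag names. The definitions below are copied from the Azure portal: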

{
  "name": "source_url",
  "type": "Edm.String",
  "searchable": true,
  "filterable": false,
  "retrievable": true,
  "sortable": false,
  "facetable": false,
  "key": false,
  "indexAnalyzer": null,
  "searchAnalyzer": null,
  "analyzer": "standard.lucene",
  "normalizer": null,
  "dimensions": null,
  "vectorSearchProfile": null,
  "synonymMaps": []
},
{
  "name": "document_type",
  "type": "Edm.String",
  "searchable": true,
  "filterable": false,
  "retrievable": true,
  "sortable": true,
  "facetable": false,
  "key": false,
  "indexAnalyzer": null,
  "searchAnalyzer": null,
  "analyzer": "standard.lucene",
  "normalizer": null,
  "dimensions": null,
  "vectorSearchProfile": null,
  "synonymMaps": []
}

When I search for a known document title (after the index was rebuilt), both fields always come back as null:

{
  "@odata.context": "https://reg-srch-eu-dev.search.windows.net/indexes('vector-docs-json-2')/$metadata#docs(*)",
  "@search.answers": [],
  "value": [
    {
      "@search.score": 0.016393441706895828,
      "@search.rerankerScore": 2.7293198108673096,
      "@search.captions": [
        {
          "text": "hua da trading, inc. - 664359 - 12_20_2023 _ fda.json. that...",
          "highlights": "<em>hua da trading, inc. - 664359</em> - 12_20_2023 _ fda.json. that..."
        }
      ],
      "chunk_id": "436dc57017d5_aHR0cHM6Ly9yZWdzYWV1ZGV2LmJsb2IuY29yZS53aW5kb3dzLm5ldC9kb2NzLWpzb24vZmRhL0h1YSUyMERhJTIwVHJhZGluZywlMjBJbmMuJTIwLSUyMDY2NDM1OSUyMC0lMjAxMl8yMF8yMDIzJTIwXyUyMEZEQS5qc29u0_pages_8",
      "parent_id": "aHR0cHM6Ly9yZWdzYWV1ZGV2LmJsb2IuY29yZS53aW5kb3dzLm5ldC9kb2NzLWpzb24vZmRhL0h1YSUyMERhJTIwVHJhZGluZywlMjBJbmMuJTIwLSUyMDY2NDM1OSUyMC0lMjAxMl8yMF8yMDIzJTIwXyUyMEZEQS5qc29u0",
      "chunk": "that you recalled 300 boxes of your “WeFun,” lot numbers 18520168 and 09/30/2026, due to presence of undeclared sildenafil in August 2023...",
      "title": "Hua Da Trading, Inc. - 664359 - 12_20_2023 _ FDA.json",
      "source_url": null,
      "document_type": null
    },
    // ...
  ]
}

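I configured this Azure AI Search index in the Azure OpenAI Playground and mapped the source URL to the index field. The chat can retrieve documents by date (and answers other questions), but it cannot provide the source URL: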

The requested information is not available in the retrieved data. Please try another query or topic.

What am I missing? What do I need to do so that the blob tags are mapped correctly into the search index?

azure-cognitive-search azure-openai
1 Answer

As the Microsoft documentation states, the blob indexer maps only blob metadata, not blob index tags:

Currently, indexing blob index tags is not supported by this indexer.
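
One possible workaround, given that limitation, is to copy each blob's index tags into its user-defined metadata, which the indexer does read. The following is a minimal sketch, not a confirmed fix for this exact setup. It assumes the `azure-storage-blob` Python package and uses its `get_blob_tags`, `get_blob_properties`, `set_blob_metadata`, `list_blobs`, and `get_blob_client` methods. The helper names (`merge_tags_into_metadata`, `copy_tags_to_metadata`, `sync_container`) are hypothetical and introduced only for illustration.

```python
def merge_tags_into_metadata(tags, metadata):
    """Return a new dict: existing metadata with the index tags layered on top."""
    merged = dict(metadata or {})
    merged.update(tags or {})
    return merged


def copy_tags_to_metadata(blob_client):
    """Copy one blob's index tags into its user-defined metadata.

    `blob_client` is expected to behave like azure.storage.blob.BlobClient.
    Returns the metadata that was written, or None if the blob has no tags.
    """
    tags = blob_client.get_blob_tags()
    if not tags:
        return None
    existing = blob_client.get_blob_properties().metadata
    # set_blob_metadata replaces ALL metadata on the blob, so merge first
    # to avoid dropping keys that are already there.
    merged = merge_tags_into_metadata(tags, existing)
    blob_client.set_blob_metadata(metadata=merged)
    return merged


def sync_container(container_client):
    """Run copy_tags_to_metadata on every blob; return how many were updated.

    `container_client` is expected to behave like
    azure.storage.blob.ContainerClient.
    """
    updated = 0
    for blob in container_client.list_blobs():
        client = container_client.get_blob_client(blob.name)
        if copy_tags_to_metadata(client) is not None:
            updated += 1
    return updated


# Usage (assumption: `pip install azure-storage-blob`; the connection string
# and container name below are placeholders, not values from the question):
#
#   from azure.storage.blob import ContainerClient
#   container = ContainerClient.from_connection_string(
#       "<connection-string>", container_name="<container-name>")
#   print(sync_container(container), "blobs updated")
```

Writing metadata changes a blob's last-modified time, so the indexer should pick the blobs up again on its next run. Because the wizard-generated index stores chunks (note the `chunk_id` and `parent_id` fields in the response), the skillset's index projections may also need a mapping for `source_url` and `document_type` before the values appear on chunk documents. That last point is an assumption about this setup, not something the quoted documentation confirms.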
