How to return images in a chunked Azure AI Search index

Question — votes: 0, answers: 2

As the title says.

I used the "Import and vectorize data" wizard to create the index, and the index is chunked automatically.

The index schema looks like this:

 "value": [
    {
      "@search.score": 
      "chunk_id": "",
      "chunk": "",
      "title": "",
      "image": ""
    },

Following the official documentation, I use "/document/normalized_images/*/data" to retrieve the base64 data of the normalized images, then convert it to image files with a small program. However, my goal is to get the base64 data corresponding to each chunk. I therefore modified the skillset as follows, but got this error message:

"One or more index projection selectors are invalid. Details: input 'image' in index 'name' does not have a matching index field."

"indexProjections": {
    "selectors": [
      {
        "targetIndexName": "name",
        "parentKeyFieldName": "parent_id",
        "sourceContext": "/document/pages/*",
        "mappings": [
          {
            "name": "chunk",
            "source": "/document/pages/*",
            "sourceContext": null,
            "inputs": []
          },
          {
            "name": "vector",
            "source": "/document/pages/*/vector",
            "sourceContext": null,
            "inputs": []
          },
          {
            "name": "title",
            "source": "/document/metadata_storage_name",
            "sourceContext": null,
            "inputs": []
          },
          {
            "name": "image",
            "sourceContext":"/document/pages/*",
            "inputs": [
                            {
                                "source":"/document/normalized_images/*/pages/data",
                                "name":"imagedata"
                            }
                        ]
        
          }
        ]
      }
    ]
}

I want to get the base64 data corresponding to each indexed chunk of text. How can I adapt this approach, or is there an alternative solution?

azure azure-cognitive-services azure-cognitive-search azure-ai-search
2 Answers
1 vote

I want to get the base64 data corresponding to each indexed chunk of text.

There is a mismatch between your index schema and your skillset configuration. The field named "image", which is used to store an image URL, is not suitable for storing base64 data.

  • If you want to store the base64 data directly in the index, you need to add a field to the index schema to hold it. You can name it "imageData", as shown below.
"fields": [
    { "name": "imageData", "type": "Edm.String", "filterable": false, "sortable": false, "facetable": false, "searchable": false }
]
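The error in the question can be caught before the indexer ever runs by checking every index projection mapping name against the index schema. This is an illustrative sketch (the helper `missing_projection_fields` is not part of any Azure SDK; the field and mapping dicts mimic the shapes used in this question):

```python
# Illustrative sketch: detect the "no matching index field" error by
# comparing index projection mapping names against the index fields.

def missing_projection_fields(index_fields, selectors):
    """Return mapping names that have no matching field in the index."""
    field_names = {f["name"] for f in index_fields}
    missing = []
    for selector in selectors:
        for mapping in selector["mappings"]:
            if mapping["name"] not in field_names:
                missing.append(mapping["name"])
    return missing

# An index schema without an "image" field reproduces the question's error.
fields = [{"name": "chunk"}, {"name": "vector"}, {"name": "title"}]
selectors = [{"mappings": [{"name": "chunk"}, {"name": "image"}]}]
print(missing_projection_fields(fields, selectors))  # ['image']
```

Adding the missing field to the schema (here "image" or "imageData") makes the check come back empty.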

After making the change above, update the skillset as follows.

"skills": [
    {
        "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
        "name": "#1",
        "context": "/document",
        "inputs": [
            {
                "name": "chunk",
                "source": "/document/pages/*"
            },
            {
                "name": "imageData",
                "source": "/document/normalized_images/*/data"
            }
        ],
        "outputs": [
            {
                "name": "output",
                "targetName": "shapedContent"
            }
        ]
    },
    {
        "@odata.type": "#Microsoft.Skills.Text.ExtractKeyPhrasesSkill",
        "name": "#2",
        "context": "/document",
        "inputs": [
            {
                "name": "text",
                "source": "/document/pages/*/text"
            }
        ],
        "outputs": [
            {
                "name": "keyPhrases",
                "targetName": "keyPhrases"
            }
        ]
    }
]
  • This skillset shapes the base64 image data from "/document/normalized_images/*/data" and stores it in the "imageData" field.

Update the indexer:

"skillsetName": "your_updated_skillset_name",
"targetIndexName": "your_index_name",
"parameters": {
    "configuration": {
        "dataToExtract": "contentAndMetadata",
        "imageAction": "generateNormalizedImages",
        "indexedFileNameExtensions": ".pdf,.docx,.pptx,.xlsx"
    }
},
"outputFieldMappings": [
    {
        "sourceFieldName": "/document/pages/*/text",
        "targetFieldName": "text"
    },
    {
        "sourceFieldName": "/document/pages/*/title",
        "targetFieldName": "title"
    }
]
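Once the index has an "imageData" field, returning it with each chunk is just a matter of selecting it in the query. A minimal sketch of the REST request body (the helper `build_search_body` and the sample query string are hypothetical; the body shape matches the Azure AI Search "Search Documents" REST API):

```python
# Hypothetical sketch: the JSON body you would POST to
#   https://<service>.search.windows.net/indexes/<index>/docs/search?api-version=2023-11-01
# to return the "imageData" field alongside each chunk.

def build_search_body(query, fields):
    """Build an Azure AI Search REST query body selecting the given fields."""
    return {
        "search": query,
        "select": ",".join(fields),
        "top": 5,
    }

body = build_search_body("contoso", ["chunk", "title", "imageData"])
print(body["select"])  # chunk,title,imageData
```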



0 votes

Your index projections definition is wrong. First, you are creating nested inputs in the "image" mapping. That should only be used when the "image" field is of type Edm.ComplexType and you want to create an inline complex type to map into the index. Also, it looks like you are mapping "/document/normalized_images/*/pages/data"; you need to remove "pages" from that source path. After those changes, the specific mapping in your index projections definition should look like this:

{
        "name": "image",
        "source":"/document/normalized_images/*/data"
}

Note, however, that the sourceContext of your index projection is "/document/pages/*". That means there will be one document in the search index for every "page". The images, though, are tracked under a separate path, "/document/normalized_images/*", so pages and images are not necessarily a 1-to-1 mapping. As a result, if you use the mapping I shared above, it will actually output an array of strings containing the individual base64 data of every image in the parent document, repeated for each page of that document.
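The cardinality point above can be made concrete with a tiny simulation (the page texts and base64 strings are made up; this only mimics how the projection resolves the two paths):

```python
# With sourceContext "/document/pages/*", each page becomes one search
# document, while "/document/normalized_images/*/data" resolves to ALL
# images of the parent document. So the "image" value per page is an
# array of base64 strings, not a single string.

pages = ["page one text", "page two text", "page three text"]
images = ["base64-img-A", "base64-img-B"]  # two images, three pages

projected = [{"chunk": page, "image": list(images)} for page in pages]

for doc in projected:
    print(len(doc["image"]))  # prints 2 for each of the three pages
```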

If you want a 1-to-1 mapping from image to search document, then you should use your skills to do something like the following. Note that if any single image produces too much text output to vectorize, you will see errors.

{
  "description": "Skillset to chunk documents by image and generate embeddings",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      "context": "/document/normalized_images/*",
      "inputs": [
        {
          "name": "image",
          "source": "/document/normalized_images/*"
        }
      ],
      "outputs": [
        {
          "name": "text",
          "targetName": "text"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
      "context": "/document/normalized_images/*",
      "resourceUri": "<fill in>",
      "apiKey": "<fill in>",
      "deploymentId": "<fill in>",
      "inputs": [
        {
          "name": "text",
          "source": "/document/normalized_images/*/text"
        }
      ],
      "outputs": [
        {
          "name": "embedding",
          "targetName": "vector"
        }
      ]
    }
  ],
  "cognitiveServices": null,
  "indexProjections": {
    "selectors": [
      {
        "targetIndexName": "name",
        "parentKeyFieldName": "parent_id",
        "sourceContext": "/document/normalized_images/*",
        "mappings": [
          {
            "name": "chunk",
            "source": "/document/normalized_images/*/text"
          },
          {
            "name": "vector",
            "source": "/document/normalized_images/*/vector"
          },
          {
            "name": "title",
            "source": "/document/metadata_storage_name"
          },
          {
            "name": "image",
            "source": "/document/normalized_images/*/data"
          }
        ]
      }
    ],
    "parameters": {
      "projectionMode": "skipIndexingParentDocuments"
    }
  }
}
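With the skillset above, each search document carries exactly one image in its "image" field as base64, so decoding query results back to files (the original goal in the question) is straightforward. A minimal sketch; the `save_images` helper and the sample documents are stand-ins for what the search API would return:

```python
# Decode the base64 "image" field of search results into image files.
import base64
import pathlib
import tempfile

def save_images(docs, out_dir):
    """Write each result's base64 "image" field to <title>_<i>.png."""
    out = pathlib.Path(out_dir)
    paths = []
    for i, doc in enumerate(docs):
        path = out / f"{doc['title']}_{i}.png"
        path.write_bytes(base64.b64decode(doc["image"]))
        paths.append(path)
    return paths

# Stand-in for one document returned by the search API.
docs = [{"title": "report", "image": base64.b64encode(b"\x89PNG...").decode()}]
with tempfile.TemporaryDirectory() as d:
    for p in save_images(docs, d):
        print(p.name)  # report_0.png
```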