Filter documents on items in an array in ElasticSearch

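I am using ElasticSearch to search documents. However, I need to make sure that the current user is allowed to see those documents. Each document is tied to communities that the user may belong to.

Here is the mapping for my documents: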

export const mapping = {
  properties: {
    amazonId: { type: 'text' },
    title: { type: 'text' },
    subtitle: { type: 'text' },
    description: { type: 'text' },
    createdAt: { type: 'date' },
    updatedAt: { type: 'date' },
    published: { type: 'boolean' },
    communities: { type: 'nested' }
  }
}

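I am currently saving the IDs of the communities a document belongs to as an array of strings. For example: ["edd05cd0-0a49-4676-86f4-2db913235371", "672916cf-ee32-4bed-a60f-9a7c08dba04b"]

Currently, when I filter my query with {term: { communities: community.id } }, it returns all documents, regardless of the communities they are tied to.

Here is the full query: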

{
  index: 'document',
  filter_path: { filter: {term: { communities: community.id } } },
  body: {
    sort: [{ createdAt: { order: 'asc' } }]
  }
}

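These are the results for the community ID "b7d28e7f-7534-406a-981e-ddf147b5015a". Note: this is the return from my graphql, so the communities on each document are the actual full objects, resolved after parsing the hits from the ES query.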

"hits": [
    {
      "title": "The One True Document",
      "communities": [
        {
          "id": "edd05cd0-0a49-4676-86f4-2db913235371"
        },
        {
          "id": "672916cf-ee32-4bed-a60f-9a7c08dba04b"
        }
      ]
    },
    {
      "title": "Boring Document 1",
      "communities": []
    },
    {
      "title": "Boring Document 2",
      "communities": []
    },
    {
      "title": "Unpublished",
      "communities": [
        {
          "id": "672916cf-ee32-4bed-a60f-9a7c08dba04b"
        }
      ]
    }
]

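When I tried mapping communities as {type: 'keyword', index: 'not_analyzed'}, I received an error stating [illegal_argument_exception] Could not convert [communities.index] to boolean

So do I need to change the mapping, the filter, or both? Searching the docs for 6.6, I see that terms needs a non_analyzed mapping.

Update --------------------------

I updated the communities mapping to keyword, as suggested below. However, I still received the same results.

I updated my query to the following (using a community ID that has documents):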

query: { index: 'document',
  body: 
   { sort: [ { createdAt: { order: 'asc' } } ],
     from: 0,
     size: 5,
     query: 
      { bool: 
         { filter: 
            { term: { communities: '672916cf-ee32-4bed-a60f-9a7c08dba04b' } } } } } }

This gives me the following results:

{
  "data": {
    "communities": [
      {
        "id": "672916cf-ee32-4bed-a60f-9a7c08dba04b",
        "feed": {
          "documents": {
            "hits": []
          }
        }
      }
    ]
  }
}

It appears that my filter is working too well?

elasticsearch elasticsearch-6
1 Answer

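Since you are storing community IDs, you should make sure the IDs are not analyzed. For this, communities should be of type keyword. Second, you want to store an array of community IDs, since a user can belong to multiple communities. You do not need the nested type for that; nested has a different use case. To handle the values as an array, make sure that at index time you always pass the value to the field as an array, even when it is a single value.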

You need to change the mapping as well as the way you index values into the communities field.
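As a minimal sketch of the indexing side, assuming the application holds communities either as ID strings or as objects with an id property (as in the graphql output in the question): the helper names toCommunityIds and toIndexableDocument are hypothetical and not part of any Elasticsearch client.

```javascript
// Hypothetical helper: normalize whatever the application holds into a flat
// array of community ID strings, so the keyword field always receives an array.
function toCommunityIds(communities) {
  if (communities === undefined || communities === null) {
    return [];
  }
  // Wrap a single value so that one ID is still sent as an array.
  const list = Array.isArray(communities) ? communities : [communities];
  return list
    // Accept either a plain ID string or an object shaped like { id: '...' }.
    .map((c) => (typeof c === 'string' ? c : c && c.id))
    // Drop anything that did not resolve to a non-empty string.
    .filter((id) => typeof id === 'string' && id.length > 0);
}

// Hypothetical helper: copy the document and replace `communities`
// with the normalized ID array before it is sent to the index.
function toIndexableDocument(doc) {
  return { ...doc, communities: toCommunityIds(doc.communities) };
}
```

For example, toIndexableDocument({ title: 'Unpublished', communities: [{ id: '672916cf-ee32-4bed-a60f-9a7c08dba04b' }] }) would produce a body whose communities field is an array containing that single ID string.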

1. Update mapping as below:
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "amazonId": {
          "type": "text"
        },
        "title": {
          "type": "text"
        },
        "subtitle": {
          "type": "text"
        },
        "description": {
          "type": "text"
        },
        "createdAt": {
          "type": "date"
        },
        "updatedAt": {
          "type": "date"
        },
        "published": {
          "type": "boolean"
        },
        "communities": {
          "type": "keyword"
        }
      }
    }
  }
}
2. Adding a document to index:
PUT my_index/_doc/1
{
  "title": "The One True Document",
  "communities": [
    "edd05cd0-0a49-4676-86f4-2db913235371",
    "672916cf-ee32-4bed-a60f-9a7c08dba04b"
  ]
}
3. Filtering by community id:
GET my_index/_doc/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "communities": "672916cf-ee32-4bed-a60f-9a7c08dba04b"
          }
        }
      ]
    }
  }
}
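
From the JavaScript client used in the question, step 3 could be sketched roughly as follows. The filter has to sit under body.query.bool.filter; filter_path is a response-filtering parameter that only trims which fields come back, so it does not restrict the hits. The helper name buildCommunityFeedSearch and its from/size options are hypothetical; the index name and sort are taken from the question, and the terms branch for several communities is an assumption about how you might extend the filter.

```javascript
// Hypothetical helper: build the search request for the keyword-mapped
// `communities` field. Accepts one community ID or an array of IDs.
function buildCommunityFeedSearch(communityIds, { from = 0, size = 5 } = {}) {
  const ids = Array.isArray(communityIds) ? communityIds : [communityIds];
  // `term` matches one exact value; `terms` matches any value in the list.
  const filter = ids.length === 1
    ? { term: { communities: ids[0] } }
    : { terms: { communities: ids } };
  return {
    index: 'document', // index name from the question
    body: {
      sort: [{ createdAt: { order: 'asc' } }],
      from,
      size,
      // The filter belongs inside the query, not in `filter_path`.
      query: { bool: { filter: [filter] } },
    },
  };
}
```

Assuming a client whose search method takes { index, body } as in the question, the call would look like client.search(buildCommunityFeedSearch(community.id)).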

Nested Field approach

1. Mapping:
PUT my_index_2
{
  "mappings": {
    "_doc": {
      "properties": {
        "amazonId": {
          "type": "text"
        },
        "title": {
          "type": "text"
        },
        "subtitle": {
          "type": "text"
        },
        "description": {
          "type": "text"
        },
        "createdAt": {
          "type": "date"
        },
        "updatedAt": {
          "type": "date"
        },
        "published": {
          "type": "boolean"
        },
        "communities": {
          "type": "nested"
        }
      }
    }
  }
}
2. Indexing document:
PUT my_index_2/_doc/1
{
  "title": "The One True Document",
  "communities": [
    {
      "id": "edd05cd0-0a49-4676-86f4-2db913235371"
    },
    {
      "id": "672916cf-ee32-4bed-a60f-9a7c08dba04b"
    }
  ]
}
3. Querying (using a nested query):
GET my_index_2/_doc/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "communities",
            "query": {
              "term": {
                "communities.id.keyword": "672916cf-ee32-4bed-a60f-9a7c08dba04b"
              }
            }
          }
        }
      ]
    }
  }
}
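
If you build queries in JavaScript, as the question does, the nested filter above could be produced by a small helper like this one. It is only a sketch: buildNestedCommunityQuery is a hypothetical name, and the field path communities.id.keyword assumes the id field was dynamically mapped as in the nested example.

```javascript
// Hypothetical helper: build the `query` part of a search body that filters
// on a community ID stored in nested objects shaped like { id: '...' }.
function buildNestedCommunityQuery(communityId) {
  return {
    bool: {
      filter: [
        {
          nested: {
            // `path` must name the field mapped with type "nested".
            path: 'communities',
            query: {
              // Exact match on the unanalyzed keyword sub-field.
              term: { 'communities.id.keyword': communityId },
            },
          },
        },
      ],
    },
  };
}
```

The returned object would go under body.query in the search request.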

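You may notice that I used communities.id.keyword rather than communities.id. With dynamic mapping, a string such as id is indexed as an analyzed text field with an unanalyzed keyword sub-field, and a term query needs the exact, unanalyzed value. To understand the reason in more detail, go through this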
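
To illustrate the difference between the two fields, here is a rough simulation in plain JavaScript. It does not call Elasticsearch; approximateStandardAnalyzer is a hypothetical stand-in that only lowercases and splits on non-alphanumeric characters, which is a simplification of what the real analyzer does.

```javascript
// Rough approximation of how an analyzed text field tokenizes a value:
// lowercase, then split on anything that is not a letter or digit.
function approximateStandardAnalyzer(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// A term query looks up one exact token and does not analyze its input.
function termMatches(indexedTokens, queryValue) {
  return indexedTokens.includes(queryValue);
}

const id = '672916cf-ee32-4bed-a60f-9a7c08dba04b';
const textTokens = approximateStandardAnalyzer(id); // like communities.id
const keywordTokens = [id];                         // like communities.id.keyword

console.log(textTokens);                     // the ID broken into five pieces
console.log(termMatches(textTokens, id));    // false: no single token equals the full ID
console.log(termMatches(keywordTokens, id)); // true: the keyword field keeps the ID whole
```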
