Elasticsearch: finding the overlap of one field with another field

Question

I am trying to work out a way to do this in Elasticsearch without issuing multiple queries, or, if that is unavoidable, by using _mget.

I have many documents with this structure:

{
  'location': 'Orlando',
  'agent_id': 395205, 
},
{
  'location': 'Miami',
  'agent_id': 391773,
},
{
  'location': 'Miami',
  'agent_id': 391773,
},
{
  'location': 'Tampa',
  'agent_id': 395205,
}

The set of location values is fixed, but there are many distinct agent_id values.

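My end goal is, given a list of locations, to find every agent_id that is present in all of those locations. In the example above, given ['Orlando', 'Tampa'], we would get back [395205], because that agent exists in both. A single location may contain duplicate agent_ids (this is expected behavior), so I cannot rely on a plain document count (for example, "show me the agent_ids that appear n times, where n = len(locations)").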
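To pin down the intended result independently of Elasticsearch, the requirement can be written as a set intersection. This is only a minimal in-memory sketch of the semantics; the function name `agents_in_all_locations` and the `docs` list are illustrative and not part of any Elasticsearch API.

```python
from collections import defaultdict


def agents_in_all_locations(docs, locations):
    """Return the sorted agent_ids that appear in every one of the given locations."""
    wanted = set(locations)
    if not wanted:
        return []
    # One set of agent_ids per location; sets absorb duplicate documents.
    agents_by_location = defaultdict(set)
    for doc in docs:
        if doc["location"] in wanted:
            agents_by_location[doc["location"]].add(doc["agent_id"])
    # A location without documents contributes an empty set, so the result is empty.
    return sorted(set.intersection(*(agents_by_location[loc] for loc in wanted)))


# Sample documents copied from the question (note the duplicate Miami entry).
docs = [
    {"location": "Orlando", "agent_id": 395205},
    {"location": "Miami", "agent_id": 391773},
    {"location": "Miami", "agent_id": 391773},
    {"location": "Tampa", "agent_id": 395205},
]

print(agents_in_all_locations(docs, ["Orlando", "Tampa"]))
```

Because each location contributes a set, the duplicate Miami document does not affect the outcome, which is exactly why a plain count of documents per agent_id is not enough.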

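The other key point is that, if possible, I would like to get the matching documents themselves back rather than only aggregation buckets. So ideally a top_hits aggregation could be nested in there somewhere.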

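I think this might be doable with some clever filtering or some strict scoring approach, but I am not sure how to go about it. I already have this working with multiple queries, but I find that process too expensive and would like to streamline it as much as possible. I recognize that this may not actually be possible, but I am curious to hear other opinions.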

Answer 1

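The number of unique locations found under each agent can be used to find the agents common to all of the requested locations.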

Query:

The "query" part selects the documents with the given locations. The "agents" aggregation lists the agents, the nested "location" aggregation lists the unique locations under each agent, and "my_bucket" keeps only the agents whose number of location buckets equals 2. Replace 2 with the number of locations you are searching for.

{
  "query": {
    "terms": {
      "location.keyword": [
        "Orlando",
        "Tampa"
      ]
    }
  },
  "aggs": {
    "agents": {
      "terms": {
        "field": "agent_id",
        "size": 10
      },
      "aggs": {
        "location": {
          "terms": {
            "field": "location.keyword",
            "size": 10
          }
        },
        "my_bucket": {
          "bucket_selector": {
            "buckets_path": {
              "count": "location._bucket_count"
            },
            "script": "params.count==2"
          }
        }
      }
    }
  }
}
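Since the required count must always match the number of requested locations, the request body can be generated rather than edited by hand. The following is a hedged sketch, not a definitive implementation: `build_overlap_query`, `max_agents`, `hits_per_agent` and the aggregation name `docs` are hypothetical names introduced here, and the nested top_hits aggregation is an assumption about how the matching documents could be returned per agent, as the question asks.

```python
def build_overlap_query(locations, max_agents=1000, hits_per_agent=10):
    """Build a search body that keeps agents present in every given location."""
    unique_locations = sorted(set(locations))
    required = len(unique_locations)
    if required == 0:
        raise ValueError("at least one location is required")
    return {
        # Only documents in the requested locations take part in the aggregation.
        "query": {"terms": {"location.keyword": unique_locations}},
        "aggs": {
            "agents": {
                # max_agents is an assumed cap on how many agent buckets are returned.
                "terms": {"field": "agent_id", "size": max_agents},
                "aggs": {
                    # One bucket per distinct location seen for this agent.
                    "location": {
                        "terms": {"field": "location.keyword", "size": required}
                    },
                    # Keep the agent only if it was seen in every requested location.
                    "my_bucket": {
                        "bucket_selector": {
                            "buckets_path": {"count": "location._bucket_count"},
                            "script": f"params.count == {required}",
                        }
                    },
                    # Assumption: return the matching documents alongside each agent.
                    "docs": {"top_hits": {"size": hits_per_agent}},
                },
            }
        },
    }


body = build_overlap_query(["Orlando", "Tampa"])
print(body["aggs"]["agents"]["aggs"]["my_bucket"]["bucket_selector"]["script"])
```

Deduplicating the input list first keeps the comparison value equal to the number of distinct locations, so passing the same location twice does not make the filter impossible to satisfy. The `size` on the "agents" terms aggregation still limits how many agents are examined, so it should be chosen with the expected number of agents in mind.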

Result:

 [
      {
        "_index" : "index30",
        "_type" : "_doc",
        "_id" : "LXuksHABg1vns4B5FWL5",
        "_score" : 1.0,
        "_source" : {
          "location" : "Orlando",
          "agent_id" : 395205
        }
      },
      {
        "_index" : "index30",
        "_type" : "_doc",
        "_id" : "MHuksHABg1vns4B5OmKC",
        "_score" : 1.0,
        "_source" : {
          "location" : "Tampa",
          "agent_id" : 395205
        }
      }
    ]
  },
  "aggregations" : {
    "agents" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 395205,
          "doc_count" : 2,
          "location" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "Orlando",
                "doc_count" : 1
              },
              {
                "key" : "Tampa",
                "doc_count" : 1
              }
            ]
          }
        }
      ]
    }
  }
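The agents that survive the bucket_selector are the keys of the remaining "agents" buckets, so the final list can be read straight from the response. A small sketch, assuming the response has already been parsed into a Python dict; `matching_agent_ids` is a hypothetical helper name and the `response` literal below is abridged from the result shown above.

```python
def matching_agent_ids(response):
    """Return the agent_id keys of the buckets left after the bucket_selector."""
    buckets = response.get("aggregations", {}).get("agents", {}).get("buckets", [])
    return [bucket["key"] for bucket in buckets]


# Abridged copy of the aggregation part of the result above.
response = {
    "aggregations": {
        "agents": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": 395205,
                    "doc_count": 2,
                    "location": {
                        "buckets": [
                            {"key": "Orlando", "doc_count": 1},
                            {"key": "Tampa", "doc_count": 1},
                        ]
                    },
                }
            ],
        }
    }
}

print(matching_agent_ids(response))
```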
