How to nest a nested field (Elasticsearch mapping)


I want to describe the JSON below with an Elasticsearch mapping:

JSON:

{"user_id":{
    "data_flow_id_1":[
        {"file_location": "C:/ewew","timestamp": "2019-01-01T00:00:00"},
        {"file_location": "C:/ewew2", "timestamp": "2019-02-01T00:00:00"}
            ],

    "data_flow_id_2":[
        {"file_location": "C:/ewew3","timestamp": "2019-03-01T00:00:00"},
        {"file_location": "C:/ewew4", "timestamp": "2019-04-01T00:00:00"}
            ]
}}

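So a "user_id" 'owns' multiple data flow IDs, and each data flow has its own file locations. This is what I have so far, but it does not accurately describe the structure of the JSON:

ES mapping: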

{
  "mappings": {
    "properties": {
      "dataflow_type": {
        "type": "nested",
          "properties": {
              "user_id": {"type": "string"},
              "data_flow_id": {"type": "string"},
              "file_location": {"type":"string"},
              "timestamp": {"type":"date"}
          }
      }
    }
  }
}

I am struggling to nest the data_flow_id_* part inside user_id. Do I need a nested type inside another nested type?

Update: would something like this work?

{
  "mappings": {
    "properties": {
      "user_id": {
        "type": "nested",
        "properties": {
          "data_flow_id": {
            "type": "nested",
            "properties": {
              "file_location": {"type": "text"},
              "timestamp": {"type": "date"}
            }
          }
        }
      }
    }
  }
}
json python-3.x elasticsearch relationship
1 Answer

I suggest using the mapping below, which avoids excessive nesting.

PUT myindex
{
  "mappings": {
    "properties": {
      "user_id": {
        "type": "keyword"
      },
      "data_flow_id": {
        "type": "keyword"
      },
      "file_location": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "timestamp": {
        "type": "date"
      }
    }
  }
}

You would then index each document like this:

PUT myindex/_doc/1
{
  "user_id": "some_id",
  "data_flow_id": "data_flow_id_1",
  "file_location": "C:/ewew",
  "timestamp": "2019-01-01T00:00:00"
}

Likewise, the remaining documents can be added as:

PUT myindex/_doc/2
{"user_id":"some_id","data_flow_id":"data_flow_id_1","file_location":"C:/ewew2","timestamp":"2019-02-01T00:00:00"}

PUT myindex/_doc/3
{"user_id":"some_id","data_flow_id":"data_flow_id_2","file_location":"C:/ewew3","timestamp":"2019-03-01T00:00:00"}

PUT myindex/_doc/4
{"user_id":"some_id","data_flow_id":"data_flow_id_2","file_location":"C:/ewew4","timestamp":"2019-04-01T00:00:00"}
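
If you would rather send the four documents in one request than issue four separate PUT calls, the `_bulk` endpoint accepts newline-delimited JSON: an action line followed by a source line for each document. Below is a minimal Python sketch that only builds that request body. The helper name `build_bulk_body` is hypothetical, and actually sending the body (for example as a POST to `_bulk` with the `application/x-ndjson` content type) is left out.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON body for _bulk: one action line plus one source line per doc."""
    lines = []
    for doc_id, doc in enumerate(docs, start=1):
        # Action line: index this document into index_name under the given _id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        # Source line: the flat document itself.
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# The same four flat documents as in the PUT requests above.
docs = [
    {"user_id": "some_id", "data_flow_id": "data_flow_id_1",
     "file_location": "C:/ewew", "timestamp": "2019-01-01T00:00:00"},
    {"user_id": "some_id", "data_flow_id": "data_flow_id_1",
     "file_location": "C:/ewew2", "timestamp": "2019-02-01T00:00:00"},
    {"user_id": "some_id", "data_flow_id": "data_flow_id_2",
     "file_location": "C:/ewew3", "timestamp": "2019-03-01T00:00:00"},
    {"user_id": "some_id", "data_flow_id": "data_flow_id_2",
     "file_location": "C:/ewew4", "timestamp": "2019-04-01T00:00:00"},
]

bulk_body = build_bulk_body("myindex", docs)
print(bulk_body)
```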

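The downside of this approach is that you need to index 4 documents for the JSON in the question rather than 2. In return, the search queries stay simple, whereas nesting can lead to complex queries.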
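
To get from the JSON in the question to these flat documents, you could flatten it on the client before indexing. The following is a rough sketch and not the only way to do it. It assumes that the top-level key is the actual user ID (I use "some_id" as a stand-in) and that each data flow key maps to a list of file entries. The function name `flatten_user_flows` is made up for illustration.

```python
def flatten_user_flows(nested):
    """Yield one flat document per (user, data flow, file entry) combination."""
    for user_id, flows in nested.items():
        for data_flow_id, entries in flows.items():
            for entry in entries:
                yield {
                    "user_id": user_id,
                    "data_flow_id": data_flow_id,
                    "file_location": entry["file_location"],
                    "timestamp": entry["timestamp"],
                }


# The JSON from the question, with "some_id" standing in for the real user ID.
nested = {
    "some_id": {
        "data_flow_id_1": [
            {"file_location": "C:/ewew", "timestamp": "2019-01-01T00:00:00"},
            {"file_location": "C:/ewew2", "timestamp": "2019-02-01T00:00:00"},
        ],
        "data_flow_id_2": [
            {"file_location": "C:/ewew3", "timestamp": "2019-03-01T00:00:00"},
            {"file_location": "C:/ewew4", "timestamp": "2019-04-01T00:00:00"},
        ],
    }
}

flat_docs = list(flatten_user_flows(nested))
for doc in flat_docs:
    print(doc)
```

Each yielded dictionary has exactly the fields declared in the mapping, so it can be indexed as is.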

Sample query to fetch the documents where data_flow_id is data_flow_id_1:

POST myindex/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "data_flow_id": "data_flow_id_1"
          }
        }
      ]
    }
  }
}
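
Because every field sits at the top level, further conditions are just more entries in the same `filter` list. As an illustration only, here is a small Python sketch that builds such a query body. The helper `build_flow_query` and its optional `user_id` and `since` parameters are my own additions and not part of the query above. It uses only the standard `term` and `range` queries.

```python
import json


def build_flow_query(data_flow_id, user_id=None, since=None):
    """Build a bool/filter search body for the flat mapping."""
    filters = [{"term": {"data_flow_id": data_flow_id}}]
    if user_id is not None:
        # Exact match on the keyword field user_id.
        filters.append({"term": {"user_id": user_id}})
    if since is not None:
        # Keep only documents whose timestamp is at or after `since`.
        filters.append({"range": {"timestamp": {"gte": since}}})
    return {"query": {"bool": {"filter": filters}}}


query = build_flow_query(
    "data_flow_id_1", user_id="some_id", since="2019-02-01T00:00:00"
)
print(json.dumps(query, indent=2))
```

Against the four documents indexed earlier, I would expect this body to match only document 2, since document 1 has an earlier timestamp and documents 3 and 4 belong to data_flow_id_2.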