Logstash mapping - duplicate values in nested properties


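Can someone help with the mapping when the aggregate map contains more than one nested-type property? We are on version 8.0 and use Logstash to sync data from a database into an ES index.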

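Problem: when I map more than one nested-type property in the Logstash configuration file, the resulting document contains duplicated data for those properties. Let me try to explain it with the example below.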

Index mapping

PUT test
{
  "settings": {
    "index.mapping.coerce": false
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "agreementId": {
        "type": "text",
        "copy_to": [
          "primaryFields"
        ]
      },
      "customers": {
        "properties": {
          "customerId": {
            "type": "keyword",
            "index": false,
            "doc_values": false
          },
          "customerAddresses": {
            "type": "nested",
            "properties": {
              "custAddress": {
                "type": "text"
              },
              "custAddressType": {
                "type": "keyword",
                "doc_values": false
              }
            }
          },
          "phones": {
            "properties": {
              "phonenumber": {
                "type": "text",
                "copy_to": [
                  "primaryFields"
                ]
              },
              "phonetype": {
                "type": "keyword",
                "doc_values": false
              }
            }
          }
        }
      }
    }
  }
}

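In our database the agreement number is the primary key, and an agreement can have more than one customer profile (this scenario uses one). Each customer can have multiple phones and multiple addresses. With our query, the output looks like this: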

**agreement**   **customer**    **Address**  **Addresstype**  **Contact**  **Contacttype**
  123456789        10            123 Main St.     Mailing     1111111111     Home
  123456789        10            123 Main St.     Mailing     2222222222     Cell
  123456789        10            456 South        Billing     1111111111     Home
  123456789        10            456 South        Billing     2222222222     Cell
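
The four rows are what one would expect when two independent one-to-many relations (addresses and phones) are joined in a single query: every address gets paired with every phone. A minimal Ruby sketch of that pairing, assuming the query joins both tables on the customer; `join_rows`, `ADDRESSES` and `PHONES` are hypothetical names used only for illustration:

```ruby
# Sample data taken from the table above.
ADDRESSES = [['123 Main St.', 'Mailing'], ['456 South', 'Billing']].freeze
PHONES    = [['1111111111', 'Home'], ['2222222222', 'Cell']].freeze

# Hypothetical helper: pairs every address with every phone, the way a SQL
# join over two one-to-many relations would.
def join_rows(addresses, phones)
  addresses.product(phones).map do |(address, address_type), (contact, contact_type)|
    {
      'agreement'   => 123456789,
      'customer'    => 10,
      'Address'     => address,
      'Addresstype' => address_type,
      'Contact'     => contact,
      'Contacttype' => contact_type
    }
  end
end

rows = join_rows(ADDRESSES, PHONES)
puts rows.length                                          # 2 addresses x 2 phones = 4
puts rows.count { |r| r['Contact'] == '1111111111' }      # each phone appears twice
```

Because each phone and each address shows up in two rows, any code that appends one entry per row will store every value twice.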

When the document is created in the index, this is what it looks like:

{
    "took": 474,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": "test",
                "_id": "123456789",
                "_score": null,
                "_source": {
                    "agreementId": 123456789,
                    "customers": [
                        {
                            "phones": [
                                {
                                    "phonetype": "Cell",
                                    "phonenumber": "2222222222"
                                },
                                {
                                    "phonetype": "Cell",
                                    "phonenumber": "2222222222"
                                },
                                {
                                    "phonetype": "Home",
                                    "phonenumber": "1111111111"
                                },
                                {
                                    "phonetype": "Home",
                                    "phonenumber": "1111111111"
                                }
                            ],
                            "customerAddresses": [
                                {
                                    "custAddressType": "Mailing",
                                    "custAddress": "123 Main St."
                                },
                                {
                                    "custAddressType": "Billing",
                                    "custAddress": "456 South"
                                },
                                {
                                    "custAddressType": "Mailing",
                                    "custAddress": "123 Main St."
                                },
                                {
                                    "custAddressType": "Billing",
                                    "custAddress": "456 South"
                                }
                            ]
                        }
                    ]
                },
                "sort": [
                    1713679200000
                ]
            }
        ]
    }
}

As you can see, the phones and the customer addresses are duplicated. This is how the mapping is defined in the configuration file:

aggregate {
        task_id => "%{agreement}"
        code => "
                        map['agreementId'] = event.get('agreement')                       
                        
                         map['customers'] ||= []
                        if (event.get('customer') != nil)

                                customer_found = false
                                map['customers'].each { |cus|
                                        if cus['customerId'] == event.get('customer')
                                                customer_found = true
                                        end
                                }

                                if !customer_found
                                        map['customers'] << {
                                        'customerId' => event.get('customer')                          
                                        }
                                end
                                
                                map['customers'].each { |cus|
                                        if cus['customerId'] == event.get('customer') && event.get('Contact') != nil
                                                cus['phones'] ||=[]
                                                cus['phones'] << {
                                                'phonenumber' => event.get('Contact'),
                                                'phonetype' => event.get('Contacttype'),
                                                }
                                        end
                                }
                                
                                map['customers'].each { |cus|
                                        if cus['customerId'] == event.get('customer') && event.get('Address') != nil
                                                cus['customerAddresses'] ||=[]
                                                cus['customerAddresses'] << {
                                                'custAddress' => event.get('Address'),
                                                'custAddressType' => event.get('Addresstype'),
                                                }
                                        end
                                }
                        end
                                       
                        event.cancel()
            "
             push_previous_map_as_event => true
             timeout => 5
             timeout_tags => ['aggregated']
    }
    if "aggregated" not in [tags] {
            drop {}
        }
}
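
One possible way to avoid the duplicates, sketched in plain Ruby outside Logstash: append an entry only when an equal one is not already in the list. Plain Hashes stand in for events here (assumption: `row[field]` plays the role of `event.get(field)`), and `add_unique` and `aggregate` are hypothetical helper names. This has not been run against a live pipeline.

```ruby
# Rows as returned by the query; Hashes stand in for Logstash events.
ROWS = [
  ['123 Main St.', 'Mailing', '1111111111', 'Home'],
  ['123 Main St.', 'Mailing', '2222222222', 'Cell'],
  ['456 South',    'Billing', '1111111111', 'Home'],
  ['456 South',    'Billing', '2222222222', 'Cell']
].map do |address, address_type, contact, contact_type|
  { 'agreement' => 123456789, 'customer' => 10,
    'Address' => address, 'Addresstype' => address_type,
    'Contact' => contact, 'Contacttype' => contact_type }
end.freeze

# Hypothetical helper: append entry unless an equal Hash is already present.
def add_unique(list, entry)
  list << entry unless list.include?(entry)
end

# Mirrors the aggregate code block, with the duplicate guard added.
def aggregate(rows)
  map = {}
  rows.each do |row|
    map['agreementId'] = row['agreement']
    map['customers'] ||= []
    next if row['customer'].nil?

    # Find the customer entry, creating it on first sight.
    cus = map['customers'].find { |c| c['customerId'] == row['customer'] }
    if cus.nil?
      cus = { 'customerId' => row['customer'] }
      map['customers'] << cus
    end

    unless row['Contact'].nil?
      cus['phones'] ||= []
      add_unique(cus['phones'],
                 { 'phonenumber' => row['Contact'], 'phonetype' => row['Contacttype'] })
    end

    unless row['Address'].nil?
      cus['customerAddresses'] ||= []
      add_unique(cus['customerAddresses'],
                 { 'custAddress' => row['Address'], 'custAddressType' => row['Addresstype'] })
    end
  end
  map
end

customer = aggregate(ROWS)['customers'].first
puts customer['phones'].length             # 2
puts customer['customerAddresses'].length  # 2
```

Inside the aggregate `code` string the same guard could presumably be written inline, for example `cus['phones'] << entry unless cus['phones'].include?(entry)`, where `entry` is the Hash built from the current event.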

I even tried declaring the "phones" property as a nested type, but the duplicates remained.

elasticsearch logstash logstash-configuration