Elasticsearch "ignore_above" issue


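Index mapping (Kibana)

  • GET /new_index/_mapping
  • I have reset "ignore_above" to a larger value, but when I run a search query it does not seem to take effect on my index.
  • I have read in other answers that I need to reindex or re-import the index, but after doing so the result is still the same.
  • Can anyone tell me the correct way to reindex or re-import?
  • What I did was delete the index, create the mapping first, and then reindex using Logstash.
  • Is this the correct approach? Why does "ignored": "library_notes" still occur?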
PUT /new_index
{
  "mappings": {
    "properties": {
      "items": {
        "type": "nested"
      },
      "contents": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 50000
          }
        }
      },
      "library_notes": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 50000
          }
        }
      },
      "image_path": {
        "type": "text",
        "fielddata": true
      }
    }
  }
}
[
  "library_language": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
  "library_notes": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 50000
      }
    }
  },
  "library_subject": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  }
]
Sample library_notes data (the field is an array)
"library_notes": [
"\"The Big Picture Show, 14 September 2007 - 23 March 2008, Singapore Art Museum\"--T.p. verso.",
"Artists: Wong Shih Yaw; Charlie Co; Entang Wiharso; Syed Thajudeen; Zakaria Omar; Somboon Hormtientong; Lee Hsin Hsin; Dang Xuan Hoa; Lim Tze Peng; Hong Sek Chern; Ferdinand Montemayor; Antonio (Tony) Leano; Wong Keen; Tan Chin Kuan; Pratuang Emjaroen; Jeremy Ramsey; Gao Xingjian; Marc Leguay; He Kongde; Edgar (Egai) Talusan Fernandez; Pacita Abad; Imelda Cajipe-Endaya; Suos Sodavy; Tin Tun Hlaing; Bayu Utomo Radjikin.",
" In putting together The Big picture Show, the Singapore Art Museum (SAM) has taken the opportunity to bring together for display some of its largest treasures in its collection."
],
The query that I used
{
        "from": 0,
        "size": 10000,
        "track_total_hits": true,
        "sort": [
            {},
            {
                "_script": {
                    "type": "number",
                    "script": {
                        "lang": "painless",
                        "source": "doc.containsKey('image_path') && doc['image_path'].size() > 0 ? 0 : 1"
                    }
                }
            },
            "_score"
        ],
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": [
                            {
                                "match_phrase": {
                                    "category_code": "ART"
                                }
                            },
                            {
                                "bool": {
                                    "should": [
                                        {
                                            "wildcard": {
                                                "library_notes.keyword": {
                                                    "value": "lim *ze peng",
                                                    "case_insensitive": true
                                                }
                                            }
                                        },
                                        {
                                            "wildcard": {
                                                "linking_notes.keyword": {
                                                    "value": "lim *ze peng",
                                                    "case_insensitive": true
                                                }
                                            }
                                        }
                                    ]
                                }
                            },
                            
                            {
                                "match": {
                                    "is_available": true
                                }
                            }
                        ],
                        "should": []
                    }
                },
                "functions": [
                    {
                        "random_score": {},
                        "weight": 1
                    }
                ],
                "score_mode": "sum"
            }
        }
}

Even though the library_notes field contains Lim Tze Peng, the wildcard search cannot find "lim *tze peng".

elasticsearch elastic-stack elasticsearch-5 elasticsearch-aggregation elasticsearch-dsl
1 answer

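A wildcard pattern has to match the entire token. A keyword field is not analyzed, so each array element is indexed as a single token, and in your case the token is the whole line: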

"Artists: Wong Shih Yaw; Charlie Co; Entang Wiharso; Syed Thajudeen; Zakaria Omar; Somboon Hormtientong; Lee Hsin Hsin; Dang Xuan Hoa; Lim Tze Peng; Hong Sek Chern; Ferdinand Montemayor; Antonio (Tony) Leano; Wong Keen; Tan Chin Kuan; Pratuang Emjaroen; Jeremy Ramsey; Gao Xingjian; Marc Leguay; He Kongde; Edgar (Egai) Talusan Fernandez; Pacita Abad; Imelda Cajipe-Endaya; Suos Sodavy; Tin Tun Hlaing; Bayu Utomo Radjikin.",
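To make the whole-token behaviour concrete, here is a minimal sketch that approximates it outside Elasticsearch. It assumes Python's `fnmatchcase` as a stand-in for the wildcard query (both treat `*` as "any run of characters" and `?` as "exactly one character", and both require the pattern to cover the full string). The helper `wildcard_matches` and the shortened sample term are illustrative only and are not part of any Elasticsearch API.

```python
from fnmatch import fnmatchcase

# One element of the library_notes array, shortened here for readability.
# The keyword sub-field indexes each array element as one single term.
term = "Artists: Wong Shih Yaw; Charlie Co; Lim Tze Peng; Hong Sek Chern."


def wildcard_matches(pattern: str, term: str) -> bool:
    """Approximate a case-insensitive wildcard query against one term.

    The pattern has to cover the term from its first to its last character.
    """
    return fnmatchcase(term.lower(), pattern.lower())


# The pattern from the question starts at "lim", but the term starts at "Artists:".
original = wildcard_matches("lim *ze peng", term)

# Leading and trailing "*" let the pattern absorb the rest of the term.
widened = wildcard_matches("*lim ?ze peng*", term)

print(original, widened)
```

The first pattern fails because nothing in it can absorb the text before "Lim" or after "Peng", which is the same reason the query in the question returns no hits.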

So, to answer your specific question: you can find it by adding the wildcard operator "*" at both the start and the end of the pattern:

"value": "*lim ?ze peng*",
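As a rough sketch of how that value slots into the two wildcard clauses from the question, the snippet below builds the `should` part of the query as a Python dictionary and prints it as JSON. The helper name `contains_wildcard` is hypothetical. The field names and the `case_insensitive` flag are copied from the question's query, and nothing here talks to a cluster.

```python
import json


def contains_wildcard(field: str, phrase: str) -> dict:
    """Build a wildcard clause that matches `phrase` anywhere inside a keyword term."""
    return {
        "wildcard": {
            field: {
                # Leading and trailing "*" so the pattern spans the whole term.
                "value": f"*{phrase}*",
                "case_insensitive": True,
            }
        }
    }


# Same two fields as in the question's "should" clause.
should = [
    contains_wildcard("library_notes.keyword", "lim ?ze peng"),
    contains_wildcard("linking_notes.keyword", "lim ?ze peng"),
]

body = json.dumps({"bool": {"should": should}}, indent=2)
print(body)
```

The printed fragment can replace the inner `bool`/`should` block of the original query. The rest of the query stays as it was.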

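However, I cannot give you this advice without a huge disclaimer. This is one of the slowest operations in Elasticsearch, especially on a field that holds many distinct values per document, as yours does. For most use cases there are better alternatives, so I strongly encourage you to consider giving your users other options for getting the results they want.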
