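I am trying to use Elasticsearch (version 6.8) to find the tags most similar to a piece of text, and I would like the score to be the sum of the per-tag similarities rather than Elasticsearch's default scoring formula.
For example, I create my_test_index and insert three documents: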
POST my_test_index/_doc/17
{
  "id": 17,
  "tags": ["devops", "server", "hardware"]
}
POST my_test_index/_doc/20
{
  "id": 20,
  "tags": ["software", "application", "developer", "develop"]
}
POST my_test_index/_doc/21
{
  "id": 21,
  "tags": ["electronic", "electric"]
}
There is no explicit mapping; the default is used.
So I run the following query:
GET my_test_index/_search
{
  "query": {
    "more_like_this": {
      "fields": [
        "tags"
      ],
      "like": [
        "i like electric devices and develop some softwares."
      ],
      "min_term_freq": 1,
      "min_doc_freq": 1
    }
  }
}
And I get this response:
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "my_test_index",
        "_type" : "_doc",
        "_id" : "21",
        "_score" : 0.2876821,
        "_source" : {
          "id" : 21,
          "tags" : [
            "electronic",
            "electric"
          ]
        }
      },
      {
        "_index" : "my_test_index",
        "_type" : "_doc",
        "_id" : "20",
        "_score" : 0.2876821,
        "_source" : {
          "id" : 20,
          "tags" : [
            "software",
            "application",
            "developer",
            "develop"
          ]
        }
      }
    ]
  }
}
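However, this is not what I want. I would like the total score to be computed from tag similarity, like this: the text contains the word "electric", which equals the "electric" tag and scores 1.0, and is similar to the "electronic" tag, which scores ~0.7. The text also contains the word "develop", which equals the "develop" tag and scores 1.0, is similar to the "developer" tag, which scores ~0.8, and "software" scores ~0.9, and so on.
So I expect a result like this: a total of ~2.7 for _id:20, ~1.7 for _id:21, and so on.
I hope someone can provide an example of how to do this, or at least point me in the right direction.
Thanks.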
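To make the requested scoring concrete, here is a minimal Python sketch of a "sum of per-tag similarities" score computed outside Elasticsearch. It is not an Elasticsearch feature and not something the question or answer confirms. The helper names `similarity` and `tag_score`, the 0.8 threshold, and the choice of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, so the absolute values will not match the ~0.7, ~0.8 and ~0.9 figures above.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1]; 1.0 means identical (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def tag_score(text: str, tags: List[str], threshold: float = 0.8) -> float:
    """Sum, over all tags, of each tag's best similarity to any word in the text.

    Tags whose best similarity is below `threshold` (a hypothetical cutoff)
    contribute nothing, so unrelated tags do not inflate the total.
    """
    words = [w.strip(".,!?").lower() for w in text.split()]
    total = 0.0
    for tag in tags:
        best = max((similarity(w, tag.lower()) for w in words), default=0.0)
        if best >= threshold:
            total += best
    return total


# The three documents from the question, keyed by their "id" field.
docs: Dict[int, List[str]] = {
    17: ["devops", "server", "hardware"],
    20: ["software", "application", "developer", "develop"],
    21: ["electronic", "electric"],
}
text = "i like electric devices and develop some softwares."

scores = {doc_id: tag_score(text, tags) for doc_id, tags in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranked:
    print(doc_id, round(scores[doc_id], 3))
```

With this stand-in measure, document 20 ranks above document 21 and document 17 comes last, which is the ordering the question asks for. A different similarity function would give different totals.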
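I think the tags field is not defined as a text field in your mapping, which is why IDs 20 and 21 get the same score. I defined it as text in my mapping and got a higher score for ID 21, which is expected.
Below is my solution.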
{
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "tags" : {
        "type" : "text" --> note this
      }
    }
  }
}
I indexed the sample documents you provided and used the same search query.
{
  "query": {
    "more_like_this": {
      "fields": [
        "tags"
      ],
      "like": [
        "i like electric devices and develop some softwares."
      ],
      "min_term_freq": 1,
      "min_doc_freq": 1
    }
  }
}
Search result:
"hits": [
  {
    "_index": "so_array",
    "_type": "_doc",
    "_id": "3",
    "_score": 1.135697, --> note score
    "_source": {
      "id": 21,
      "tags": [
        "electronic",
        "electric"
      ]
    }
  },
  {
    "_index": "so_array",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.86312973, --> note score
    "_source": {
      "id": 20,
      "tags": [
        "software",
        "application",
        "developer",
        "develop"
      ]
    }
  }
]
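For anyone who wants to reproduce the steps above from a script, here is a rough Python sketch that only builds the request bodies: the mapping, a `_bulk` payload for the three sample documents, and the `more_like_this` query. Nothing is sent to a cluster. The index name `so_array` is taken from the hits above. The `_id` values 1 to 3 are an assumption (the hits only show 2 and 3), and `bulk_body` is a hypothetical helper name.

```python
import json
from typing import Dict, List

INDEX = "so_array"  # index name as it appears in the hits above

# Mapping from the answer: "tags" is declared explicitly as a text field.
mapping = {
    "mappings": {
        "properties": {
            "id": {"type": "integer"},
            "tags": {"type": "text"},
        }
    }
}

# Sample documents from the question.
docs: List[Dict] = [
    {"id": 17, "tags": ["devops", "server", "hardware"]},
    {"id": 20, "tags": ["software", "application", "developer", "develop"]},
    {"id": 21, "tags": ["electronic", "electric"]},
]


def bulk_body(index: str, documents: List[Dict]) -> str:
    """Build newline-delimited JSON for the _bulk endpoint.

    Each document becomes an action line followed by a source line.
    The _id values 1, 2, 3 are assumed, not confirmed by the answer.
    """
    lines = []
    for n, doc in enumerate(documents, start=1):
        action = {"index": {"_index": index, "_type": "_doc", "_id": str(n)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


# Same more_like_this query as in the question and the answer.
query = {
    "query": {
        "more_like_this": {
            "fields": ["tags"],
            "like": ["i like electric devices and develop some softwares."],
            "min_term_freq": 1,
            "min_doc_freq": 1,
        }
    }
}

body = bulk_body(INDEX, docs)
print(json.dumps(mapping, indent=2))
print(body, end="")
print(json.dumps(query, indent=2))
```

The printed bodies can then be pasted into whichever client is in use. How the typeless mapping has to be submitted depends on the Elasticsearch version, so check that against the documentation for the version being run.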