How do I tokenize a document in ELK?

Question

I want to tokenize a field (text) in all documents (60k) of an index (post). What is the best way to do this?

GET /_analyze
{
"analyzer" : "standard",
"text" : ["this is a test"]
}

I need the tokenized text for a tag cloud in my Django application.

Answer 1

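By default, all string data is indexed as both text (analyzed with the standard analyzer) and keyword. To create the mapping for an index explicitly, you can use the following API call.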

PUT my_index
{
  "mappings": {
    "properties": {
      "my_field_1": {
        "type": "text",
        "analyzer": "standard"
      },
      "my_field_2": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}

In this case, all data indexed into my_field_1 and my_field_2 will be eligible for full-text search.

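If you already have an index, you can use one of the following approaches:

1. Use the copy_to feature to copy the values of several fields into a single field, so that all of them are searchable through that one field.
2. Create an ingest pipeline and trigger an update by query API call.

An example of the second approach is shared below.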

PUT my_index2/_doc/1
{
  "my_field_1": "musab dogan",
  "my_field_2": "elasticsearch opensearch"
}

PUT _ingest/pipeline/all_into_one
{
  "description": "Copy selected fields to a single new field",
  "processors": [
    {
      "script": {
        "source": """
          def newField = [];
          for (entry in ctx.entrySet()) {
            // Exclude fields starting with underscore
            if (!entry.getKey().startsWith("_")) {
              newField.add(entry.getKey() + ": " + entry.getValue());
            }
          }
          ctx['new_field'] = newField;
        """
      }
    }
  ]
}
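
To make the script's effect easier to see, here is a rough Python equivalent of the loop. It is only a sketch: the function name all_into_one is borrowed from the pipeline name, and the _index and _id keys in the sample document stand in for the metadata that the underscore check is meant to skip.

```python
def all_into_one(source: dict) -> dict:
    """Mimic the pipeline script: collect "key: value" strings into new_field."""
    new_field = [
        f"{key}: {value}"
        for key, value in source.items()
        if not key.startswith("_")  # skip metadata keys such as _index and _id
    ]
    # Return a copy so the input document is left untouched.
    return {**source, "new_field": new_field}


# Sample document from the example, plus illustrative metadata keys.
doc = {
    "_index": "my_index2",
    "_id": "1",
    "my_field_1": "musab dogan",
    "my_field_2": "elasticsearch opensearch",
}
result = all_into_one(doc)
print(result["new_field"])
# ['my_field_1: musab dogan', 'my_field_2: elasticsearch opensearch']
```

One thing the sketch makes visible: new_field itself does not start with an underscore, so applying the same logic to an already processed document would fold new_field into the list as well.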

POST my_index2/_update_by_query?pipeline=all_into_one

GET my_index2/_search
{
  "query": {
    "match": {
      "new_field": "musab"
    }
  }
}
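
For comparison, the copy_to approach mentioned earlier is not part of this example. A minimal sketch of what such a mapping could look like is below, built as a Python dict. The helper build_copy_to_mapping and the target field name all_text are assumptions made for illustration. Values are copied when a document is indexed, so the body would be sent when creating an index.

```python
import json


def build_copy_to_mapping(source_fields, target_field="all_text"):
    """Build an index-creation body where each source field copies into target_field."""
    properties = {
        name: {"type": "text", "copy_to": target_field}
        for name in source_fields
    }
    # Map the target field explicitly so it is analyzed as text.
    properties[target_field] = {"type": "text"}
    return {"mappings": {"properties": properties}}


# Field names taken from the example; "all_text" is a hypothetical target field.
body = build_copy_to_mapping(["my_field_1", "my_field_2"])
print(json.dumps(body, indent=2))
```

A match query against all_text would then search the values of both source fields, much like the query on new_field above.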

After the _update_by_query API call runs, all existing data is updated. For new incoming data, you can set the ingest pipeline as the index's default_pipeline.

PUT my_index/_settings
{
  "index.default_pipeline": "all_into_one"
}
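
To connect this back to the tag cloud in the question: the _analyze API returns a tokens list, and each entry has a token key. A minimal sketch of counting those tokens in Python follows. The helper count_tokens is hypothetical, and the sample responses are hand-written in that shape (trimmed to the token key) rather than real output.

```python
from collections import Counter


def count_tokens(analyze_responses):
    """Count token frequencies across a list of _analyze response bodies."""
    counts = Counter()
    for response in analyze_responses:
        # Each response is expected to hold a "tokens" list of {"token": ...} objects.
        counts.update(item["token"] for item in response.get("tokens", []))
    return counts


# Hand-written sample responses, abbreviated to the "token" key.
sample_responses = [
    {"tokens": [{"token": "this"}, {"token": "is"}, {"token": "a"}, {"token": "test"}]},
    {"tokens": [{"token": "another"}, {"token": "test"}]},
]
tag_counts = count_tokens(sample_responses)
print(tag_counts.most_common(1))  # [('test', 2)]
```

The resulting counts can be passed to a Django template as a list of (token, count) pairs and used to size each tag.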
