Hibernate Elasticsearch transliteration (ICU transform)


I am using hibernate-search and hibernate-search-elasticsearch, version 5.10.3.Final. I want to apply an ICU transform to some fields. This is the filter in the Elasticsearch documentation:

https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/analysis-icu-transform.html

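However, I cannot find a matching TokenFilterFactory in the Lucene version that hibernate-search depends on, and the factory attribute of TokenFilterDef is required. Does anyone know how to implement transliteration through hibernate-search?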

java elasticsearch hibernate-search
1 Answer

You can use annotations and rely on org.hibernate.search.elasticsearch.analyzer.ElasticsearchTokenFilterFactory to create a JSON token filter definition:

@AnalyzerDef(
    name = "myAnalyzer",
    tokenizer = ...,
    filter = @TokenFilterDef(
        name = "myLatinTransform",
        factory = ElasticsearchTokenFilterFactory.class,
        params = {
            @Parameter(name = "type", value = "'icu_transform'"),
            @Parameter(name = "id", value = "'Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC'")
        }
    )
)

Note: parameter values are interpreted as JSON, so string values must be quoted. For convenience, however, single quotes are allowed.

https://docs.jboss.org/hibernate/search/5.10/reference/en-US/html_single/#_custom_analyzers_using_the_code_analyzerdef_code_annotation

Alternatively, you can define the analyzer programmatically and benefit from a more natural API:

# In hibernate.properties
hibernate.search.elasticsearch.analysis_definition_provider com.acme.CustomAnalyzerProvider

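import org.hibernate.search.elasticsearch.analyzer.definition.ElasticsearchAnalysisDefinitionProvider;
import org.hibernate.search.elasticsearch.analyzer.definition.ElasticsearchAnalysisDefinitionRegistryBuilder;

public class CustomAnalyzerProvider implements ElasticsearchAnalysisDefinitionProvider {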
    @Override
    public void register(ElasticsearchAnalysisDefinitionRegistryBuilder builder) {
        builder.analyzer( "myAnalyzer" )
                .withTokenizer( "whitespace" )
                .withTokenFilter( "myLatinTransform" );

        builder.tokenFilter( "myLatinTransform" )
                .type( "icu_transform" )
                .param( "id", "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC" );
    }
}

https://docs.jboss.org/hibernate/search/5.10/reference/en-US/html_single/#_custom_analyzers_using_a_definition_provider
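
To get a feel for what the tail of the transform id ("NFD; [:Nonspacing Mark:] Remove; NFC") does to a token, here is a rough sketch using only the JDK. It is not what Elasticsearch runs internally, and it does not reproduce the "Any-Latin" step, which needs ICU itself. The class and method names are made up for illustration.

```java
import java.text.Normalizer;

public class AccentFoldingSketch {

    // Hypothetical helper: mimics only the "NFD; [:Nonspacing Mark:] Remove; NFC"
    // rules of the transform id. The "Any-Latin" rule is NOT covered here.
    public static String stripNonspacingMarks(String input) {
        // NFD: decompose, e.g. U+00E9 becomes "e" followed by U+0301 (combining acute)
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // [:Nonspacing Mark:] Remove: drop code points in Unicode category Mn
        String withoutMarks = decomposed.replaceAll("\\p{Mn}", "");
        // NFC: recompose whatever remains
        return Normalizer.normalize(withoutMarks, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // "Crème Brûlée", written with escapes to avoid source-encoding issues
        System.out.println(stripNonspacingMarks("Cr\u00e8me Br\u00fbl\u00e9e")); // prints "Creme Brulee"
    }
}
```

With the real icu_transform filter, the "Any-Latin" rule first converts non-Latin scripts to Latin characters, and the remaining rules then remove the diacritics as sketched above.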
