Predicting more characters with the SOLR spellchecker

Question

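I want to use the SOLR spellchecker to implement autocomplete the way Google does it. For example, if I type "chocol", I would get the suggestions "chocolate", "chocolissimo", "chocolate cake", ... In other words, SOLR should append several characters to the typed term.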

This is my SOLR configuration:

{
  "searchComponent":{
    "name": "spellcheckXXX",
    "class": "solr.SpellCheckComponent",
    "queryAnalyzerFieldType": "text_general",
    "spellchecker": {
        "name": "default",
        "field": "multi_term_lowercase_suggestion",
        "classname": "solr.DirectSolrSpellChecker",
        "distanceMeasure": "internal",
        "maxEdits":1,
        "minPrefix":1,
        "minQueryLength":3,
        "combineWords": "true",
        "comparatorClass": "freq"
    }
  }
}


{
  "requestHandler":{
    "name":"/spellcheckXXX",
    "class":"solr.SearchHandler",
    "startup":"lazy",
    "defaults":{
        "spellcheck":"true",
        "spellcheck.dictionary":"default",
        "spellcheck.extendedResults":"true",
        "spellcheck.count":"50",
        "spellcheck.alternativeTermCount":"2",
        "spellcheck.maxResultsForSuggest":"50",
        "spellcheck.collate":"true",
        "spellcheck.collateExtendedResults":"true",
        "spellcheck.maxCollationTries":"100",
        "spellcheck.maxCollations":"50",
        "spellcheck.onlyMorePopular":"true",
        "rows": 0,
        "df": "multi_term_lowercase_suggestion"
    },
    "last-components":["spellcheckXXX"]
  }
}


<field name="multi_term_lowercase_suggestion" type="multi_term_lowercase_suggestion_text" indexed="true" stored="true"/>


<fieldType name="multi_term_lowercase_suggestion_text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

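My problem is that with this configuration I only get suggestions that add a single character to the term, or that replace characters in the term with other characters. That is not really prediction; it is more like spelling correction.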

So if the term is "Schokol" (German: "Schokolade" means chocolate), the result is (.../spellcheckCLN?q=Schokol):

{
    "responseHeader": {
        "status": 0,
        "QTime": 0
    },
    "response": {
        "numFound": 0,
        "start": 0,
        "numFoundExact": true,
        "docs": []
    },
    "spellcheck": {
        "suggestions": [
            "schokol",
            {
                "numFound": 2,
                "startOffset": 0,
                "endOffset": 7,
                "origFreq": 0,
                "suggestion": [
                    {
                        "word": "school",
                        "freq": 41
                    },
                    {
                        "word": "schoko",
                        "freq": 13
                    }
                ]
            }
        ],
        "correctlySpelled": false,
        "collations": [
            "collation",
            {
                "collationQuery": "school",
                "hits": 21,
                "misspellingsAndCorrections": [
                    "schokol",
                    "school"
                ]
            },
            "collation",
            {
                "collationQuery": "schoko",
                "hits": 9,
                "misspellingsAndCorrections": [
                    "schokol",
                    "schoko"
                ]
            }
        ]
    }
}
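
The "suggestions" value in this response is a flat list that alternates between the queried term and an object describing its suggestions. A minimal Python sketch of how a client might pair them up follows; the helper name `pair_suggestions` is hypothetical, and the `response` literal is just the relevant part of the output above, not a live SOLR call.

```python
def pair_suggestions(spellcheck):
    """Turn the flat [term, details, term, details, ...] list into a dict."""
    flat = spellcheck["suggestions"]
    result = {}
    # Walk the list two items at a time: the term, then its details object.
    for term, details in zip(flat[0::2], flat[1::2]):
        result[term] = [(s["word"], s["freq"]) for s in details["suggestion"]]
    return result


# Relevant part of the response shown above, copied by hand.
response = {
    "spellcheck": {
        "suggestions": [
            "schokol",
            {
                "numFound": 2,
                "startOffset": 0,
                "endOffset": 7,
                "origFreq": 0,
                "suggestion": [
                    {"word": "school", "freq": 41},
                    {"word": "schoko", "freq": 13},
                ],
            },
        ]
    }
}

print(pair_suggestions(response["spellcheck"]))
# {'schokol': [('school', 41), ('schoko', 13)]}
```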

If the term is "Schokolad" (.../spellcheckCLN?q=Schokolad):

{
    "responseHeader": {
        "status": 0,
        "QTime": 1
    },
    "response": {
        "numFound": 0,
        "start": 0,
        "numFoundExact": true,
        "docs": []
    },
    "spellcheck": {
        "suggestions": [
            "schokolad",
            {
                "numFound": 1,
                "startOffset": 0,
                "endOffset": 9,
                "origFreq": 0,
                "suggestion": [
                    {
                        "word": "schokolade",
                        "freq": 34
                    }
                ]
            }
        ],
        "correctlySpelled": false,
        "collations": [
            "collation",
            {
                "collationQuery": "schokolade",
                "hits": 25,
                "misspellingsAndCorrections": [
                    "schokolad",
                    "schokolade"
                ]
            }
        ]
    }
}

So a result for "Schokolade" exists, but it is not suggested when the typed term is more than one character shorter. What do I have to change?
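
The behaviour above is consistent with the spellchecker being limited by edit distance. Here is a minimal Python sketch, assuming that plain Levenshtein distance is a fair stand-in for the "internal" distance measure (an assumption; SOLR's own implementation is not reproduced here). `edit_distance` and `MAX_EDITS` are hypothetical names, with `MAX_EDITS` mirroring "maxEdits": 1 from the configuration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


MAX_EDITS = 1  # assumption: mirrors "maxEdits": 1 in the config above

pairs = [
    ("schokol", "school"),
    ("schokol", "schoko"),
    ("schokol", "schokolade"),
    ("schokolad", "schokolade"),
]
for typed, candidate in pairs:
    d = edit_distance(typed, candidate)
    print(typed, "->", candidate, "distance", d, "within limit:", d <= MAX_EDITS)
```

Under these assumptions "schokol" is three edits away from "schokolade", so it falls outside the limit, while "school" and "schoko" are each one edit away. As far as I know the SOLR Reference Guide lists only 1 and 2 as valid values for maxEdits, so raising it would probably not be enough for prefix-style completion. SOLR also ships a separate SuggestComponent aimed at autocomplete; whether that is what solved the problem here is not stated in this thread.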

solr
1 Answer

I found the answer to my problem, so I am going to close this question.
