Solr suggester returns duplicate suggestions


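I'm trying to use the suggester in Solr 5. Suggestions work, but I keep getting duplicate suggestions. I tried grouping the results, but that doesn't work. How can I prevent duplicate suggestions?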

Here is the relevant part of my schema.xml:

<field name="Name" type="suggest" indexed="true" stored="true" multiValued="false"/>  
...
<fieldType name="suggest" class="solr.TextField">
  <analyzer type="index">        
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>             
        <filter class="solr.LowerCaseFilterFactory"/>           
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>              
  </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>      
        <filter class="solr.LowerCaseFilterFactory"/>           
      </analyzer>
</fieldType>

My solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
  <str name="name">mySuggester</str>    
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="suggestAnalyzerFieldType">suggest</str>      
  <str name="exactMatchFirst">true</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>      
  <str name="field">Name</str>
  <str name="weightField">Price</str>      
  <str name="buildOnCommit">true</str>        
  <str name="buildOnStartup">false</str>
  <str name="preserveSep">false</str>    
</lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">   
  <str name="suggest">true</str>
  <str name="suggest.count">5</str>
  <str name="suggest.dictionary">mySuggester</str>
  <str name="suggest.collate">true</str>     
</lst>
<arr name="components">
  <str>suggest</str>
  <str>query</str>    
</arr>
</requestHandler>

Sample output for the suggestion "acer" with these parameters:

/suggest?&suggest.dictionary=mySuggester&suggest.q=acer

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">6</int>
</lst>
<lst name="suggest">
<lst name="mySuggester">
<lst name="acer">
<int name="numFound">5</int>
<arr name="suggestions">
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2350</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2099</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2000</long>
<str name="payload"/>
</lst>
</arr>
</lst>
</lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>

You can see that the suggestion Acer V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3" appears 3 times.

Grouping doesn't work either:

/suggest?&suggest.dictionary=mySuggester&suggest.q=acer&group=true&group.field=Name

 <response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">90</int>
</lst>
<lst name="suggest">
<lst name="mySuggester">
<lst name="acer">
<int name="numFound">5</int>
<arr name="suggestions">
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2369</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"
</str>
<long name="weight">2350</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2099</long>
<str name="payload"/>
</lst>
<lst>
<str name="term">
<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"
</str>
<long name="weight">2000</long>
<str name="payload"/>
</lst>
</arr>
</lst>
</lst>
</lst>
<lst name="grouped">
<lst name="Name">
<int name="matches">0</int>
<arr name="groups"/>
</lst>
</lst>
</response>
solr autosuggest search-suggestion
2 answers

4 votes

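You are using the DocumentDictionaryFactory dictionary implementation. It stores the suggestion term once per document, so if the same term occurs in several documents, every one of those instances is returned.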

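To prevent this, you can:

  1. Write an intercepting API that reads suggestions from Solr (for example, 30 at a time) and removes the duplicates before returning the data.
  2. Use a different dictionary, such as FileDictionaryFactory or HighFrequencyDictionaryFactory.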
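
Option 1 could look roughly like the sketch below. It assumes the suggester is queried with wt=json and a larger suggest.count (such as the 30 mentioned above), so that enough distinct terms survive the filtering. The function name dedupeSuggestions and its parameters are hypothetical, and the sample object only mirrors the shape and values of the XML output in the question.

```javascript
// Hypothetical helper for an intercepting layer in front of Solr.
// Assumes the JSON form of the suggester response (wt=json):
// response.suggest[dictionary][query].suggestions = [{ term, weight, payload }, ...]
function dedupeSuggestions(response, dictionary, query, limit) {
  const bucket = response?.suggest?.[dictionary]?.[query];
  const suggestions = bucket?.suggestions ?? [];
  const seen = new Set();
  const unique = [];
  for (const s of suggestions) {
    if (seen.has(s.term)) continue; // same term coming from another document
    seen.add(s.term);
    unique.push(s);
    if (unique.length === limit) break; // stop once enough distinct terms are collected
  }
  return unique;
}

// Sample data built from the terms and weights shown in the question.
const termA = '<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"';
const termB = '<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"';
const sample = {
  suggest: {
    mySuggester: {
      acer: {
        numFound: 5,
        suggestions: [
          { term: termA, weight: 2369, payload: "" },
          { term: termA, weight: 2369, payload: "" },
          { term: termA, weight: 2350, payload: "" },
          { term: termB, weight: 2099, payload: "" },
          { term: termB, weight: 2000, payload: "" },
        ],
      },
    },
  },
};

console.log(dedupeSuggestions(sample, "mySuggester", "acer", 5));
```

The helper keeps the first occurrence of each term. In the sample output the entries are ordered by descending weight, so under that assumption the first occurrence is also the highest-weighted one.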

0 votes

Alternatively, deduplication can be done on the client side. For example, duplicates can easily be removed in JavaScript like this:

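// TermsArr is assumed to be an array of suggestion term strings taken from the Solr response.
// A Set keeps only one copy of each value; spreading it yields an array again.
let uniqueTermsArr = [...new Set(TermsArr)];
console.log(uniqueTermsArr);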
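
The snippet above does not show where TermsArr comes from. As a sketch, assuming the response was requested with wt=json and already parsed into an object, the terms could be collected as follows. The variable solrResponse is illustrative sample data shaped like the output in the question, not a real API object.

```javascript
// Illustrative sample shaped like the suggester response in the question (wt=json assumed).
const solrResponse = {
  suggest: {
    mySuggester: {
      acer: {
        numFound: 3,
        suggestions: [
          { term: '<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"', weight: 2369, payload: "" },
          { term: '<b>Acer</b> V3-772G-5421121TMAKK Intel Core i5 4210U 1.7GHz 12GB 1TB 17.3"', weight: 2350, payload: "" },
          { term: '<b>Acer</b> V3-772G-542081TMamm Intel Core i5 4200M 2.5GHz / 3.1GHz 8GB 1TB 17.3"', weight: 2099, payload: "" },
        ],
      },
    },
  },
};

// Pull the term strings out of the suggestion objects.
const termsArr = solrResponse.suggest.mySuggester.acer.suggestions.map((s) => s.term);

// A Set preserves insertion order, so the first occurrence of each term keeps its position.
const uniqueTerms = [...new Set(termsArr)];
console.log(uniqueTerms);
```

This variant drops the weight and payload fields, which is fine when the client only displays the term text.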