Unable to trim trailing whitespace when indexing data into Solr?


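I have a SolrCloud setup with 3 ZooKeeper nodes and 2 Solr instances. I am indexing data from an XML file (nested documents) into Solr via DIH, and I want to strip trailing whitespace during indexing so that search results do not contain it.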

Sample file:

<doc>
   <sku>...</sku>
   <data>
     <date>..</date>
     <store>..</store>
     <econn>..</econn>
   </data>
</doc>
...
...
</product>

I have not shared the DIH configuration, as it is working fine.

I have tried the approaches from both of these links:

https://stackoverflow.com/questions/24570545/is-it-possible-to-get-solrs-dataimporthadler-to-ignore-fields-with-empty-string

https://fossies.org/linux/solr/solr/example/example-DIH/solr/atom/conf/solrconfig.xml
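For context, the approach behind those links is presumably an update request processor chain in solrconfig.xml that trims field values before they are indexed. A minimal sketch of that kind of setup is shown here; the chain name `trim-fields` and the `data-config.xml` file name are placeholders, and whether the processor also reaches nested child documents is something that would need to be verified on the Solr version in use.

```xml
<!-- Sketch only: the chain name "trim-fields" is a placeholder. -->
<updateRequestProcessorChain name="trim-fields">
  <!-- Trims leading and trailing whitespace from string field values -->
  <processor class="solr.TrimFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<!-- The DIH handler has to be pointed at the chain, otherwise it is never run. -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <str name="update.chain">trim-fields</str>
  </lst>
</requestHandler>
```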

Actual file:
<doc>
   <sku>abc </sku>
   <data>
     <date>2019-19-08</date>
     <store>somestore </store>
     <econn>false </econn>
   </data>
</doc>

Expected output after indexing:
<doc>
   <sku>abc</sku>
   <data>
     <date>2019-19-08</date>
     <store>somestore</store>
     <econn>false</econn>
   </data>
</doc>

Trailing spaces should be trimmed in both the parent and the child documents, or in either one, depending on context.
solr solrcloud
1 Answer

The best solution that worked for me was to apply RegexTransformer in the data-config.xml file.

<entity name="foo" transformer="RegexTransformer">
<!-- RegexTransformer replaces every match of "regex" in the column value with "replaceWith" -->
<field column="new_field" xpath="path/to/field/in/xml" regex="(\s|\t)" replaceWith="" />
...
...
...
...
</entity>
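One caveat: `(\s|\t)` matches every whitespace character, so it would also remove spaces inside a value such as `some store`. If only trailing whitespace should be dropped, a pattern anchored to the end of the value may be a better fit. The sketch below shows that variant; the column names and xpaths are guesses based on the sample file in the question, and the other entity attributes (processor, url, forEach) are left out as in the snippet above.

```xml
<!-- Sketch only: column names and xpaths are placeholders guessed from the
     sample file, not taken from a real data-config.xml. -->
<entity name="foo" transformer="RegexTransformer">
  <!-- \s+$ matches only the run of whitespace at the end of the value -->
  <field column="sku"   xpath="/product/doc/sku"        regex="\s+$" replaceWith="" />
  <field column="store" xpath="/product/doc/data/store" regex="\s+$" replaceWith="" />
</entity>
```

With this pattern a value like `somestore ` would become `somestore`, while a hypothetical `some store ` would keep its inner space and become `some store`.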

Sometimes the answer is simple and great!
