I am using SolrJ to populate a collection in a Solr cluster. In my schema.xml, the field id_length is defined as:
<field name="id_length" type="int" indexed="true" stored="false"/>
I am populating the collection in batches of 1,000 documents, and I am trying to store 450k documents in total:
public void write(SolrDocumentList page) throws SolrServerException, IOException, InterruptedException {
    List<SolrInputDocument> toIndex = new LinkedList<>();
    Iterator<SolrDocument> pageIt = page.iterator();
    while (pageIt.hasNext()) {
        SolrDocument resultDoc = pageIt.next();
        SolrInputDocument inputDoc = new SolrInputDocument();
        for (String name : resultDoc.getFieldNames()) {
            inputDoc.addField(name, resultDoc.getFieldValue(name));
        }
        toIndex.add(inputDoc);
    }
    client.add(context.stringParams.get("targetCollection"), toIndex);
    client.commit(context.stringParams.get("targetCollection"));
}
However, after a few batches it throws this exception:
Exception in thread "main" org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://localhost:8983/solr/collection_example_shard1_replica_n1: ERROR: [doc=b000001] multiple values encountered for non multiValued field id_length: [7, 7]
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:579)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1076)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:934)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:866)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
The strange thing is that the input document it crashes on looks like this:
SolrInputDocument(fields: [id=b000001, keyword=N/A, table=diagcodes, id_length=7])
Clearly, id_length does not have more than one value here, so what is going on?
I expected the script to run successfully for every batch, since no document has multi-valued input.
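To double-check that last claim on the client side, a check along these lines could be run on each source document before it is converted. This is only a sketch under assumptions: MultiValueCheck and multiValuedFields are hypothetical names that are not part of SolrJ, and the document is modelled as a plain Map<String, Object> in which a multi-valued field shows up as a Collection. As far as I know a SolrDocument can be passed in as such a map, but I have not verified that against my SolrJ version.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of SolrJ.
final class MultiValueCheck {

    private MultiValueCheck() {
    }

    /**
     * Returns the names of all fields whose value is a Collection
     * holding more than one element, in the map's iteration order.
     */
    static List<String> multiValuedFields(Map<String, Object> doc) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            Object value = entry.getValue();
            // Assumption: a multi-valued field is stored as a Collection.
            if (value instanceof Collection && ((Collection<?>) value).size() > 1) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```

An empty result for every document in the failing batch would suggest that the second value of id_length is not being added by this code.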