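I am new to Spring Data Elasticsearch. I am working on a project in which I index bugs encountered in different projects (just as an example).
I want to fetch all the projects, along with the number of bugs in each project.
Here is my document class: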
@Data
@Document(indexName = "all_bugs")
public class Bug {
    @Id
    private String recordId;
    private Project project;
    private String bugSummary;
    private String status;
    // other fields omitted for brevity
}
This is the Project class:
@Data
public class Project {
    private String projectId;
    private String name;
}
Now all the bugs are in Elasticsearch, and I can run this query in the Kibana console to get all the projects along with the number of bugs in each:
GET /all_bugs/_search
{
  "size": 0,
  "aggs": {
    "distinct_projects": {
      "terms": {
        "field": "project.projectId",
        "size": 10
      },
      "aggs": {
        "project_details": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": ["project.projectId", "project.name"]
            }
          }
        }
      }
    }
  }
}
I know the query itself could be better, but the problem I am facing is on the Spring Data Elasticsearch side. This is how I build the aggregation:
@Autowired
private ElasticsearchOperations elasticsearchOperations;

public List<DistinctProject> getDistinctProjects() {
    TermsAggregationBuilder aggregation = AggregationBuilders
            .terms("distinct_projects")
            .field("projects.projectId")
            .size(10)
            .subAggregation(AggregationBuilders
                    .topHits("project_details")
                    .size(1)
                    .fetchSource(new String[]{"project.name", "project.projectId"}, null));
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withAggregations(aggregation)
            .build();
    SearchHits<DistinctProject> searchHits = elasticsearchOperations.search(searchQuery, DistinctProject.class);
    // I don't know what to do from here...
}
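Now I have a SearchHits<DistinctProject>. The question is: how do I get the aggregations from here to build my response? In this case DistinctProject is just a DTO in which I want to store projectId, name, and docCount, so that I can build a list and return it to the caller.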
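For reference, the DTO described above might look roughly like the sketch below. The question does not show DistinctProject, so the field types here are assumptions, and the class is written without Lombok only to keep the snippet self-contained.

```java
import java.util.Objects;

// Hypothetical shape of the DTO; the field names follow the question,
// the types (String, String, long) are assumptions.
final class DistinctProject {
    private final String projectId;
    private final String name;
    private final long docCount;

    DistinctProject(String projectId, String name, long docCount) {
        this.projectId = Objects.requireNonNull(projectId, "projectId");
        this.name = name; // may be null if the top hit carries no name
        this.docCount = docCount;
    }

    String getProjectId() { return projectId; }

    String getName() { return name; }

    long getDocCount() { return docCount; }

    @Override
    public String toString() {
        return "DistinctProject{projectId=" + projectId
                + ", name=" + name
                + ", docCount=" + docCount + "}";
    }
}
```

One instance would then be created per terms bucket, with docCount taken from the bucket and name taken from its top hit.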
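The problem is that all the documentation I have gone through so far suggests calling searchHits.getAggregations().get("distinct_projects"), but this is not available in Spring Data Elasticsearch 4.4.11, which is what we are using. According to the documentation, the SearchHits class does not contain the org.elasticsearch.search.aggregations.Aggregations anymore. Instead, it now contains an instance of the org.springframework.data.elasticsearch.core.AggregationsContainer class.
As a result, searchHits.getAggregations().get("distinct_projects") gives a compilation error, and I am unable to proceed.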
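I also referred to this answer by P.J.Meisch, but it too refers to an older version of Spring Data Elasticsearch.
I would be very grateful if someone could help me out of this situation.
For information, my Spring Boot version is 2.7.11 and my Spring Data Elasticsearch version is 4.4.11.
Thanks, Sriram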
I have tested your code. Unfortunately, Spring Data Elasticsearch has no data model for aggregations. But you can treat the aggregation data as JSON and parse it yourself.
@Test
public void testCreate() {
    TermsAggregationBuilder aggregation = AggregationBuilders
            .terms("distinct_projects")
            .field("project.projectId") // the question's code used "projects.projectId" here, which is wrong
            .size(10)
            .subAggregation(AggregationBuilders
                    .topHits("project_details")
                    .size(1)
                    .fetchSource(new String[]{"project.name", "project.projectId"}, null));
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withAggregations(aggregation)
            .build();
    SearchHits<DistinctProject> searchHits = elasticsearchOperations.search(searchQuery, DistinctProject.class, IndexCoordinates.of("all_bugs2"));
    // dump the aggregations container as JSON to inspect its structure
    System.out.println(JSONObject.toJSONString(searchHits.getAggregations()));
}
{
  "asMap": {
    "distinct_projects": {
      "buckets": [{
        "aggregations": {
          "asMap": {
            "project_details": {
              "fragment": true,
              "hits": {
                "fragment": true,
                "hits": [{
                  "documentFields": {},
                  "fields": {},
                  "fragment": false,
                  "highlightFields": {},
                  "id": "tqpfM4gBOyQu5gYl2sOB",
                  "matchedQueries": [],
                  "metadataFields": {},
                  "primaryTerm": 0,
                  "rawSortValues": [],
                  "score": 1.0,
                  "seqNo": -2,
                  "sortValues": [],
                  "sourceAsMap": {
                    "project": [{
                      "name": "my project",
                      "projectId": 10
                    }]
                  },
                  "sourceAsString": "{\"project\":[{\"name\":\"my project\",\"projectId\":10}]}",
                  "sourceRef": {
                    "fragment": true
                  },
                  "type": "_doc",
                  "version": -1
                }],
                "maxScore": 1.0,
                "totalHits": {
                  "relation": 0,
                  "value": 1
                }
              },
              "name": "project_details",
              "type": "top_hits"
            }
          },
          "fragment": true
        },
        "docCount": 1,
        "docCountError": 0,
        "fragment": true,
        "key": 10,
        "keyAsNumber": 10,
        "keyAsString": "10"
      }],
      "docCountError": 0,
      "fragment": true,
      "name": "distinct_projects",
      "sumOfOtherDocCounts": 0,
      "type": "lterms"
    }
  },
  "fragment": true
}
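As a rough sketch of the "parse it yourself" idea: assuming the JSON printed above has already been parsed into nested Map and List objects by whichever JSON library is in use, the bucket key, docCount, and the project name of the top hit can be read by walking the same keys. ProjectCount and AggregationJsonWalker are hypothetical names made up for this illustration; they are not part of Spring Data Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical result holder for one terms bucket.
final class ProjectCount {
    final String projectId;
    final String name;
    final long docCount;

    ProjectCount(String projectId, String name, long docCount) {
        this.projectId = projectId;
        this.name = name;
        this.docCount = docCount;
    }
}

// Hypothetical helper that walks the parsed JSON structure shown above.
final class AggregationJsonWalker {

    // Follows asMap -> distinct_projects -> buckets and maps each bucket to a ProjectCount.
    @SuppressWarnings("unchecked")
    static List<ProjectCount> extract(Map<String, Object> root) {
        Map<String, Object> terms = child(child(root, "asMap"), "distinct_projects");
        List<Map<String, Object>> buckets = (List<Map<String, Object>>) terms.get("buckets");
        List<ProjectCount> result = new ArrayList<>();
        for (Map<String, Object> bucket : buckets) {
            String projectId = String.valueOf(bucket.get("key"));
            long docCount = ((Number) bucket.get("docCount")).longValue();
            result.add(new ProjectCount(projectId, firstHitName(bucket), docCount));
        }
        return result;
    }

    // Reads project.name from the first top_hits entry of a bucket, or null if absent.
    @SuppressWarnings("unchecked")
    private static String firstHitName(Map<String, Object> bucket) {
        Map<String, Object> topHits =
                child(child(child(bucket, "aggregations"), "asMap"), "project_details");
        List<Map<String, Object>> hits =
                (List<Map<String, Object>>) child(topHits, "hits").get("hits");
        if (hits == null || hits.isEmpty()) {
            return null;
        }
        Object project = child(hits.get(0), "sourceAsMap").get("project");
        // In the sample output "project" is a one-element array; a single object is handled too.
        if (project instanceof List && !((List<?>) project).isEmpty()) {
            project = ((List<?>) project).get(0);
        }
        if (project instanceof Map) {
            Object name = ((Map<String, Object>) project).get("name");
            return name == null ? null : name.toString();
        }
        return null;
    }

    // Returns the nested object stored under key, failing loudly if the shape differs.
    @SuppressWarnings("unchecked")
    private static Map<String, Object> child(Map<String, Object> parent, String key) {
        Object value = parent.get(key);
        if (!(value instanceof Map)) {
            throw new IllegalArgumentException("Expected an object at key: " + key);
        }
        return (Map<String, Object>) value;
    }
}
```

Walking the serialized structure is brittle, because the key names depend on how the container happens to be serialized, so this is best treated as a stopgap rather than a stable contract.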
I am on a newer version of Spring and spring-data-elasticsearch, but I hope my solution helps you. I was able to extract the aggregations from SearchHits like this:
public List<NipSummary> getMergeSuggestionsByField(String fieldName) {
    // terms aggregation named "suggestions"; note that the field is hard-coded and fieldName is not used
    Query query = NativeQuery.builder()
            .withAggregation("suggestions", Aggregation.of(a -> a
                    .terms(ta -> ta.field("nip.keyword").size(10).minDocCount(2))))
            .build();
    SearchHits<CustomerDataDTO> searchHits = elasticsearchOperations.search(query, CustomerDataDTO.class);
    // cast the generic container to the client-specific implementation
    ElasticsearchAggregations aggregations = (ElasticsearchAggregations) searchHits.getAggregations();
    assert aggregations != null;
    List<StringTermsBucket> buckets = aggregations.aggregationsAsMap().get("suggestions").aggregation().getAggregate().sterms().buckets().array();
    // map each bucket to a DTO holding the key and its document count
    List<NipSummary> result = new ArrayList<>();
    buckets.forEach(stringTermsBucket -> result.add(
            NipSummary.builder()
                    .nip(stringTermsBucket.key().stringValue())
                    .count(stringTermsBucket.docCount())
                    .build()
    ));
    return result;
}
NipSummary is a simple DTO that stores each aggregation result:
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class NipSummary {
    private String nip;
    private Long count;
}
My aggregation is simpler, but I think this could be a good start. This is the most crucial part:
    ElasticsearchAggregations aggregations = (ElasticsearchAggregations) searchHits.getAggregations();
    assert aggregations != null;
    List<StringTermsBucket> buckets = aggregations.aggregationsAsMap().get("suggestions").aggregation().getAggregate().sterms().buckets().array();
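where "suggestions" is the name of my aggregation. Also, I am not sure whether StringTermsBucket still works for more complex aggregations. I found it by digging through the debugger output.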