Spring Data Elasticsearch 4.4.x: How do I get aggregations from SearchHits?

Question (votes: 0, answers: 2)

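I am new to Spring Data Elasticsearch. I am working on a project in which I index the bugs encountered in different projects (just as an example).

I want to get all projects, along with the number of bugs in each project.

Here is my document class: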

@Data
@Document(indexName = "all_bugs")
public class Bug{
    @Id
    private String recordId;
    private Project project;
    private String bugSummary;
    private String status;
    // other fields omitted for brevity
}

Here is the Project class:

@Data
public class Project {
    private String projectId;
    private String name;
}

Now that all the bugs are in Elasticsearch, I can run this query in the Kibana console to get all projects along with the number of bugs in each project:

GET /all_bugs/_search
{
  "size": 0,
  "aggs": {
    "distinct_projects": {
      "terms": {
        "field": "project.projectId",
        "size": 10
      },
      "aggs": {
        "project_details": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": ["project.projectId", "project.name"]
            }
          }
        }
      }
    }
  }
}

I know the query could be improved, but the problem I am facing is on the Spring Data Elasticsearch side. This is how I build the aggregation:

    @Autowired
    private ElasticsearchOperations elasticsearchOperations;

    public List<DistinctProject> getDistinctProjects() {
        TermsAggregationBuilder aggregation = AggregationBuilders
                .terms("distinct_projects")
                .field("projects.projectId")
                .size(10)
                .subAggregation(AggregationBuilders
                        .topHits("project_details")
                        .size(1)
                        .fetchSource(new String[]{"project.name", "project.projectId"}, null));

        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withAggregations(aggregation)
                .build();

        SearchHits<DistinctProject> searchHits = elasticsearchOperations.search(searchQuery, DistinctProject.class);

        // I don't know what to do from here...
    }

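Now I am left with a SearchHits<DistinctProject>. The question is: how do I get the aggregations from here to build my response? In this case, DistinctProject is just a DTO in which I want to store projectId, name, and docCount, so that I can build a list and return it to the caller.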

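The problem is that all the documentation I have gone through so far suggests calling searchHits.getAggregations().get("distinct_projects"), but this is not available in Spring Data Elasticsearch 4.4.11, which is the version we are using. According to the documentation here, the SearchHits class no longer contains org.elasticsearch.search.aggregations.Aggregations. Instead, it now contains an instance of org.springframework.data.elasticsearch.core.AggregationsContainer.

As a result, searchHits.getAggregations().get("distinct_projects") fails to compile, and I cannot get any further.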

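I also looked at this answer by P.J.Meisch, but it too refers to an older version of Spring Data Elasticsearch.

I would really appreciate it if someone could help me get unstuck.

For reference, my Spring Boot version is 2.7.11 and my Spring Data Elasticsearch version is 4.4.11.

Thanks, Sriram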

elasticsearch spring-data-elasticsearch
2 Answers
0 votes

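I have tested your code. Unfortunately, Spring Data Elasticsearch has no data model of its own for aggregations. You can, however, serialize the aggregation data to JSON and parse it yourself.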

    @Test
    public void testCreate() {
        TermsAggregationBuilder aggregation = AggregationBuilders
                .terms("distinct_projects")
                .field("project.projectId") // the question used "projects.projectId", which is wrong
                .size(10)
                .subAggregation(AggregationBuilders
                        .topHits("project_details")
                        .size(1)
                        .fetchSource(new String[]{"project.name", "project.projectId"}, null));

        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withAggregations(aggregation)
                .build();

        // "all_bugs2" is the index used for this test run
        SearchHits<DistinctProject> searchHits = elasticsearchOperations.search(searchQuery, DistinctProject.class, IndexCoordinates.of("all_bugs2"));

        // Dump the aggregations container as JSON to inspect its structure
        System.out.println(JSONObject.toJSONString(searchHits.getAggregations()));
    }

    {
            "asMap": {
                    "distinct_projects": {
                            "buckets": [{
                                    "aggregations": {
                                            "asMap": {
                                                    "project_details": {
                                                            "fragment": true,
                                                            "hits": {
                                                                    "fragment": true,
                                                                    "hits": [{
                                                                            "documentFields": {},
                                                                            "fields": {},
                                                                            "fragment": false,
                                                                            "highlightFields": {},
                                                                            "id": "tqpfM4gBOyQu5gYl2sOB",
                                                                            "matchedQueries": [],
                                                                            "metadataFields": {},
                                                                            "primaryTerm": 0,
                                                                            "rawSortValues": [],
                                                                            "score": 1.0,
                                                                            "seqNo": -2,
                                                                            "sortValues": [],
                                                                            "sourceAsMap": {
                                                                                    "project": [{
                                                                                            "name": "my project",
                                                                                            "projectId": 10
                                                                                    }]
                                                                            },
                                                                            "sourceAsString": "{\"project\":[{\"name\":\"my project\",\"projectId\":10}]}",
                                                                            "sourceRef": {
                                                                                    "fragment": true
                                                                            },
                                                                            "type": "_doc",
                                                                            "version": -1
                                                                    }],
                                                                    "maxScore": 1.0,
                                                                    "totalHits": {
                                                                            "relation": 0,
                                                                            "value": 1
                                                                    }
                                                            },
                                                            "name": "project_details",
                                                            "type": "top_hits"
                                                    }
                                            },
                                            "fragment": true
                                    },
                                    "docCount": 1,
                                    "docCountError": 0,
                                    "fragment": true,
                                    "key": 10,
                                    "keyAsNumber": 10,
                                    "keyAsString": "10"
                            }],
                            "docCountError": 0,
                            "fragment": true,
                            "name": "distinct_projects",
                            "sumOfOtherDocCounts": 0,
                            "type": "lterms"
                    }
            },
            "fragment": true
    }
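Once the bucket data is available in this shape, turning it into the DTO is plain Java. Below is a minimal sketch, not a tested Spring integration: DistinctProject here is a hypothetical stand-in for the DTO described in the question, and DistinctProjectMapper and fromBucket are names made up for illustration. It takes a bucket's docCount and the sourceAsMap of its single top hit, and it accepts "project" either as a one-element list (as in the output above) or as a plain object (as the Bug class in the question suggests).

```java
import java.util.List;
import java.util.Map;

// Hypothetical DTO: the question only says it should hold projectId, name and docCount.
final class DistinctProject {
    final String projectId;
    final String name;
    final long docCount;

    DistinctProject(String projectId, String name, long docCount) {
        this.projectId = projectId;
        this.name = name;
        this.docCount = docCount;
    }
}

// Hypothetical helper, not part of Spring Data Elasticsearch.
final class DistinctProjectMapper {
    private DistinctProjectMapper() {
    }

    // Builds one DTO from a bucket's doc count and the _source map of its top hit.
    // Assumption: with the 4.4 RestHighLevelClient setup these two inputs would come
    // from the terms bucket and the top_hits sub-aggregation; that part is not shown.
    static DistinctProject fromBucket(long docCount, Map<String, Object> source) {
        Object project = source.get("project");
        // The output above shows "project" as a one-element list; unwrap it if so.
        if (project instanceof List<?>) {
            List<?> list = (List<?>) project;
            project = list.isEmpty() ? null : list.get(0);
        }
        if (!(project instanceof Map<?, ?>)) {
            throw new IllegalArgumentException("top hit source has no project object");
        }
        Map<?, ?> fields = (Map<?, ?>) project;
        return new DistinctProject(text(fields.get("projectId")), text(fields.get("name")), docCount);
    }

    // projectId may arrive as a number (10 in the output above), so convert via toString.
    private static String text(Object value) {
        return value == null ? null : value.toString();
    }
}
```

Calling fromBucket once per bucket and collecting the results into a List<DistinctProject> would give the return value the question asks for.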

0 votes

I am on newer versions of Spring and spring-data-elasticsearch, but I hope my solution helps. I was able to extract the aggregations from SearchHits like this:

public List<NipSummary> getMergeSuggestionsByField(String fieldName) {
    Query query = NativeQuery.builder()
        .withAggregation("suggestions", Aggregation.of(a -> a
        .terms(ta -> ta.field("nip.keyword").size(10).minDocCount(2))))
        .build();

    SearchHits<CustomerDataDTO> searchHits = elasticsearchOperations.search(query, CustomerDataDTO.class);
    ElasticsearchAggregations aggregations = (ElasticsearchAggregations) searchHits.getAggregations();
    assert aggregations != null;
    List<StringTermsBucket> buckets = aggregations.aggregationsAsMap().get("suggestions").aggregation().getAggregate().sterms().buckets().array();

    List<NipSummary> result = new ArrayList<>();

    buckets.forEach(stringTermsBucket -> result.add(
            NipSummary.builder()
                    .nip(stringTermsBucket.key().stringValue())
                    .count(stringTermsBucket.docCount())
                    .build()
    ));

    return result;
}

NipSummary is a simple DTO that stores each aggregation result:

@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class NipSummary {

    private String nip;
    private Long count;
}

My aggregation is simpler than yours, but I think this could be a good starting point. This is the most important part:

ElasticsearchAggregations aggregations = (ElasticsearchAggregations) searchHits.getAggregations();
assert aggregations != null;
List<StringTermsBucket> buckets = aggregations.aggregationsAsMap().get("suggestions").aggregation().getAggregate().sterms().buckets().array();

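Here "suggestions" is the name of my aggregation. Also, I am not sure whether StringTermsBucket still works for more complex aggregations. I found it by digging through the debugger output.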
