Elasticsearch / NEST performance issue

Question (votes: 0, answers: 1)

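I've noticed some strange behavior of the ISearchResponse.HitsMetadata.Total property in the NEST library. Whenever I delete a document asynchronously and then immediately retrieve the remaining documents from Elasticsearch, the HitsMetadata.Total field on the ISearchResponse object is almost never updated correctly. It usually reports the total number of documents from before the delete. When I pause execution for at least 700 milliseconds before sending the request, the behavior returns to normal, as if NEST (or Elasticsearch itself) needed more time to update the property. I'm new to NEST and Elasticsearch, so I may be doing something wrong or may not fully understand how the library works, but I've spent a lot of time on this problem and can't get around it. As a result, the pagination metadata I send to the client is calculated incorrectly. I'm using NEST 6.6.0 and Elasticsearch 6.6.2.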

DELETE action:

[HttpDelete("errors/{index}/{logeventId}")]
public async Task<IActionResult> DeleteErrorLog([FromRoute] string index, [FromRoute] string logeventId)
{
    if (string.IsNullOrEmpty(index))
    {
        return BadRequest();
    }

    if (string.IsNullOrEmpty(logeventId)) 
    {
        return BadRequest();
    }

    var getResponse = await _client.GetAsync<Logevent>(new GetRequest(index, typeof(Logevent), logeventId));

    if(!getResponse.Found)
    {
        return NotFound();
    }

    var deleteResponse = await _client.DeleteAsync(new DeleteRequest(index, typeof(Logevent), logeventId));

    if (!deleteResponse.IsValid)
    {
        throw new Exception($"Deleting document id {logeventId} failed");
    }

    return NoContent();

}

GET action:

[HttpGet("errors/{index}", Name = "GetErrors")]
public async Task<IActionResult> GetErrorLogs([FromRoute] string index, 
    [FromQuery]int pageNumber = 1, [FromQuery] int pageSize = 5)
{
    if (string.IsNullOrEmpty(index))
    {
        return BadRequest();
    }

    if(pageSize > MAX_PAGE_SIZE || pageSize < 1)
    {
        pageSize = 5;
    }

    if(pageNumber < 1)
    {
        pageNumber = 1;
    }

    var from = (pageNumber - 1) * pageSize;

    ISearchResponse<Logevent> searchResponse = await GetSearchResponse(index, from, pageSize);

    if (searchResponse.Hits.Count == 0)
    {
        return NotFound();
    }

    int totalPages = GetTotalPages(searchResponse, pageSize);

    var previousPageLink = pageNumber > 1 ? 
        CreateGetLogsForIndexResourceUri(ResourceUriType.PreviousPage, pageNumber, pageSize, "GetErrors") : null;

    var nextPageLink = pageNumber < totalPages ? 
        CreateGetLogsForIndexResourceUri(ResourceUriType.NextPage, pageNumber, pageSize, "GetErrors") : null;

    /* HERE, WHEN EXECUTED IMMEDIATELY (WITHIN ABOUT 700 MILLISECONDS OF THE
       DELETE), THE totalCount FIELD IS MISCALCULATED: IT RETURNS THE VALUE
       FROM BEFORE THE DELETION OF THE DOCUMENT
    */
    var totalCount = searchResponse.HitsMetadata.Total;
    var count = searchResponse.Hits.Count;

    var paginationMetadata = new
    {
        totalCount = searchResponse.HitsMetadata.Total,
        totalPages,
        pageSize,
        currentPage = pageNumber,
        previousPageLink,
        nextPageLink
    };

    Response.Headers.Add("X-Pagination", Newtonsoft.Json.JsonConvert.SerializeObject(paginationMetadata));

    var logeventsDtos = Mapper.Map<IEnumerable<LogeventDto>>(searchResponse.Hits);

    return Ok(logeventsDtos);
}

GetSearchResponse method:

private async Task<ISearchResponse<Logevent>> GetSearchResponse(string index, int from, int pageSize)
{
    return await _client.SearchAsync<Logevent>(s =>
             s.Index(index).From(from).Size(pageSize).Query(q => q.MatchAll()));

} 

Client-side code that triggers the server-side actions:

async deleteLogevent(item){
    this.deleteDialog = false;
    let logeventId = item.logeventId;
    let level = this.defaultSelected.name;
    let index = 'logstash'.concat('-', this.defaultSelected.value, '-', this.date);

    LogsService.deleteLogevent(level, index, logeventId).then(response => {
      if(response.status == 204){
        let logeventIndex = this.logs.findIndex(element => {return element.logeventId === item.logeventId});
        this.logs.splice(logeventIndex, 1);
        LogsService.getLogs(level, index, this.pageNumber).then(reloadResponse => {
          this.logs.splice(0);
          reloadResponse.data.forEach(element => {
          this.logs.push(element)
          });
          this.setPaginationMetadata(reloadResponse.headers["x-pagination"]);
        })
      }
    }).catch(error => {
      // log the failure instead of silently swallowing it
      console.error(error);
    })
}
Tags: elasticsearch, nest
1 answer
0 votes

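This is normal, expected behavior for Elasticsearch. Changes made by operations such as index, update and delete are not reflected in search responses until a refresh has occurred. Mike McCandless' blog post on how Lucene handles deleted documents is a few years old now but still relevant. The section on Near Real-Time Search in Elasticsearch: The Definitive Guide (available online) is also a good resource.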

Here is an example that demonstrates the behavior:

private static void Main()
{
    var defaultIndex = "refresh_example";
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));

    var settings = new ConnectionSettings(pool)
        .DefaultIndex(defaultIndex)
        .DefaultTypeName("_doc");

    var client = new ElasticClient(settings);

    if (client.IndexExists(defaultIndex).Exists)
        client.DeleteIndex(defaultIndex);

    client.CreateIndex(defaultIndex, c => c
        .Mappings(m => m
            .Map<Document>(mm => mm
                .AutoMap()
            )
        )
    );

    var indexResponse = client.IndexDocument(new Document 
    {
        Id = 1,
        Name = "foo"
    });

    // hit count is likely to be 0 here because no refresh interval has occurred
    var searchResponse = client.Search<Document>();
    Console.WriteLine($"search hit count after index no refresh: {searchResponse.Hits.Count}");

    // a get for the exact document will return it however.
    var getResponse = client.Get<Document>(1);
    Console.WriteLine($"get document with id 1, name is: {getResponse.Source.Name}");

    // use refresh API to refresh the index
    var refreshResponse = client.Refresh(defaultIndex);

    // now the hit count is 1
    searchResponse = client.Search<Document>();
    Console.WriteLine($"search hit count after refresh: {searchResponse.Hits.Count}");

    // index another document, and refresh at the same time
    indexResponse = client.Index(new Document
    {
        Id = 2,
        Name = "bar"
    }, i => i.Refresh(Refresh.WaitFor));

    // now the hit count is 2
    searchResponse = client.Search<Document>();
    Console.WriteLine($"search hit count after index with refresh: {searchResponse.Hits.Count}");

    // now delete document with id 1
    var deleteResponse = client.Delete<Document>(1);
    Console.WriteLine($"document with id 1 deleted");

    // hit count is still 2
    searchResponse = client.Search<Document>();
    Console.WriteLine($"search hit count before refresh: {searchResponse.Hits.Count}");

    // refresh
    refreshResponse = client.Refresh(defaultIndex);

    // hit count is 1
    searchResponse = client.Search<Document>();
    Console.WriteLine($"search hit count after refresh: {searchResponse.Hits.Count}");
}

public class Document 
{
    public int Id { get; set; }

    public string Name { get;set; }
}

This is what gets written to the console:

search hit count after index no refresh: 0
get document with id 1, name is: foo
search hit count after refresh: 1
search hit count after index with refresh: 2
document with id 1 deleted
search hit count before refresh: 2
search hit count after refresh: 1
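
To make the sequence above easier to reason about, here is a minimal toy model in JavaScript. It is not how Elasticsearch is implemented; the `ToyIndex` class and all of its members are hypothetical names invented for illustration. It only captures the observable rule: a get by id sees the latest state, while search sees the state as of the last refresh.

```javascript
// Toy model only: all names here are made up and nothing talks to Elasticsearch.
class ToyIndex {
  constructor() {
    this.live = new Map();       // latest state, visible to get-by-id
    this.searchable = new Map(); // snapshot taken at the last refresh, visible to search
  }

  index(doc, { refresh = false } = {}) {
    this.live.set(doc.id, doc);
    if (refresh) this.refresh();
  }

  delete(id, { refresh = false } = {}) {
    this.live.delete(id);
    if (refresh) this.refresh();
  }

  get(id) {
    return this.live.get(id);
  }

  refresh() {
    // search only ever sees this snapshot
    this.searchable = new Map(this.live);
  }

  search() {
    return [...this.searchable.values()];
  }
}

const idx = new ToyIndex();
const counts = [];

idx.index({ id: 1, name: 'foo' });
counts.push(idx.search().length);   // indexed, but not refreshed yet
const fetched = idx.get(1);         // get by id already sees the document

idx.refresh();
counts.push(idx.search().length);   // visible after refresh

idx.index({ id: 2, name: 'bar' }, { refresh: true });
counts.push(idx.search().length);   // indexed and refreshed in one step

idx.delete(1);
counts.push(idx.search().length);   // delete not yet visible to search

idx.refresh();
counts.push(idx.search().length);   // delete visible after refresh

console.log(`get(1).name = ${fetched.name}`);
console.log(`search hit counts: ${counts.join(', ')}`);
```

The hit counts follow the same pattern as the console output above: the stale count after the delete is the search snapshot lagging behind, not a bug in the client library.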

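You might be thinking, "why don't I just refresh after every operation?" The reason not to is performance: when you call the refresh API or specify refresh as part of an operation, a new segment is written and opened, which uses system resources, eventually needs to be committed to disk, and may later be merged with other segments. Calling refresh constantly produces a lot of small segments. Calling refresh in tests in order to make assertions is useful, however.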
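
A middle ground between refreshing on every operation and not refreshing at all is the `refresh=wait_for` option, which is what `Refresh.WaitFor` in the example above sets on the index call. Instead of forcing a refresh, it holds the response until a refresh has made the change visible to search. At the REST level this is a query-string parameter on the delete request. The sketch below only builds such a URL; `buildDeleteUrl`, the host, the index name, the type name and the document id are all hypothetical, and whether the NEST delete request exposes the same option in a given version is an assumption worth checking against the NEST documentation.

```javascript
// Hypothetical helper: builds the REST URL for deleting a single document
// (Elasticsearch 6.x path layout: /{index}/{type}/{id}) and asks Elasticsearch
// to reply only once a refresh has made the delete visible to search.
function buildDeleteUrl(baseUrl, index, type, id, refresh = 'wait_for') {
  const allowed = ['true', 'false', 'wait_for'];
  if (!allowed.includes(refresh)) {
    throw new Error(`unsupported refresh value: ${refresh}`);
  }
  const path = [index, type, id].map(encodeURIComponent).join('/');
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${root}/${path}?refresh=${refresh}`;
}

// Assumed values, for illustration only.
const deleteUrl = buildDeleteUrl(
  'http://localhost:9200/',
  'logstash-error-2019.03.20',
  'logevent',
  'abc123'
);
console.log(deleteUrl);
```

The trade-off is latency: the delete call can take up to roughly one refresh interval longer to return, in exchange for the following search reflecting the delete.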

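It's best to write the application so that it copes with the near real-time nature of deletes and searches. For pagination, this is a similar scenario to the one you would face with other data stores.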
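
One way to cope with this on the client, sketched here under assumptions, is to adjust the pagination metadata locally after a successful delete instead of trusting the total from an immediate re-query. The field names (`totalCount`, `totalPages`, `pageSize`, `currentPage`) follow the X-Pagination header in the question; the `paginationAfterDelete` helper itself is hypothetical.

```javascript
// Hypothetical helper: returns pagination metadata as it should look once a
// single document has been deleted, without asking the server again.
// The previous/next page links are not recomputed here.
function paginationAfterDelete(meta) {
  const totalCount = Math.max(0, meta.totalCount - 1);
  const totalPages = Math.ceil(totalCount / meta.pageSize);
  // If the last item on the last page was removed, step back one page,
  // but never below page 1.
  const currentPage = Math.min(meta.currentPage, Math.max(1, totalPages));
  return { ...meta, totalCount, totalPages, currentPage };
}

const before = { totalCount: 11, totalPages: 3, pageSize: 5, currentPage: 3 };
const after = paginationAfterDelete(before);
console.log(after);
```

The server remains the source of truth; a later reload, once a refresh has happened, replaces the locally adjusted values.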
