Polly retry does not always catch HttpRequestException


My .NET Core 3.1 application uses Polly 7.1.0 retry and bulkhead policies for HTTP resilience. The retry policy uses HandleTransientHttpError() to catch any HttpRequestException.

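The HTTP requests fired through MyClient sometimes throw an HttpRequestException. About half of them are caught and retried by Polly. The other half, however, end up in my try-catch block, and I have to retry them manually. This happens before the maximum number of retries is exhausted.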

How did I manage to create a race condition that prevents Polly from catching all of the exceptions? And how can I fix it?

I register the policies with IHttpClientFactory as follows.

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddHttpClient<MyClient>(c =>
        {
            c.BaseAddress = new Uri("https://my.base.url.com/");
            c.Timeout = TimeSpan.FromHours(5); // Generous timeout to accommodate retries
        })
        .AddPolicyHandler(GetHttpResiliencePolicy());
    }

    private static AsyncPolicyWrap<HttpResponseMessage> GetHttpResiliencePolicy()
    {
        var delay = Backoff.DecorrelatedJitterBackoffV2(
            medianFirstRetryDelay: TimeSpan.FromSeconds(1),
            retryCount: 5);

        var retryPolicy = HttpPolicyExtensions
            .HandleTransientHttpError() // This should catch HttpRequestException
            .OrResult(msg => msg.StatusCode == HttpStatusCode.NotFound)
            .WaitAndRetryAsync(
                sleepDurations: delay,
                onRetry: (response, delay, retryCount, context) => LogRetry(response, retryCount, context));

        var throttlePolicy = Policy.BulkheadAsync<HttpResponseMessage>(
            maxParallelization: 50,
            maxQueuingActions: int.MaxValue);

        return Policy.WrapAsync(retryPolicy, throttlePolicy);
    }

The MyClient that fires the HTTP requests looks like this.

    public async Task<TOut> PostAsync<TOut>(Uri requestUri, string jsonString)
    {
        try
        {
            using (var content = new StringContent(jsonString, Encoding.UTF8, "application/json"))
            using (var response = await httpClient.PostAsync(requestUri, content)) // This throws HttpRequestException
            {
                // Handle response
            }
        }
        catch (HttpRequestException ex)
        {
            // This should never be hit, but unfortunately is
        }
    }

Here is some additional information, although I am not sure whether it is relevant.

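  1. Because HttpClient is registered as transient in DI, there are 10 instances of it per unit of work.
  2. Per unit of work, the client fires about 400 HTTP requests.
  3. The HTTP requests are long-running (5 minutes duration, 30 MB response).
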
c# asp.net-core dotnet-httpclient polly retry-logic
2 Answers

3 votes
Retry, then HttpRequestException

Whenever we talk about Polly policies, we can distinguish two kinds of exceptions:

  • Handled
  • Unhandled

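Handled exceptions

  • A handled exception triggers some behaviour of the given policy (in this case HttpRequestException triggers a retry).
  • If the policy cannot succeed, the handled exception is thrown again.
  • If there is another policy, it may or may not handle that exception.
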
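Unhandled exceptions

  • An unhandled exception does not trigger any reaction (in our case, for example, a WebException).
  • Unhandled exceptions flow through the policy.
  • If there is another policy, it may or may not handle that exception.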
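
The distinction between the two kinds of exceptions can be illustrated with a small hand-rolled retry loop. This is only a sketch of the semantics described above, not Polly's actual implementation; the names HandledVsUnhandledSketch, ExecuteWithRetry and CountAttempts are hypothetical, and IOException merely stands in for any exception type the policy does not handle.

```csharp
using System;
using System.IO;
using System.Net.Http;

public static class HandledVsUnhandledSketch
{
    // Hypothetical stand-in for a retry policy that handles only HttpRequestException.
    public static void ExecuteWithRetry(Action action, int retryCount)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return; // success: stop retrying
            }
            catch (HttpRequestException) when (attempt <= retryCount)
            {
                // Handled exception with retries left: swallow it and loop again.
            }
            // Handled exception with no retries left: the filter is false, so it propagates.
            // Any other exception type never matches the catch and flows straight through.
        }
    }

    // Runs an always-failing action and reports how many times it was attempted.
    public static int CountAttempts(Func<Exception> fault, int retryCount)
    {
        var attempts = 0;
        try
        {
            ExecuteWithRetry(() => { attempts++; throw fault(); }, retryCount);
        }
        catch (Exception)
        {
            // The caller's catch block sees both the exhausted and the unhandled case.
        }
        return attempts;
    }

    public static void Main()
    {
        Console.WriteLine(CountAttempts(() => new HttpRequestException("transient"), 5)); // 6
        Console.WriteLine(CountAttempts(() => new IOException("not handled"), 5));        // 1
    }
}
```

In both cases the exception reaches the caller's catch block. The difference is that the handled one only arrives after every retry has been used, while the unhandled one arrives after the very first attempt.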

"About half of them are caught and retried by Polly. The other half, however, end up in my try-catch block"

This can happen if some of your requests run out of retries. In other words, some requests cannot succeed within 6 attempts (1 initial attempt plus 5 retries).
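
The attempt arithmetic can be checked with a few lines of Polly code. The following is a minimal sketch assuming the Polly 7 package is referenced; the AttemptCounter class and the always-failing delegate are made up for illustration, and the zero delay is used only to keep the demo fast.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

public static class AttemptCounter
{
    // Counts how often Polly invokes a delegate that always fails with a handled exception.
    public static async Task<int> CountAttemptsAsync(int retryCount)
    {
        var attempts = 0;
        var retryPolicy = Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(retryCount, _ => TimeSpan.Zero); // zero delay: demo only

        try
        {
            await retryPolicy.ExecuteAsync(() =>
            {
                attempts++;
                return Task.FromException(new HttpRequestException("always fails"));
            });
        }
        catch (HttpRequestException)
        {
            // Reached only after the initial attempt and all retries have failed.
        }
        return attempts;
    }

    public static async Task Main()
    {
        Console.WriteLine(await CountAttemptsAsync(5)); // 6
    }
}
```

With retryCount set to 5, the delegate runs six times before the HttpRequestException surfaces in the caller's catch block.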

This can easily be verified with either of the following two tools:

  • onRetry + context
  • Fallback + context

onRetry + context

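onRetry is called when the retry policy is triggered, but before the sleep duration elapses. The delegate receives the retryCount. So, to be able to correlate the separate log entries of the same request, you need some kind of correlation ID. The simplest way to do this could be coded like this: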

    public static class ContextExtensions
    {
        private const string Key = "CorrelationId";

        public static Context SetCorrelation(this Context context, Guid? id = null)
        {
            context[Key] = id ?? Guid.NewGuid();
            return context;
        }

        public static Guid? GetCorrelation(this Context context)
        {
            if (!context.TryGetValue(Key, out var id))
                return null;

            if (id is Guid correlation)
                return correlation;

            return null;
        }
    }

Here is a simplified example:

The method to be executed

    private async Task<string> Test()
    {
        await Task.Delay(1000);
        throw new CustomException("");
    }

The policy

    var retryPolicy = Policy<string>
        .Handle<CustomException>()
        .WaitAndRetryAsync(5, _ => TimeSpan.FromSeconds(1),
            (result, delay, retryCount, context) =>
            {
                var id = context.GetCorrelation();
                Console.WriteLine($"{id} - #{retryCount} retry.");
            });

Usage

    var context = new Context().SetCorrelation();
    try
    {
        await retryPolicy.ExecuteAsync(async (ctx) => await Test(), context);
    }
    catch (CustomException)
    {
        Console.WriteLine($"{context.GetCorrelation()} - All retry has been failed.");
    }

Sample output

    3319cf18-5e31-40e0-8faf-1fba0517f80d - #1 retry.
    3319cf18-5e31-40e0-8faf-1fba0517f80d - #2 retry.
    3319cf18-5e31-40e0-8faf-1fba0517f80d - #3 retry.
    3319cf18-5e31-40e0-8faf-1fba0517f80d - #4 retry.
    3319cf18-5e31-40e0-8faf-1fba0517f80d - #5 retry.
    3319cf18-5e31-40e0-8faf-1fba0517f80d - All retry has been failed.

Fallback

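As mentioned above, whenever a policy cannot succeed, it re-throws the handled exception. In other words, if a policy fails, it escalates the problem to the next level (the next outer policy).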

Here is a simplified example:

The policy

    var fallbackPolicy = Policy<string>
        .Handle<CustomException>()
        .FallbackAsync(async (result, ctx, ct) =>
        {
            await Task.FromException<CustomException>(result.Exception);
            return result.Result; // it will never be executed << just to compile
        },
        (result, ctx) =>
        {
            Console.WriteLine($"{ctx.GetCorrelation()} - All retry has been failed.");
            return Task.CompletedTask;
        });

Usage

    var context = new Context().SetCorrelation();
    try
    {
        var strategy = Policy.WrapAsync(fallbackPolicy, retryPolicy);
        await strategy.ExecuteAsync(async (ctx) => await Test(), context);
    }
    catch (CustomException)
    {
        Console.WriteLine($"{context.GetCorrelation()} - All policies failed.");
    }

Sample output

    169a270e-acf7-45fd-8036-9bd1c034c5d6 - #1 retry.
    169a270e-acf7-45fd-8036-9bd1c034c5d6 - #2 retry.
    169a270e-acf7-45fd-8036-9bd1c034c5d6 - #3 retry.
    169a270e-acf7-45fd-8036-9bd1c034c5d6 - #4 retry.
    169a270e-acf7-45fd-8036-9bd1c034c5d6 - #5 retry.
    169a270e-acf7-45fd-8036-9bd1c034c5d6 - All retry has been failed.
    169a270e-acf7-45fd-8036-9bd1c034c5d6 - All policies failed.

0 votes
I ran into the same problem when using Polly for a simple TCP connection.

    // Manage connection errors
    this.reconnectPolicy = Policy
        .Handle<SocketException>()
        .Or<IOException>() // TODO - IOException is not being caught and throws a 502 Bad Gateway exception.
        .WaitAndRetryAsync(connectAttempts,
            (retryAttempt) => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
            (exception, timeSpan, retryCount, context) =>
            {
                var log = Log.ForContext("correlationId", context.CorrelationId);
                log.Warning(exception,
                    "Lost connection to {serverName}. Retry {reconnectAttempt} in {reconnectWaitSeconds} seconds.",
                    context["serverName"], retryCount, timeSpan.TotalSeconds);
            });

After the TCP listener service is restarted, the first request always fails and subsequent requests succeed.

    // Use a policy to auto-reconnect the connection
    var contextData = new Dictionary<string, object>() { ["serverName"] = serverName };
    var policyResult = await this.reconnectPolicy.ExecuteAndCaptureAsync(
        async (context, innerCancellationToken) =>
        {
            // Use existing TCP connection if possible, otherwise make a new connection
            var networkStream = connection.GetTcpClient(false).GetStream();
            await networkStream.WriteAsync(data, 0, data.Length, innerCancellationToken);
            return networkStream;
        },
        contextData,
        cancellationToken);

    // Interpret the policy result
    if (policyResult.Outcome == OutcomeType.Successful)
    {
        return policyResult.Result;
    }
    else
    {
        throw policyResult.FinalException
            ?? new Exception($"Unhandled fault connecting to {serverName}: {policyResult.FaultType}.");
    }

Stack trace:

    Error Message [Unable to read data from the transport connection: An established connection was aborted by the software in your host machine.]
    There was a problem sending the HL7 message to the remote server.
    System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine. ---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
       at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
       at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
       --- End of inner exception stack trace ---
       at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
       at System.IO.Stream.<>c.<BeginEndReadAsync>b__43_0(Stream stream, ReadWriteParameters args, AsyncCallback callback, Object state)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncTrim[TInstance,TArgs](TInstance thisRef, TArgs args, Func`5 beginMethod, Func`3 endMethod)
       at System.IO.Stream.BeginEndReadAsync(Byte[] buffer, Int32 offset, Int32 count)
       at System.IO.Stream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
    