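My .NET Core 3.1 application uses Polly 7.1.0 retry and bulkhead policies for HTTP resilience. The retry policy uses HandleTransientHttpError() to catch possible HttpRequestException instances.
HTTP requests fired with MyClient sometimes throw an HttpRequestException. About half of them are caught and retried by Polly. The other half, however, ends up in my try-catch block, and I have to retry them manually. This happens before the maximum number of retries is exhausted.
How did I manage to create a race condition that prevents Polly from catching all exceptions? And how can I fix this?
I register the policies as follows.
public void ConfigureServices(IServiceCollection services)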
{
    services.AddHttpClient<MyClient>(c =>
    {
        c.BaseAddress = new Uri("https://my.base.url.com/");
        c.Timeout = TimeSpan.FromHours(5); // Generous timeout to accommodate retries
    })
    .AddPolicyHandler(GetHttpResiliencePolicy());
}
private static AsyncPolicyWrap<HttpResponseMessage> GetHttpResiliencePolicy()
{
    var delay = Backoff.DecorrelatedJitterBackoffV2(medianFirstRetryDelay: TimeSpan.FromSeconds(1), retryCount: 5);
    var retryPolicy = HttpPolicyExtensions
        .HandleTransientHttpError() // This should catch HttpRequestException
        .OrResult(msg => msg.StatusCode == HttpStatusCode.NotFound)
        .WaitAndRetryAsync(
            sleepDurations: delay,
            onRetry: (response, delay, retryCount, context) => LogRetry(response, retryCount, context));
    var throttlePolicy = Policy.BulkheadAsync<HttpResponseMessage>(maxParallelization: 50, maxQueuingActions: int.MaxValue);
    return Policy.WrapAsync(retryPolicy, throttlePolicy);
}
The MyClient that fires the HTTP requests looks as follows.
public async Task<TOut> PostAsync<TOut>(Uri requestUri, string jsonString)
{
    try
    {
        using (var content = new StringContent(jsonString, Encoding.UTF8, "application/json"))
        using (var response = await httpClient.PostAsync(requestUri, content)) // This throws HttpRequestException
        {
            // Handle response
        }
    }
    catch (HttpRequestException ex)
    {
        // This should never be hit, but unfortunately is
    }
}
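Here is some additional information, although I am not sure whether it is relevant.
The HttpClient is registered as transient in DI, so there are 10 instances of it per unit of work.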
[...] HttpRequestException [...] HttpRequestException).
[...] WebException).
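"About half of them are caught and retried by Polly. The other half, however, ends up in my try-catch block."
This can happen when some of your requests run out of retries. In other words, some requests do not succeed within 6 attempts (5 retries plus 1 initial attempt); once the last retry fails, the policy rethrows the exception to the caller.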
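The attempt arithmetic can be illustrated with a minimal sketch. It assumes the Polly 7.x package and is not the asker's setup: the RetryExhaustionDemo class, the FailAsync helper, and the zero wait between retries are hypothetical and only keep the demo short.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

public static class RetryExhaustionDemo
{
    private static int attempts;

    // Hypothetical stand-in for a request that always fails.
    private static async Task FailAsync()
    {
        attempts++;
        await Task.Yield();
        throw new HttpRequestException("simulated failure");
    }

    public static async Task Main()
    {
        // 5 retries; the wait is zero only to keep the demo fast (assumption).
        var retryPolicy = Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(5, _ => TimeSpan.Zero);

        try
        {
            await retryPolicy.ExecuteAsync(() => FailAsync());
        }
        catch (HttpRequestException)
        {
            // Reached only after the initial attempt and all retries have failed.
            Console.WriteLine($"Exception reached the caller after {attempts} attempts.");
        }
    }
}
```

With a retry count of 5, the delegate runs 6 times before the exception surfaces in the catch block, which is the same situation as the catch block in MyClient being hit.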
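Whether the retries are being exhausted can easily be verified with either of the following two tools:
onRetry + context
Fallback + context
onRetry + context
The onRetry delegate receives the retryCount. So, to be able to correlate the separate log entries of the same request, you need some sort of correlation ID. The simplest way to get one can be coded like this: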
public static class ContextExtensions
{
    private const string Key = "CorrelationId";
    public static Context SetCorrelation(this Context context, Guid? id = null)
    {
        context[Key] = id ?? Guid.NewGuid();
        return context;
    }
    public static Guid? GetCorrelation(this Context context)
    {
        if (!context.TryGetValue(Key, out var id))
            return null;
        if (id is Guid correlation)
            return correlation;
        return null;
    }
}
Here is a simplified example:
The method to be executed
private async Task<string> Test()
{
    await Task.Delay(1000);
    throw new CustomException("");
}
The policy
var retryPolicy = Policy<string>
    .Handle<CustomException>()
    .WaitAndRetryAsync(5, _ => TimeSpan.FromSeconds(1),
        (result, delay, retryCount, context) =>
        {
            var id = context.GetCorrelation();
            Console.WriteLine($"{id} - #{retryCount} retry.");
        });
Usage
var context = new Context().SetCorrelation();
try
{
    await retryPolicy.ExecuteAsync(async (ctx) => await Test(), context);
}
catch (CustomException)
{
    Console.WriteLine($"{context.GetCorrelation()} - All retry has been failed.");
}
Sample output
3319cf18-5e31-40e0-8faf-1fba0517f80d - #1 retry.
3319cf18-5e31-40e0-8faf-1fba0517f80d - #2 retry.
3319cf18-5e31-40e0-8faf-1fba0517f80d - #3 retry.
3319cf18-5e31-40e0-8faf-1fba0517f80d - #4 retry.
3319cf18-5e31-40e0-8faf-1fba0517f80d - #5 retry.
3319cf18-5e31-40e0-8faf-1fba0517f80d - All retry has been failed.
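For a policy registered through AddPolicyHandler, as in the question, the context has to travel with the request. The following is a minimal sketch, assuming the Microsoft.Extensions.Http.Polly package and its SetPolicyExecutionContext extension on HttpRequestMessage. The CorrelatedClient class and PostWithCorrelationAsync method are hypothetical names, and the "CorrelationId" key is set inline so that the block does not depend on other code.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Polly;

// Hypothetical typed client; the name is not from the original post.
public class CorrelatedClient
{
    private const string Key = "CorrelationId";
    private readonly HttpClient httpClient;

    public CorrelatedClient(HttpClient httpClient) => this.httpClient = httpClient;

    public async Task<string> PostWithCorrelationAsync(Uri requestUri, string jsonString)
    {
        // One context per logical request; the policy handler reuses it for every attempt.
        var context = new Context();
        context[Key] = Guid.NewGuid();

        using (var request = new HttpRequestMessage(HttpMethod.Post, requestUri))
        {
            request.Content = new StringContent(jsonString, Encoding.UTF8, "application/json");

            // Attach the context so that onRetry receives it instead of a fresh, empty one.
            request.SetPolicyExecutionContext(context);

            try
            {
                using (var response = await httpClient.SendAsync(request))
                {
                    return await response.Content.ReadAsStringAsync();
                }
            }
            catch (HttpRequestException)
            {
                // Same ID as in the onRetry log entries, so exhausted retries can be told apart.
                Console.WriteLine($"{context[Key]} - exception reached the caller.");
                throw;
            }
        }
    }
}
```

If an exception logged here has no matching retry entries with the same ID, it bypassed the retry policy; if it has five, the retries were simply exhausted.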
Fallback
Here is a simplified example:
The policy
var fallbackPolicy = Policy<string>
    .Handle<CustomException>()
    .FallbackAsync(async (result, ctx, ct) =>
    {
        await Task.FromException<CustomException>(result.Exception);
        return result.Result; //it will never be executed << just to compile
    },
    (result, ctx) =>
    {
        Console.WriteLine($"{ctx.GetCorrelation()} - All retry has been failed.");
        return Task.CompletedTask;
    });
Usage
var context = new Context().SetCorrelation();
try
{
    var strategy = Policy.WrapAsync(fallbackPolicy, retryPolicy);
    await strategy.ExecuteAsync(async (ctx) => await Test(), context);
}
catch (CustomException)
{
    Console.WriteLine($"{context.GetCorrelation()} - All policies failed.");
}
Sample output
169a270e-acf7-45fd-8036-9bd1c034c5d6 - #1 retry.
169a270e-acf7-45fd-8036-9bd1c034c5d6 - #2 retry.
169a270e-acf7-45fd-8036-9bd1c034c5d6 - #3 retry.
169a270e-acf7-45fd-8036-9bd1c034c5d6 - #4 retry.
169a270e-acf7-45fd-8036-9bd1c034c5d6 - #5 retry.
169a270e-acf7-45fd-8036-9bd1c034c5d6 - All retry has been failed.
169a270e-acf7-45fd-8036-9bd1c034c5d6 - All policies failed.
// Manage connection errors
this.reconnectPolicy = Policy
    .Handle<SocketException>()
    .Or<IOException>() // TODO - IOException is not being caught and results in a 502 Bad Gateway error.
    .WaitAndRetryAsync(connectAttempts,
        (retryAttempt) => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
        (exception, timeSpan, retryCount, context) =>
        {
            var log = Log.ForContext("correlationId", context.CorrelationId);
            log.Warning(exception, "Lost connection to {serverName}. Retry {reconnectAttempt} in {reconnectWaitSeconds} seconds.", context["serverName"], retryCount, timeSpan.TotalSeconds);
        });
After the TCP listener service is restarted, the first request always fails and subsequent requests succeed.
// Use a policy to auto-reconnect the connection
var contextData = new Dictionary<string, object>()
{
    ["serverName"] = serverName
};
var policyResult = await this.reconnectPolicy.ExecuteAndCaptureAsync(
    async (context, innerCancellationToken) =>
    {
        // Use existing TCP connection if possible, otherwise make a new connection
        var networkStream = connection.GetTcpClient(false).GetStream();
        await networkStream.WriteAsync(data, 0, data.Length, innerCancellationToken);
        return networkStream;
    },
    contextData,
    cancellationToken);
// Interpret the policy result
if (policyResult.Outcome == OutcomeType.Successful)
{
    return policyResult.Result;
}
else
{
    throw policyResult.FinalException ?? new Exception($"Unhandled fault connecting to {serverName}: {policyResult.FaultType}.");
}
Stack trace:
Error Message [Unable to read data from the transport connection: An established connection was aborted by the software in your host machine.] There was a problem sending the HL7 message to the remote server.
System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine. ---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
at System.IO.Stream.<>c.<BeginEndReadAsync>b__43_0(Stream stream, ReadWriteParameters args, AsyncCallback callback, Object state)
at System.Threading.Tasks.TaskFactory`1.FromAsyncTrim[TInstance,TArgs](TInstance thisRef, TArgs args, Func`5 beginMethod, Func`3 endMethod)
at System.IO.Stream.BeginEndReadAsync(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.Stream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)