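This is a bit frustrating: it's one of those cases where everything runs fine on localhost, but once deployed to IIS, threading exceptions start creeping in.
Anyway, I'm using Hangfire v1.7.11 with SQL Server as the backing store.
The job in question is set up with: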
await Task.Run(() =>
    _jobClient.AddOrUpdate<ILiveDataService>(
        notification.BmUnitGuidId.ToString(),
        d => d.UpdateBmUnit(notification.BmUnitGuidId, CancellationToken.None),
        "* * * * *"),
    cancellationToken);
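The important part is that CancellationToken.None is passed in, as per the Hangfire documentation.
The ILiveDataService uses an HttpClient configured through HttpClientFactory in my startup.cs file. I've just called it IDummyClient here. The registration does the generic setup of the base URI and the authentication header, and there is also a transient HTTP error policy to handle flaky connections.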
services.AddHttpClient<IDummyClient, DummyClient>(
    c =>
    {
        c.Timeout = TimeSpan.FromMilliseconds(500);
        c.BaseAddress = new Uri(Configuration["DummyClient:Url"]);
        var authInfo = Convert.ToBase64String(Encoding.GetEncoding("ISO-8859-1").GetBytes(Configuration["Dummy:User"] + ":" + Configuration["Dummy:Password"]));
        c.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", authInfo);
    })
    .AddTransientHttpErrorPolicy(builder => builder.WaitAndRetryAsync(new[]
    {
        TimeSpan.FromSeconds(1),
        TimeSpan.FromSeconds(5),
        TimeSpan.FromSeconds(10)
    }));
The method being called in DummyClient is:
public async Task<KeyValuePair<DateTime, double?>> GetValues(string name, CancellationToken cancellationToken)
{
    var dateFrom = RoundUp(this.DateTimeUtc, TimeSpan.FromMinutes(1));
    using var response = await this._httpClient.GetAsync(
        $"{paramterisedurl}",
        HttpCompletionOption.ResponseHeadersRead,
        cancellationToken);
    var stream = await response.Content.ReadAsStreamAsync();
    if (response.IsSuccessStatusCode)
    {
        var xmlDocument = new XmlDocument();
        xmlDocument.Load(stream);
        // Process horrendous XML response - it's too ugly to share :-)
        return new KeyValuePair<DateTime, double?>(default, default);
    }
    var content = await StreamToStringAsync(stream);
    throw new ApiException
    {
        StatusCode = (int)response.StatusCode,
        Content = content
    };
}
From the exception details in Hangfire I can see that the job died during the GetAsync() call. The Hangfire trace is as follows:
System.Threading.Tasks.TaskCanceledException
The operation was canceled.
System.Threading.Tasks.TaskCanceledException: The operation was canceled.
at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
at Polly.AsyncPolicy`1.ExecuteAsync(Func`3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
at Microsoft.Extensions.Http.PolicyHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.FinishSendAsyncUnbuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
at Infrastructure.Sentinel.SentinelClient.GetBoaPhysicalNotification(String bmUnitName, CancellationToken cancellationToken) in /home/vsts/work/1/s/src/Infrastructure/Sentinel/SentinelClient.cs:line 97
at ApplicationCore.ApplicationServices.LiveDataService.LiveDataService.UpdateBmUnit(Guid bmUnitGuidId, CancellationToken cancellationToken) in /home/vsts/work/1/s/src/ApplicationCore/ApplicationServices/LiveDataService/LiveDataService.cs:line 81
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
One unusual thing I did notice, though, is that the job details shown by Hangfire list the CancellationToken as null:
// Job ID: #140
using ApplicationCore.ApplicationServices.LiveDataService;
var liveDataService = Activate<ILiveDataService>();
await liveDataService.UpdateBmUnit(
    FromJson<Guid>("\"fa832ce4-b2a5-47d1-9b04-6ffb52fa0f30\""),
    null);
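I imagine there are a number of things here that could be causing the failure, but fundamentally it looks as though the CancellationToken is not being passed into the method correctly, and as soon as it is checked in ConnectAsync, things unravel.
As I said earlier, this doesn't happen on localhost... only once deployed.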
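Fundamentally, the problem was that the production server was not authorised to make the same calls as the localhost machine.
However, the exception thrown in the client was masked by a generic exception, which made it a little tricky to diagnose.
In the end, the way to diagnose it was to log on to the production box and try running the basic HTTP request with curl.
The second lesson is to assume nothing :-)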
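
For anyone hitting the same masked exception, here is a minimal sketch of one way to make it less opaque. It rests on the fact that HttpClient surfaces its own timeout as a TaskCanceledException even when the caller's token was never cancelled, so checking the caller's token tells the two cases apart. The CancellationDiagnostics class and its Describe and ProbeAsync methods are names I made up for illustration; they are not part of Hangfire, Polly or the code above, and the wording of the returned strings is arbitrary.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original code base.
public static class CancellationDiagnostics
{
    // Explains a cancellation by checking whether the caller actually asked for it.
    public static string Describe(OperationCanceledException ex, CancellationToken callerToken)
    {
        if (callerToken.IsCancellationRequested)
        {
            return "canceled by caller";
        }

        // The caller's token is untouched, so something else canceled the request.
        // Assumption: the usual cause is HttpClient.Timeout elapsing, for example
        // while a blocked outbound connection is still trying to connect.
        return "not canceled by caller; likely HttpClient.Timeout (" + ex.Message + ")";
    }

    // Issues a GET and reports the outcome as text instead of throwing.
    public static async Task<string> ProbeAsync(HttpClient client, string url, CancellationToken callerToken)
    {
        try
        {
            using var response = await client.GetAsync(
                url,
                HttpCompletionOption.ResponseHeadersRead,
                callerToken);
            return "HTTP " + (int)response.StatusCode;
        }
        catch (OperationCanceledException ex)
        {
            // TaskCanceledException derives from OperationCanceledException.
            return Describe(ex, callerToken);
        }
        catch (HttpRequestException ex)
        {
            return "request failed: " + ex.Message;
        }
    }
}
```

Logging the result of something like ProbeAsync from the deployed application gives roughly the same signal as the curl check, without needing to log on to the box. It still can't tell you why the connection is blocked, only that the cancellation did not come from your own token.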