System.Net.Http.HttpRequestException when downloading multiple files from Azure Data Lake V2

Question (votes: 1, answers: 1)

I am downloading a large number of files (more than 1,000) from Azure Data Lake V2, and I keep getting this exception:

The SSL connection could not be established, see inner exception. 
<--- Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.. 
<--- An existing connection was forcibly closed by the remote host.

Stacktrace:

System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
   at System.Net.FixedSizeReader.ReadPacketAsync(Stream transport, AsyncProtocolRequest request)
   at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)

Code:

var downloadTasks = job.Files.AsParallel().Select(x => Download(x));
await Task.WhenAll(downloadTasks);

private async Task Download(DownloadableFile file)
{
    try
    {
        var options = new BlobRequestOptions
        {
            ParallelOperationThreadCount = 8,
            DisableContentMD5Validation = true,
            StoreBlobContentMD5 = false
        };
        var xzBlob = await _cloudBlobFileService.GetBlockBlobReference(file.FilePath);
        await xzBlob.DownloadToFileAsync(file.LocalFilePath, FileMode.Create, null, options, null);
    }
    catch (Exception e)
    {
         _log.LogCritical(e, "Error downloading " + file.FilePath);
    }
}

I have also added this:

ServicePointManager.DefaultConnectionLimit = Environment.ProcessorCount * 8;
ServicePointManager.Expect100Continue = false;

These two lines are in the Main method in Program.cs of the WebJob.

I am using .NET Core 3.1 and WindowsAzure.Storage 9.3.3.

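We used to have a Blob Storage configuration without Data Lake, and the errors only started after we switched to Data Lake. They don't affect the application much, because the skipped downloads are retried later. Still, it would be nice to know what causes them.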

azure .net-core azure-data-lake
1 answer
2 votes

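As a start, you could try the new storage SDK, which reached GA in November, although I can't promise it will fix the problem. It is a complete rewrite.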

It is impossible to pinpoint the cause from the error message alone, but there are a few things to look at:

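  1. Network errors. It is interesting that the downloads worked consistently against your old Blob Storage account, but this is still by far the most likely cause. Increasing the timeouts may reduce the rate of network errors, and retry logic will help you recover from them.
  2. Using unbounded parallelism is not recommended. ParallelOperationThreadCount applies to uploads, not downloads, so it does not throttle requests in this case. The default limit on server-side connections in .NET is 10, and increasing it is recommended when using .NET Core. If you hit the same blob or partition too many times concurrently, you can start running into the concurrent connection limits in Storage.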
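
The two points above (retrying transient failures and capping the number of concurrent downloads) can be sketched roughly as follows. This is a minimal illustration, not the storage SDK's own retry mechanism: `DownloadThrottle`, `RetryAsync` and `ForEachAsync` are hypothetical helper names, and catching every exception type is a simplification that real code would probably narrow to transient errors.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not part of WindowsAzure.Storage or any Azure SDK.
public static class DownloadThrottle
{
    // Runs `operation`, retrying with exponential backoff on failure.
    // After `maxAttempts` failed attempts the last exception propagates.
    public static async Task RetryAsync(
        Func<Task> operation, int maxAttempts, TimeSpan baseDelay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return;
            }
            // Simplification: retries on any exception. Real code would
            // likely filter for transient ones (e.g. IOException).
            catch (Exception) when (attempt < maxAttempts)
            {
                // Wait baseDelay, then 2x, 4x, ... before the next attempt.
                double factor = Math.Pow(2, attempt - 1);
                await Task.Delay(TimeSpan.FromMilliseconds(
                    baseDelay.TotalMilliseconds * factor));
            }
        }
    }

    // Runs `action` for every item, with at most `maxConcurrency`
    // invocations in flight at any time.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> action)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();   // wait for a free slot
                try { await action(item); }
                finally { gate.Release(); } // free the slot even on failure
            }).ToList();

            await Task.WhenAll(tasks);
        }
    }
}
```

With the code from the question, a call might look like `await DownloadThrottle.ForEachAsync(job.Files, 16, f => DownloadThrottle.RetryAsync(() => Download(f), 3, TimeSpan.FromSeconds(1)));`. The values 16, 3 and one second are arbitrary placeholders. Note that `Download` as posted catches and logs every exception, so the retry would only take effect if that catch block rethrew or were removed.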