Speeding up WebClient calls inside a foreach loop


I am working on an ASP.NET MVC 5 web application, and I have the following method, which makes sequential WebClient() calls to a third-party application:

public async Task<List<Technology>> GetResource(int? filtertype)
{
    try
    {
        using (WebClient wc = new WebClient())
        {
            string url = currentURL + "resources?AUTHTOKEN=" + token;
            var json = await wc.DownloadStringTaskAsync(url);
            resourcesinfo = JsonConvert.DeserializeObject<ResourcesInfo>(json);
        }

        // For each resource, get its tag and add the tag to the list
        foreach (var c in resourcesinfo.operation.Details)
        {
            ResourceAccountListInfo resourceAccountListInfo = new ResourceAccountListInfo();
            using (WebClient wc = new WebClient())
            {
                string url = currentURL + "resources/" + c.RESOURCEID + "?AUTHTOKEN=" + token;
                string tempurl = url.Trim();
                var json = await wc.DownloadStringTaskAsync(tempurl);
                resourceAccountListInfo = JsonConvert.DeserializeObject<ResourceAccountListInfo>(json);
                AllTags.Add(resourceAccountListInfo.SingleOrDefault().CUSTOMFIELDVALUE.ToLower());
            }
        }
    }
    // (the catch block and the return statement are not shown in the original post)
}

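Currently the first WebClient call returns about 1,500 records, so the WebClient call inside the foreach loop runs 1,500 times, and the whole process takes about 20 minutes to complete. How can I improve this?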

c# asp.net parallel-processing task-parallel-library
1 Answer

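You need some way to run the calls to DownloadStringTaskAsync concurrently while throttling them. You can do this manually with a semaphore and Task.Run, or you can use the TPL Dataflow library: feed it all the URLs and specify the degree of parallelism you want. Dataflow blocks accept asynchronous delegates (unlike Parallel.For):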

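// Requires the System.Threading.Tasks.Dataflow package.
// Thing and DownloadAndProcessUrl are not defined in this answer; substitute your own type and helper.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

private static async Task<Thing[]> ProcessAllUrls(string[] urls)
{
    // At most 20 URLs are downloaded and processed at the same time
    var workBlock = new TransformBlock<string, Thing>(
        async url => await DownloadAndProcessUrl(url),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20 }
        );

    var outputBlock = new BufferBlock<Thing>();

    using (workBlock.LinkTo(outputBlock, new DataflowLinkOptions { PropagateCompletion = true }))
    {
        foreach (var url in urls)
        {
            workBlock.Post(url);
        }

        // Signal that no more input is going into workBlock
        workBlock.Complete();

        // Wait for workBlock to pump all data into outputBlock
        await workBlock.Completion;

        // TryReceiveAll returns false and leaves the list null when nothing was
        // produced (for example, when urls is empty), so guard against that case
        IList<Thing> finalResult;
        return outputBlock.TryReceiveAll(out finalResult)
            ? finalResult.ToArray()
            : new Thing[0];
    }
}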
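
For comparison, the manual semaphore approach mentioned above could look roughly like the following. This is a minimal sketch, not part of the original answer: the class name ThrottledRunner and the method RunThrottledAsync are hypothetical, and it uses SemaphoreSlim together with Task.WhenAll rather than Task.Run, on the assumption that the per-item work is mostly I/O-bound.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (name and shape are assumptions, not from the answer).
public static class ThrottledRunner
{
    // Runs 'work' once per input with at most 'maxConcurrency' calls in flight.
    // Results come back in the same order as the inputs.
    // 'maxConcurrency' is expected to be at least 1.
    public static async Task<TResult[]> RunThrottledAsync<TInput, TResult>(
        IEnumerable<TInput> inputs,
        Func<TInput, Task<TResult>> work,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = inputs.Select(async input =>
            {
                await gate.WaitAsync();      // wait for a free slot
                try
                {
                    return await work(input);
                }
                finally
                {
                    gate.Release();          // free the slot even if 'work' throws
                }
            }).ToList();                     // ToList starts every task immediately

            // Task.WhenAll waits for all tasks, so the semaphore is not
            // disposed while any of them is still running
            return await Task.WhenAll(tasks);
        }
    }
}
```

With the code from this answer, the call would be something like `await ThrottledRunner.RunThrottledAsync(urls, DownloadAndProcessUrl, 20)`, where DownloadAndProcessUrl is the same undefined helper used in the Dataflow version.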

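You do need to be careful about performing parallel operations inside a web server process. Although the WebClient calls are truly asynchronous as far as the CPU is concerned, the work you do to deserialize each response runs on thread-pool threads, which means that during that time it competes for CPU with ASP.NET's own request handling.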
