Python requests with multithreading

Question

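I have spent the past two days trying to build a scraper with multithreading support, and I still cannot get it to work. At first I tried the regular multithreading approach with the threading module, but it was no faster than using a single thread. Later I learned that requests is blocking, so the multithreading approach was not really working. I kept researching and found out about grequests and gevent. Now I am running tests with gevent, and it is still no faster than using a single thread. Is my code wrong?

Here is the relevant part of my class: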

import gevent.monkey
from gevent.pool import Pool
import requests

gevent.monkey.patch_all()

class Test:
    def __init__(self):
        self.session = requests.Session()
        self.pool = Pool(20)
        self.urls = [...urls...]

    def fetch(self, url):

        try:
            response = self.session.get(url, headers=self.headers)
        except:
            self.logger.error('Problem: ', id, exc_info=True)

        self.doSomething(response)

    def async(self):
        for url in self.urls:
            self.pool.spawn( self.fetch, url )

        self.pool.join()

test = Test()
test.async()

Answer 1

Install the grequests module, which works with gevent (requests is not designed for async):

pip install grequests

Then change the code to something like this:

import grequests

class Test:
    def __init__(self):
        self.urls = [
            'http://www.example.com',
            'http://www.google.com', 
            'http://www.yahoo.com',
            'http://www.stackoverflow.com/',
            'http://www.reddit.com/'
        ]

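    def exception(self, request, exception):
        # Called by grequests.map for each request that raised an exception.
        print("Problem: {}: {}".format(request.url, exception))

    def run_async(self):
        # `async` is a reserved keyword since Python 3.7, hence the name run_async.
        # size=5 caps the number of requests in flight at once.
        results = grequests.map((grequests.get(u) for u in self.urls), exception_handler=self.exception, size=5)
        print(results)

test = Test()
test.run_async()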

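This is what the requests project officially recommends:

Blocking Or Non-Blocking?

With the default Transport Adapter in place, Requests does not provide any kind of non-blocking IO. The Response.content property will block until the entire response has been downloaded. If you require more granularity, the streaming features of the library (see Streaming Requests) allow you to retrieve smaller quantities of the response at a time. However, these calls will still block.

If you are concerned about the use of blocking IO, there are lots of projects out there that combine Requests with one of Python's asynchronicity frameworks. Two excellent examples are grequests and requests-futures.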
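
As a rough illustration of the kind of combination the quote refers to, blocking calls can also be pushed onto a pool of threads with the standard library's concurrent.futures, the mechanism requests-futures builds on. This is a minimal sketch, not the code of either project: fetch_all and its parameters are hypothetical names, and fetch stands for any blocking callable such as requests.get.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=5):
    """Call the blocking `fetch(url)` for every URL on a thread pool.

    Returns a dict mapping each URL to either the fetch result or the
    exception that the call raised. `fetch_all` is a hypothetical helper.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every URL up front; each worker thread blocks on its own call.
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Record the failure and keep processing the other URLs.
                results[url] = exc
    return results
```

With Requests installed, this could be called as fetch_all(urls, lambda u: requests.get(u, timeout=10)); the timeout value there is an arbitrary choice.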

Using this method gives me a noticeable performance increase with 10 URLs: 0.877s vs 3.852s with your original method.
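
A comparison of this kind can be reproduced with a small timing harness. The sketch below is an assumption about how such a measurement might be set up, not the script that produced the numbers above: slow_call simulates a blocking request with time.sleep, and timed, run_sequential and run_concurrent are hypothetical helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_call(item, delay=0.1):
    """Stand-in for a blocking HTTP request: sleep, then echo the input."""
    time.sleep(delay)
    return item


def run_sequential(items):
    # One call after another; total time is roughly len(items) * delay.
    return [slow_call(item) for item in items]


def run_concurrent(items, max_workers=10):
    # Executor.map runs the calls on worker threads and preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(slow_call, items))


def timed(func, *args):
    """Return (result, elapsed seconds) for func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


items = list(range(10))
_, sequential_seconds = timed(run_sequential, items)
_, concurrent_seconds = timed(run_concurrent, items)
print(f"sequential: {sequential_seconds:.3f}s, concurrent: {concurrent_seconds:.3f}s")
```

Replacing slow_call with a real request function would measure actual network time, which varies from run to run.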
