How can I use requests in asyncio?

Question

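I want to run HTTP requests in parallel in asyncio, but I find that python-requests blocks the asyncio event loop. I found aiohttp, but it could not make HTTP requests through an HTTP proxy.

So I would like to know whether there is a way to make asynchronous HTTP requests with the help of asyncio.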

Answer 1

To use requests (or any other blocking library) with asyncio, you can use BaseEventLoop.run_in_executor to run a function in another thread and yield from it to get the result. For example:

import asyncio
import requests

# Python 3.4-style generator coroutine; asyncio.coroutine was removed in Python 3.11.
@asyncio.coroutine
def main():
    loop = asyncio.get_event_loop()
    future1 = loop.run_in_executor(None, requests.get, 'http://www.google.com')
    future2 = loop.run_in_executor(None, requests.get, 'http://www.google.co.uk')
    response1 = yield from future1
    response2 = yield from future2
    print(response1.text)
    print(response2.text)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

This fetches both responses in parallel.

With Python 3.5 you can use the new await/async syntax:

import asyncio
import requests

async def main():
    loop = asyncio.get_event_loop()
    future1 = loop.run_in_executor(None, requests.get, 'http://www.google.com')
    future2 = loop.run_in_executor(None, requests.get, 'http://www.google.co.uk')
    response1 = await future1
    response2 = await future2
    print(response1.text)
    print(response2.text)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

See PEP 0492 for more information.
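
One practical detail when using this approach: run_in_executor forwards only positional arguments to the callable, so keyword arguments (for requests, things like timeout or proxies) have to be bound first, for example with functools.partial. The sketch below illustrates this with a hypothetical blocking_fetch function, a stand-in that is not part of requests, so that it runs without network access. With requests you would bind requests.get in the same way.

```python
import asyncio
import functools
import time


def blocking_fetch(url, *, timeout=1.0):
    """Hypothetical stand-in for a blocking call such as requests.get."""
    time.sleep(0.3)  # simulate network latency
    return f"{url} (timeout={timeout})"


async def fetch_all(urls):
    loop = asyncio.get_running_loop()
    # run_in_executor passes only positional arguments to the callable,
    # so keyword arguments are bound up front with functools.partial.
    futures = [
        loop.run_in_executor(None, functools.partial(blocking_fetch, url, timeout=5.0))
        for url in urls
    ]
    # gather awaits all futures and returns the results in input order.
    return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["http://a.example", "http://b.example"])))
```

Both calls run in worker threads of the default executor, so the total time is close to one call rather than the sum of both.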

Answer 2

aiohttp can already be used with an HTTP proxy:

import asyncio
import aiohttp


# Python 3.4-style generator coroutine; asyncio.coroutine was removed in Python 3.11.
@asyncio.coroutine
def do_request():
    proxy_url = 'http://localhost:8118'  # your proxy address
    response = yield from aiohttp.request(
        'GET', 'http://google.com',
        proxy=proxy_url,
    )
    return response

loop = asyncio.get_event_loop()
loop.run_until_complete(do_request())

Answer 3

The answers above still use the old Python 3.4-style coroutines. Here is what you would write with Python 3.5+.

aiohttp now supports HTTP proxies.

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'http://python.org',
        'https://google.com',
        'http://yifei.me',
    ]
    tasks = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        htmls = await asyncio.gather(*tasks)
        for html in htmls:
            print(html[:100])

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
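
Since the original question is about proxies, here is a minimal sketch of passing a proxy in the async/await style. It assumes a reasonably recent aiohttp release in which session.get accepts a proxy keyword argument. The proxy address is a placeholder taken from the earlier answer, and fetch_via_proxy is a name chosen only for illustration.

```python
import asyncio
import aiohttp

PROXY_URL = "http://localhost:8118"  # placeholder: replace with your proxy address


async def fetch_via_proxy(session, url, proxy_url=PROXY_URL):
    # The proxy is supplied per request through the `proxy` keyword argument.
    async with session.get(url, proxy=proxy_url) as response:
        return await response.text()


async def main():
    async with aiohttp.ClientSession() as session:
        html = await fetch_via_proxy(session, "http://python.org")
        print(html[:100])

# To run it: asyncio.run(main())
```

The entry point is left as a comment because the call only succeeds when a proxy is actually listening at the configured address.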

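There is also the httpx library, which is a drop-in replacement for requests with async/await support. However, httpx is somewhat slower than aiohttp.

Another option is curl_cffi, which can impersonate browsers' JA3 and HTTP/2 fingerprints.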

Answer 4

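Requests does not currently support asyncio and there are no plans to provide such support. It is likely that you could implement a custom "Transport Adapter" (as discussed here) that knows how to use asyncio.

If I find myself with some time it is something I might actually look into, but I can't promise anything.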

Answer 5

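There is a good example of async/await loops combined with threading in an article by Pimin Konstantin Kefaloukos, "Easy parallel HTTP requests with Python and asyncio":

To minimize the total completion time, we could increase the size of the thread pool to match the number of requests we have to make. Luckily, this is easy to do, as we will see next. The code listing below is an example of how to make twenty asynchronous HTTP requests with a thread pool of twenty worker threads: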

# Example 3: asynchronous requests with larger thread pool
import asyncio
import concurrent.futures
import requests

async def main():

    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:

        loop = asyncio.get_event_loop()
        futures = [
            loop.run_in_executor(
                executor, 
                requests.get, 
                'http://example.org/'
            )
            for i in range(20)
        ]
        for response in await asyncio.gather(*futures):
            pass


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
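
To make the effect of the pool size visible, here is a rough sketch that times the same batch of blocking calls on a small and a large pool. The blocking_call function is a hypothetical stand-in for requests.get that just sleeps, and timed_batch is a helper name invented for this illustration. Real timings will depend on the network and the server.

```python
import asyncio
import concurrent.futures
import time


def blocking_call(delay):
    """Hypothetical stand-in for requests.get: blocks the calling thread."""
    time.sleep(delay)
    return delay


async def timed_batch(n_tasks, max_workers, delay=0.1):
    """Run n_tasks blocking calls on max_workers threads; return elapsed seconds."""
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            loop.run_in_executor(executor, blocking_call, delay)
            for _ in range(n_tasks)
        ]
        await asyncio.gather(*futures)
    return time.perf_counter() - start


if __name__ == "__main__":
    small = asyncio.run(timed_batch(n_tasks=8, max_workers=2))
    large = asyncio.run(timed_batch(n_tasks=8, max_workers=8))
    print(f"2 workers: {small:.2f}s, 8 workers: {large:.2f}s")
```

With two workers the eight calls run in four rounds, while with eight workers they all overlap, which is the behavior the quoted article relies on.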

Answer 6

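Considering that aiohttp is a fully featured web framework, I would suggest using something more lightweight such as httpx (https://www.python-httpx.org/), which supports async requests. It has an almost identical API to requests: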

>>> async with httpx.AsyncClient() as client:
...     r = await client.get('https://www.example.com/')
...
>>> r
<Response [200 OK]>

Answer 7

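Disclaimer:

The following code creates a separate thread for each function call.

This can be useful in some cases because it is simpler to use. But be aware that it is not truly async: it gives the illusion of async by using multiple threads, even though the decorator's name suggests otherwise.

To make any function non-blocking, simply copy the decorator and decorate the function, passing a callback function as the parameter. The callback function will receive the data returned by the decorated function.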

import asyncio
import requests


def run_async(callback):
    def inner(func):
        def wrapper(*args, **kwargs):
            def __exec():
                out = func(*args, **kwargs)
                callback(out)
                return out

            return asyncio.get_event_loop().run_in_executor(None, __exec)

        return wrapper

    return inner


def _callback(*args):
    print(args)


# Must provide a callback function, callback func will be executed after the func completes execution !!
@run_async(_callback)
def get(url):
    return requests.get(url)


get("https://google.com")
print("Non blocking code ran !!")
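
Because the wrapper returns the future produced by run_in_executor, the result can also be awaited from inside a coroutine instead of only being delivered to the callback. Below is a minimal sketch of that usage. The decorator is a lightly adapted copy that uses functools.wraps and asyncio.get_running_loop, and slow_square is a hypothetical blocking function standing in for requests.get so that the example runs offline.

```python
import asyncio
import functools
import time


def run_async(callback):
    def inner(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def _exec():
                out = func(*args, **kwargs)
                callback(out)  # runs in the worker thread, not the event loop thread
                return out

            # Requires a running event loop, so call this from inside a coroutine.
            return asyncio.get_running_loop().run_in_executor(None, _exec)

        return wrapper

    return inner


seen = []  # filled by the callback


@run_async(seen.append)
def slow_square(x):
    """Hypothetical blocking function standing in for requests.get."""
    time.sleep(0.1)
    return x * x


async def main():
    # Both calls start right away in worker threads; awaiting collects the results.
    return await asyncio.gather(slow_square(3), slow_square(4))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that the callback still executes in the worker thread, so it should not touch objects that are only safe to use from the event loop thread.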

Answer 8

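python-requests does not natively support asyncio yet. Going with a library that natively supports asyncio, such as httpx, would be the most beneficial approach.

However, if your use case relies heavily on python-requests, you can wrap the synchronous calls with asyncio.to_thread and asyncio.gather and follow the usual asyncio programming patterns.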

import asyncio
import requests

async def main():
    # Pass the function and its argument separately so the call runs in a worker thread.
    res = await asyncio.gather(asyncio.to_thread(requests.get, "YOUR_URL"))

if __name__ == "__main__":
    asyncio.run(main())

For concurrency/parallelization of the network requests:

import asyncio
import requests

urls = ["URL_1", "URL_2"]

async def make_request(url: str):
    # requests.get runs in a worker thread; the event loop stays free meanwhile.
    response = await asyncio.to_thread(requests.get, url)
    return response

async def main():
    # Unpack the generator so gather receives each coroutine as its own argument.
    responses = await asyncio.gather(*(make_request(url) for url in urls))
    for response in responses:
        print(response)

if __name__ == "__main__":
    asyncio.run(main())
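
Two details may be worth adding here. Unlike run_in_executor, asyncio.to_thread (available from Python 3.9) forwards keyword arguments to the function, and asyncio.gather with return_exceptions=True keeps one failed request from discarding the results of the others. The sketch below shows both, using a hypothetical flaky_fetch stand-in instead of requests.get so that it runs without network access.

```python
import asyncio
import time


def flaky_fetch(url, *, timeout=1.0):
    """Hypothetical stand-in for requests.get(url, timeout=...)."""
    time.sleep(0.1)
    if "bad" in url:
        raise ConnectionError(f"cannot reach {url}")
    return f"OK {url}"


async def fetch_all(urls):
    # to_thread forwards both positional and keyword arguments to the function.
    calls = [asyncio.to_thread(flaky_fetch, url, timeout=5.0) for url in urls]
    # With return_exceptions=True, failures are returned in place instead of raised.
    return await asyncio.gather(*calls, return_exceptions=True)


if __name__ == "__main__":
    for outcome in asyncio.run(fetch_all(["http://good.example", "http://bad.example"])):
        print(repr(outcome))
```

The caller can then check each entry with isinstance(outcome, Exception) to separate successful responses from failures.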
