Speeding up Python 3 with multithreading

Question (votes: 1, answers: 1)

I'm writing a proxy checker, but it takes a long time to run because there are so many proxies to check.

import requests

# proxy_api is not defined in this snippet; it is assumed to be the URL of an
# endpoint that returns one "host:port" proxy per line.

def proxy():
    lives = []

    def fetch_proxy():
        res = requests.get(proxy_api)
        return res.text.splitlines()

    allproxy = fetch_proxy()

    # Proxies are checked one at a time, which is why the run is slow
    for candidate in allproxy:
        try:
            proxyDictChk = {
                "https": "https://" + candidate,
                "http": "http://" + candidate,
            }
            requests.get("http://httpbin.org/ip", proxies=proxyDictChk, timeout=3)
            print("Proxy is Working")
            lives.append(candidate)
        except Exception:
            print("Proxy Dead")
    return lives

print(proxy())

I'd like to know how to use multithreading here to speed it up.

PS. Thanks in advance.

python-3.x multithreading python-requests python-multithreading
1 answer
Votes: 0

The Python documentation provides a good example: https://docs.python.org/3/library/concurrent.futures.html

import concurrent.futures

# allproxy is the list of "host:port" strings from fetch_proxy() in the question
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the checks and map each future to its proxy address.
    # check_proxy takes a single argument, so only the address is passed.
    future_to_addr = {executor.submit(check_proxy, addr): addr for addr in allproxy}
    for future in concurrent.futures.as_completed(future_to_addr):
        addr = future_to_addr[future]
        try:
            is_valid = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (addr, exc))
        else:
            print('%s is valid: %s' % (addr, is_valid))

So you only need to define the check_proxy function.

import requests

def check_proxy(proxy):
    try:
        proxyDictChk = {
            "https": "https://" + proxy,
            "http": "http://" + proxy,
        }
        # Raises on connection errors and timeouts; the response body is not inspected
        requests.get("http://httpbin.org/ip", proxies=proxyDictChk, timeout=3)
        print("Proxy is Working")
        return True
    except Exception:
        print("Proxies Dead!")
        return False

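Essentially, you create an executor and submit a function that does the work you want. Then, as each call completes, you use its future to get the function's result.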
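As a minimal sketch of that pattern, the helper below collects the working proxies into a list, mirroring the `lives` list in the question. The names `find_live_proxies` and `fake_check` and the `max_workers=20` default are assumptions made for illustration; `fake_check` is a stand-in that makes no network calls, and in practice you would pass `check_proxy` instead.

```python
import concurrent.futures


def find_live_proxies(proxies, check, max_workers=20):
    """Run check(proxy) in a thread pool; return the proxies for which it returned a truthy value."""
    lives = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the proxy it is checking
        future_to_proxy = {executor.submit(check, p): p for p in proxies}
        for future in concurrent.futures.as_completed(future_to_proxy):
            proxy = future_to_proxy[future]
            try:
                if future.result():
                    lives.append(proxy)
            except Exception as exc:
                # A check that raised is reported and treated as a dead proxy
                print("%r generated an exception: %s" % (proxy, exc))
    return lives


def fake_check(proxy):
    # Hypothetical stand-in for check_proxy: "works" only on port 8080
    return proxy.endswith(":8080")


if __name__ == "__main__":
    sample = ["10.0.0.1:8080", "10.0.0.2:3128", "10.0.0.3:8080"]
    # Results arrive in completion order, so sort for a stable display
    print(sorted(find_live_proxies(sample, fake_check)))
```

Because the checks spend most of their time waiting on the network, a pool larger than the 5 workers in the documentation example is usually reasonable; the right size depends on your machine and on how many connections the target tolerates.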

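Also, because the future re-raises any exception from the function when you call result(), you don't have to handle exceptions inside the function.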

def check_proxy(proxy):
    proxyDictChk = {
        "https": "https://" + proxy,
        "http": "http://" + proxy,
    }
    # Any connection error or timeout propagates to future.result()
    requests.get("http://httpbin.org/ip", proxies=proxyDictChk, timeout=3)
    return True

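The exception can now be handled where the future's result is read. You could also change the return value to something more meaningful.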
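One way to do both, sketched under assumptions: the check function returns a value on success and raises on failure, and the caller turns either outcome into a small record. `ProxyResult`, `collect_results` and `stub_check` are hypothetical names introduced here; `stub_check` simulates a check without touching the network.

```python
import concurrent.futures
from typing import NamedTuple


class ProxyResult(NamedTuple):
    proxy: str
    ok: bool
    detail: str  # value returned by the check, or the error message


def collect_results(proxies, check, max_workers=20):
    """Run check(proxy) in a thread pool and record success or failure per proxy."""
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_proxy = {executor.submit(check, p): p for p in proxies}
        for future in concurrent.futures.as_completed(future_to_proxy):
            proxy = future_to_proxy[future]
            try:
                value = future.result()  # re-raises whatever check() raised
            except Exception as exc:
                results.append(ProxyResult(proxy, False, str(exc)))
            else:
                results.append(ProxyResult(proxy, True, str(value)))
    return results


def stub_check(proxy):
    # Hypothetical stand-in for check_proxy: fails for names starting with "bad"
    if proxy.startswith("bad"):
        raise ConnectionError("refused")
    return "203.0.113.7"


if __name__ == "__main__":
    for result in sorted(collect_results(["good:1", "bad:2"], stub_check)):
        print(result)
```

With the real `check_proxy`, a plausible return value would be the address reported by httpbin, for example `res.json()["origin"]`, so that you can confirm the request really went through the proxy.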
