How can I send a very large number of requests to different websites and get the responses?


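How can I send a large number of requests to different sites? I have a database of sites (1kk, i.e. about one million) and need to check whether they are still alive. As a rough estimate, doing this in chunks with grequests (Python), where 100 requests in 10 threads take about 128 seconds, would take 12.5 days. That is too long for me, and I am sure it can be done faster. Can you tell me what I could use in this case? I only collect information from each site's home page.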

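Here is my code; I would like to improve it somehow. What would you suggest? I tried putting every request in its own thread, but it feels as if something is blocking it. I will use proxies so that my IP does not get blocked for sending too many requests. Any help is appreciated!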

# grequests monkey-patches via gevent, so import it before other networking modules
import grequests
import re
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

from colorama import Fore, Style

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"}
CHECK_PATTERN = re.compile(r'(pat1|pat2)', re.IGNORECASE)

def start_parting(urls: list, chunk_num, chunks):
    """Fetch one chunk of URLs with grequests and report pattern matches."""
    if len(urls) > 0:
        chunk_num += 1  # enumerate() is zero-based; display chunks from 1
        print(f'Chunk [{Fore.CYAN}{chunk_num}/{chunks}{Style.RESET_ALL}] started! Length: {len(urls)}')
        # Each line may carry extra fields after the URL; keep only the first token
        rs = [grequests.get(url.split(' ')[0].strip(), headers=HEADERS, timeout=10) for url in urls]
        responses = grequests.map(rs)
        for response in responses:
            # grequests.map returns None for requests that failed
            if response is None:
                continue
            if response.status_code == 200:
                match = CHECK_PATTERN.search(response.text)
                if match:
                    site = match.group(1)
                    print(f'Site {site}')
        print(f'Chunk [{Fore.LIGHTCYAN_EX}{chunk_num}/{chunks}{Style.RESET_ALL}] ended!')

def test_logins_for_file(file, num_threads=10, chunk_size=100):
    """Check every URL in the given file, chunk by chunk, on a thread pool."""
    print('Start check!')
    urls = file.readlines()
    parts = [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]
    t = time.time()  # start timing before any work is submitted
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        finals = [executor.submit(start_parting, part, part_num, len(parts)) for part_num, part in enumerate(parts)]
        for final in as_completed(finals):
            final.result()  # re-raises any exception from the worker instead of hiding it
    print(f'Elapsed: {time.time() - t}')
python python-requests multiprocessing grequests
1 Answer
0 votes

What if you only test whether the required port is open?

Try reading this:

How to check if a network port is open?
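
To illustrate the suggestion above, here is a minimal sketch of a bulk port check using only the standard library. It is not taken from the linked question. The names `host_and_port`, `is_port_open` and `check_sites`, as well as the defaults of 500 concurrent attempts and a 5 second timeout, are assumptions chosen for illustration. It assumes a bare TCP connect is an acceptable definition of "alive".

```python
import asyncio
from urllib.parse import urlsplit


def host_and_port(url):
    """Split a URL into (host, port); default to 443 for https and 80 otherwise."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


async def is_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError, UnicodeError):
        # Refused, unreachable, DNS failure, timeout, or an unencodable hostname
        return False
    writer.close()
    try:
        await writer.wait_closed()
    except OSError:
        pass
    return True


async def check_sites(urls, concurrency=500, timeout=5.0):
    """Check many URLs, with at most `concurrency` connection attempts in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def check(url):
        try:
            host, port = host_and_port(url)
        except ValueError:  # malformed URL or port
            return url, False
        if host is None:
            return url, False
        async with semaphore:
            return url, await is_port_open(host, port, timeout)

    return dict(await asyncio.gather(*(check(u) for u in urls)))


if __name__ == "__main__":
    # Hypothetical input; replace with a batch of URLs from the database
    print(asyncio.run(check_sites(["https://example.com"])))
```

An open port only shows that something is listening; it does not confirm a 200 response or a pattern match in the page. One possible use is as a cheap first pass that drops dead hosts, with the full HTTP request sent only to the sites that remain. For a million URLs it is probably better to feed `check_sites` in batches than to create all tasks at once.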
