Unable to scrape friends on Facebook using aiohttp, but it works with requests

Question

I am trying to fetch a friends list, but when I check the result with print(Dump), the list is empty. This happens when I use aiohttp; when I use requests, it works.

This is the aiohttp code:

import aiohttp
import asyncio
import re

async def publik(userid, cookie, unit_cursor):
    headers = {
        'upgrade-insecure-requests': '1',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'host': 'm.facebook.com',
        'user-agent': 'Mozilla/5.0 (Linux; Android 5.0; SM-G900P Build/LRX21T; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/43.0.2357.121 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/35.0.0.48.273;]',
        'accept-language': 'id,en;q=0.9',
    }
    async with aiohttp.ClientSession(headers=headers, cookies={'cookie': cookie}) as session:
        async with session.get(f'https://m.facebook.com/profile.php?id={userid}&v=friends&unit_cursor={unit_cursor}') as response:
            text = await response.text()
            all_friends = re.findall('href="fb://profile/(.*?)">(.*?)<', text)
            for id_friends, name in all_friends:
                name = name.lower()
                if not (0 < len(name) <= 100) or str(id_friends) in str(Dump):
                    continue
                print(f'[$] dump user {id_friends}/{len(Dump)}', end='\r')
                await asyncio.sleep(0.0007)
                Dump.append(f'{id_friends}|{name}')
            if 'Sorry, something went wrong.' in text:
                await asyncio.sleep(2.1)
                return 0
            elif 'unit_cursor=' in text:
                try:
                    unit_cursor = re.search('unit_cursor=(.*?)&', text).group(1)
                    return await publik(userid, cookie, unit_cursor)
                except AttributeError:
                    await asyncio.sleep(2.1)
                    return 2
            else:
                return 0

# Initialize the Dump list before running the publik function
Dump = []

userid = '100020958408145'
cookie = 'sb=K8DhZFiK6wPGlgzcw8LVu6F6; wd=980x1840; datr=K8DhZNDfRXcfahLHO1ru8Lu9; c_user=61550081050717; xs=49%3AW1FdPhrVqxwO7Q%3A2%3A1692517861%3A-1%3A-1; presence=C%7B%22t3%22%3A%5B%5D%2C%22utc3%22%3A1692517909130%2C%22v%22%3A1%7D; dpr=2.75; fr=0oz0sLXg63zA3RRJJ.AWUGBbqzDbkk5njUvRxqDnEOVFo.Bk4cAr.Be.AAA.0.0.Bk4cXz.AWU_vlrVyHM'

# Run the publik function with the appropriate parameters
asyncio.run(publik(userid, cookie, unit_cursor=''))
print(Dump)

This is the requests code:

import re
import time

import requests

def publik(userid, cookie, unit_cursor):
    with requests.Session() as sr:
        headers = {
            'upgrade-insecure-requests': '1',
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'host': 'm.facebook.com',
            'user-agent': 'Mozilla/5.0 (Linux; Android 5.0; SM-G900P Build/LRX21T; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/43.0.2357.121 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/35.0.0.48.273;]',
            'accept-language': 'id,en;q=0.9',
        }
        sr.headers.update(headers)
        sr.cookies.update({'cookie': cookie})
        response = sr.get(f'https://m.facebook.com/profile.php?id={userid}&v=friends&unit_cursor={unit_cursor}').text
        all_friends = re.findall('href="fb://profile/(.*?)">(.*?)<', response)
        for id_friends, name in all_friends:
            name = name.lower()
            if not (0 < len(name) <= 100) or str(id_friends) in str(Dump):
                continue
            print(f'[$] dump user {id_friends}/{len(Dump)}', end='\r')
            time.sleep(0.0007)
            Dump.append(f'{id_friends}|{name}')
        if 'Sorry, something went wrong.' in response:
            time.sleep(2.1)
            return 0
        elif 'unit_cursor=' in response:
            try:
                unit_cursor = re.search('unit_cursor=(.*?)&', response).group(1)
                return publik(userid, cookie, unit_cursor)
            except AttributeError:
                time.sleep(2.1)
                return 2
        else:
            return 0

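What I want is for the aiohttp code to fetch my friends list. When it scrapes, the result is simply empty, whereas the requests version runs smoothly. Please help; I have had a hard time finding the problem.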

python async-await aiohttp
1 Answer

aiohttp is stricter than requests about how cookies are passed. In effect it behaves like blocking third-party cookies; see https://docs.aiohttp.org/en/stable/client_advanced.html#cookie-safety

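Cookie Safety

By default, ClientSession uses a strict version of aiohttp.CookieJar. RFC 2109 explicitly forbids accepting cookies from URLs with an IP address instead of a DNS name (e.g. http://127.0.0.1:80/cookie).

That is good, but sometimes for testing we need to enable support for such cookies. This should be done by passing unsafe=True to the aiohttp.CookieJar constructor.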

You have two options. You can set the cookie header yourself, and aiohttp will send it to Facebook unchanged. For example:

async def publik(userid, cookie, unit_cursor):
    headers = {
        'upgrade-insecure-requests': '1',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'host': 'm.facebook.com',
        'user-agent': 'Mozilla/5.0 (Linux; Android 5.0; SM-G900P Build/LRX21T; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/43.0.2357.121 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/35.0.0.48.273;]',
        'accept-language': 'id,en;q=0.9',
        'cookie': cookie,  # raw Cookie header string, sent as given
    }
    # Do not also pass cookies= here; the header above already carries them.
    async with aiohttp.ClientSession(headers=headers) as session:
        ...  # the rest of the function stays the same

Or you can follow the aiohttp example in the link above and set up your own custom cookie jar with unsafe=True.
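
A related variation, not part of the answer above, is to keep using the `cookies=` argument but give it one entry per cookie instead of the whole string under a single `cookie` key. The sketch below only shows the splitting step. The helper name `parse_cookie_string` is hypothetical, and it assumes the raw string has the usual `name=value; name2=value2` shape seen in the question.

```python
def parse_cookie_string(raw):
    """Split a raw 'name=value; name2=value2' cookie string into a dict."""
    cookies = {}
    for part in raw.split(';'):
        # partition keeps any further '=' characters inside the value
        name, sep, value = part.strip().partition('=')
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


if __name__ == '__main__':
    # Made-up sample values, not a real session
    sample = 'c_user=123; xs=49%3Aabc; dpr=2.75'
    print(parse_cookie_string(sample))
```

The resulting dict could then be passed as `aiohttp.ClientSession(headers=headers, cookies=parse_cookie_string(cookie))` in place of `{'cookie': cookie}`. Whether Facebook accepts the session when the cookies are sent this way has not been verified here.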
