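I am a student learning the Scrapy framework, and I am trying to scrape the connections of LinkedIn profiles, but my requests are being blocked. I have integrated Zyte Smart Proxy, and I now receive a 523 error. Please help me get past this.
How can I scrape LinkedIn profile connection data?
My code: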
import scrapy
from linkedinprofile.loginlinkedin import loginSitesHandler
from scrapy_splash import SplashRequest
from scrapy.http import FormRequest


class profile_connectionsSpider(scrapy.Spider):
    name = "profile_connections"

    def start_requests(self):
        profile_list = [
            'https://www.linkedin.com/home',
            'https://www.linkedin.com/in/darsh-turakhia-011000195/'
        ]
        for profile in profile_list:
            yield scrapy.Request(url=profile, callback=self.parse)

    def parse(self, response):
        with open('response.html', 'wb') as f:
            f.write(response.body)
        print(response.xpath('//*[@id="ember255"]/div[2]/div[2]/div[1]/div[1]/h1').get())