TypeError: set_user_agent() takes 2 positional arguments but 3 were given. How can I make the method accept 3 arguments?

I saw this answer to my problem: "TypeError: set_user_agent() takes 2 positional arguments but 3 were given", but I don't understand how to apply that answer to my code.

import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class BestMoviesSpider(CrawlSpider):
    name = 'best_movies'
    allowed_domains = ['imdb.com']

    user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)..'

    def start_requests(self):
        yield scrapy.Request(url='https://www.imdb.com/search/title/?groups=top_250&sort=user_rating', headers={
            'User-Agent': self.user_agent
        })

    rules = (
        Rule(LinkExtractor(restrict_xpaths="//h3[@class='lister-item-header']/a"), callback='parse_item', follow=True, process_request='set_user_agent'),
        Rule(LinkExtractor(restrict_xpaths="(//a[@class='lister-page-next next-page'])[2]"), process_request='set_user_agent')
    )

    def set_user_agent(self, request):
        request.headers['User-Agent'] = self.user_agent
        return request

    def parse_item(self, response):
        yield {
            'title': response.xpath("//div[@class='title_wrapper']/h1/text()").get(),
            'year': response.xpath("//span[@id='titleYear']/a/text()").get(),
            'duration': response.xpath("normalize-space((//time)[1]/text())").get(),
            'genre': response.xpath("//div[@class='subtext']/a[1]/text()").get(),
            'rating': response.xpath("//span[@itemprop='ratingValue']/text()").get(),
            'movie_url': response.url,
            'user_agent': response.request.headers['User_Agent']
        }

Error: set_user_agent() takes 2 positional arguments but 3 were given
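
A likely cause, assuming Scrapy 2.0 or newer: the callable named in a Rule's process_request is called with two arguments, the request and the response it was extracted from. Together with self that makes three positional arguments, while set_user_agent(self, request) only accepts two. The sketch below reproduces that mismatch without Scrapy installed. FakeRequest, SpiderFromQuestion, SpiderWithResponseArg, call_process_request and try_call are hypothetical stand-ins written for illustration, not Scrapy classes or functions.

```python
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request; it only carries headers."""

    def __init__(self):
        self.headers = {}


class SpiderFromQuestion:
    user_agent = "Mozilla/5.0 (example)"

    # Signature from the question: accepts the request only.
    def set_user_agent(self, request):
        request.headers["User-Agent"] = self.user_agent
        return request


class SpiderWithResponseArg:
    user_agent = "Mozilla/5.0 (example)"

    # Also accepts the response the request was extracted from.
    def set_user_agent(self, request, response):
        request.headers["User-Agent"] = self.user_agent
        return request


def call_process_request(spider, request, response=None):
    # Assumption: mimics how CrawlSpider (Scrapy 2.0+) invokes process_request,
    # passing both the request and the originating response.
    return spider.set_user_agent(request, response)


def try_call(spider):
    # Return the processed request, or the TypeError if the signature is too narrow.
    try:
        return call_process_request(spider, FakeRequest())
    except TypeError as exc:
        return exc


old_result = try_call(SpiderFromQuestion())
new_result = try_call(SpiderWithResponseArg())

print(type(old_result).__name__, "|", old_result)  # the error from the question
print(new_result.headers)                          # header set on the request
```

If that assumption holds for the installed Scrapy version, the only change needed in the spider is the method signature, def set_user_agent(self, request, response):, and the rules can stay as they are. Separately, parse_item reads the header as 'User_Agent' with an underscore; 'User-Agent' is probably what was intended there.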

python web-crawler