Scrapy returns no output when using an XPath selector


This is the code I am trying to run in the Scrapy shell to get the headline of an article from dailymail.co.uk. It produces no output.

headline = response.xpath("//div[@id='js-article-text']/h2/text()").extract()

$ scrapy shell "https://www.dailymail.co.uk/tvshowbiz/article-8257569/Shia-LaBeouf-revealed-heavily-tattoo-torso-goes-shirtless-run-hot-pink-shorts.html"
xpath scrapy web-crawler
1 Answer

Set a user agent on your request and it should work.

scrapy shell -s USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:52.0) Gecko/20100101 Firefox/52.0" "https://www.dailymail.co.uk/tvshowbiz/article-8257569/Shia-LaBeouf-revealed-heavily-tattoo-torso-goes-shirtless-run-hot-pink-shorts.html"
response.xpath("//div[@id='js-article-text']/h2/text()").extract()

Output:

Shia LaBeouf reveals his heavily tattoo torso as he goes shirtless for a run in hot pink shorts