Scrapy returns the same value on every iteration


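I am using Scrapy to extract information from a website. My goal is to scrape the names, prices, etc. of golf clubs, track the costs over the winter, and buy the ones I want when the price drops.

So far I have pulled the club names, but the same name is printed 38 times. (There are 38 clubs on the first page.)

Why does it print the same name instead of moving on to the next one? I am adapting an example from a course I took. The first block of code below is from the course; the second is mine.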

 
import scrapy

class Spiderbook0Spider(scrapy.Spider):
    name = "spiderbook0"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com"]

    def parse(self, response):
        books = response.css('article.product_pod')  # Get all the books on the first page
        for book in books:  # Get a single book
            print(book.css('h3 a::text').get())

-------------- My code --------------------

import scrapy


class WedgepriceSpider(scrapy.Spider):
    name = "wedgeprice"
    allowed_domains = ["golftown.com"]
    start_urls = ["https://golftown.com/en-CA/clubs/wedges/"]
 

    def parse(self, response):
        wedges = response.css("div.product-tile-top > div.product-image > a.thumb-link ")
        print("***********************************")
        print("***********************************")
        print(wedges)
        for wedge in wedges:
            print(response.xpath("//*[@class = 'name-link']/@title").get())
        print("***********************************")
        print("***********************************")
python web-scraping scrapy scrape
1 Answer

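This happens because, inside the for loop, you run the XPath query against the root of the HTML document on every iteration. `response.xpath(...)` ignores the loop variable `wedge`, and `.get()` returns only the first match, so every pass prints the same title.

What you want to do instead is first select a parent element that repeats once per product, i.e. as many times as the child element you want to print. Then, in a second expression, use an XPath relative to that parent (note the leading `.` in `.//`) to get the value and print it to the terminal.

For example: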

import scrapy


class WedgepriceSpider(scrapy.Spider):
    name = "wedgeprice"
    allowed_domains = ["golftown.com"]
    start_urls = ["https://golftown.com/en-CA/clubs/wedges/"]


    def parse(self, response):
        print("***********************************")
        print("***********************************")
        for tile in response.css(".product-tile"):
            print(tile.xpath(".//*[@class = 'name-link']/@title").get())
        print("***********************************")
        print("***********************************")

Output

2023-11-19 21:55:23 [scrapy.core.engine] INFO: Spider opened
2023-11-19 21:55:23 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2023-11-19 21:55:23 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2023-11-19 21:55:23 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.golftown.com/en-CA/clubs/wedges/> from <GET https://golftown.com/en-CA/clubs/wedges/>
2023-11-19 21:55:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.golftown.com/en-CA/clubs/wedges/> (referer: None)
***********************************
***********************************
Milled Grind 4 Wedge with Steel Shaft
Glide 4.0 Wedge with Steel Shaft
RTX 4.0 Tour Satin Wedge with Steel Shaft
RTX 6 ZipCore Tour Satin Wedge with Steel Shaft
Milled Grind 3 Black Wedge with Steel Shaft
Milled Grind Wedge with Steel Shaft
JAWS RAW Chrome Wedge with Steel Shafts
Milled Grind 2 Hi-Toe Raw Wedge
RTX 6 ZipCore Black Satin Wedge with Steel Shaft
Mack Daddy Cavity Back Wedge with Steel Shaft
Staff Model Wedge with Steel Shaft
JAWS MD5 Platinum Chrome Wedge with Steel Shaft
Milled Grind 3 Chrome Wedge with Steel Shaft
CBX Full-Face 2 Tour Satin with Steel Shaft
SM9 Brushed Steel Wedge with Steel Shaft
King Cobra Snake Bite Wedge with Steel Shaft
SM9 Tour Chrome Wedge with Steel Shaft
PUR-S Black Wedge with Steel Shaft
JAWS RAW Chrome Wedge with Graphite Shafts
JAWS RAW Black Wedge with Steel Shafts
King Cobra Black Snake Bite Wedge with Steel Shaft
ChipR Wedge with Steel Shaft
T22 Blue Ion Wedge with Steel Shaft
S23 Copper Cobalt Wedge with Steel Shaft
S23 Satin Chrome Wedge with Steel Shaft
Smart Sole 4 S Black Wedge with Graphite Shaft
Smart Sole 4 G Black Wedge with Graphite Shaft
Smart Sole 4 C Black Wedge with Graphite Shaft
Smart Sole 4 S Black Wedge with Steel Shaft
Smart Sole 4 G Black Wedge with Steel Shaft
RTX Full-Face Black Wedge with Steel Shaft
CBX Zipcore Tour Satin Wedge with Graphite Shaft
CBX Zipcore Tour Satin Wedge with Steel Shaft
Women's CBX Zipcore Wedge with Graphite Shaft
Ladies X Act Chipper
***********************************
***********************************
2023-11-19 21:55:25 [scrapy.core.engine] INFO: Closing spider (finished)
2023-11-19 21:55:25 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 724,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 28484,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/301': 1,
 'elapsed_time_seconds': 2.699416,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2023, 11, 20, 5, 55, 25, 901973),
 'httpcompression/response_bytes': 263357,
 'httpcompression/response_count': 1,
 'log_count/DEBUG': 3,
 'log_count/INFO': 10,
 'response_received_count': 1,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'start_time': datetime.datetime(2023, 11, 20, 5, 55, 23, 202557)}
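
The difference between querying the document root and querying each tile can also be reproduced offline. The following is only a minimal sketch: it uses Python's standard-library `xml.etree.ElementTree` instead of Scrapy's selectors so that it runs without a network connection, and the `SAMPLE` markup, the wedge titles, and the two helper functions are hypothetical stand-ins, not the real golftown.com page or anything from the code above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified markup (an assumption for illustration);
# the real product page is much larger and is not reproduced here.
SAMPLE = """
<html><body>
  <div class="product-tile"><a class="name-link" title="Wedge A">A</a></div>
  <div class="product-tile"><a class="name-link" title="Wedge B">B</a></div>
  <div class="product-tile"><a class="name-link" title="Wedge C">C</a></div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)
tiles = root.findall(".//div[@class='product-tile']")


def names_from_root(root, tiles):
    """Buggy pattern: the loop variable is ignored and the whole
    document is searched each time, so the first match always wins."""
    return [root.find(".//a[@class='name-link']").get("title") for _ in tiles]


def names_from_tiles(tiles):
    """Fixed pattern: each query is relative to the current tile."""
    return [tile.find(".//a[@class='name-link']").get("title") for tile in tiles]


print(names_from_root(root, tiles))  # ['Wedge A', 'Wedge A', 'Wedge A']
print(names_from_tiles(tiles))       # ['Wedge A', 'Wedge B', 'Wedge C']
```

One Scrapy-specific caveat: calling `tile.xpath("//...")` without the leading dot still searches the whole document, even though it is called on `tile`. With Scrapy selectors the relative form `.//` is what scopes the query to the current element.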