Scrapy: collecting data from a table

Question  Votes: 0  Answers: 1

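I don't get any errors from the script below, but it doesn't return any data. I'm trying to grab all the games for each week, starting at the 4th HTML table on the page. When I enter the XPath commands in the scrapy shell I get data, but once I put them in the parse definition I get nothing back.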

import scrapy


class NFLOddsSpider(scrapy.Spider):
    name = 'NFLOdds'
    allowed_domains = ['www.sportsoddshistory.com']
    start_urls = ['sportsoddshistory.com/nfl-game-odds']

    def parse(self, response):
        
        for row in response.xpath('//table[@class="soh1"]//tbody/tr'):

            day = row.xpath('td[1]//text()').extract_first()
            date = row.xpath('td[2]//text()').extract_first()
            time = row.xpath('td[3]//text()').extract_first()
            AtFav = row.xpath('td[4]//text()').extract_first()
            favorite = row.xpath('td[5]//text()').extract_first()
            score = row.xpath('td[6]//text()').extract_first()
            spread = row.xpath('td[7]//text()').extract_first()
            AtDog = row.xpath('td[8]//text()').extract_first()
            underdog = row.xpath('td[9]//text()').extract_first()
            OvUn = row.xpath('td[10]//text()').extract_first()
            notes = row.xpath('td[11]//text()').extract_first()
            week = row.xpath('//*[@id="content"]/div/table[4]/tbody/tr/td/h3').extract_first()

            oddsTable = {
                'day': day,
                'date': date,
                'time': time,
                'AtFav': AtFav,
                'favorite': favorite,
                'score': score,
                'spread': spread,
                'AtDog': AtDog,
                'underdog': underdog,
                'OvUn': OvUn,
                'notes': notes,
                'week' : week
            }
            yield oddsTable
python scrapy
1 Answer
0 votes

The code worked for me with just one change: add https:// to the URL in start_urls:

start_urls = ['https://sportsoddshistory.com/nfl-game-odds']
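
As background for why the scheme matters: scrapy.Request rejects a URL that has no scheme (it raises a ValueError about the missing scheme), so no request is ever sent for a start URL like 'sportsoddshistory.com/nfl-game-odds'. A minimal sketch of a guard you could run over your start URLs is below. The helper name ensure_scheme and the default of https are my own assumptions for illustration, not part of Scrapy.

```python
from urllib.parse import urlparse


def ensure_scheme(url, default_scheme="https"):
    """Return url unchanged if it already has a scheme, else prepend one.

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    if urlparse(url).scheme:
        return url
    return f"{default_scheme}://{url}"


# The URL from the question has no scheme, so it gets one added.
fixed_url = ensure_scheme("sportsoddshistory.com/nfl-game-odds")
# A URL that already has a scheme is left alone.
same_url = ensure_scheme("https://sportsoddshistory.com/nfl-game-odds")
```

You could then write start_urls = [ensure_scheme(u) for u in raw_urls], although for a single hard-coded URL simply typing the scheme is the simpler fix.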

Save it to a file (NFLOddsSpider.py) and run:

scrapy runspider NFLOddsSpider.py -O output.csv

A screenshot of the scraped data was attached.
