Iteration fails when using BeautifulSoup

Question

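I am using BeautifulSoup to extract data from a web page, but for some reason it fails to iterate over the items of any season after season 1. The nodes look identical to me, so I can't see any reason for this behavior.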

import requests
from bs4 import BeautifulSoup


def scrape_show(show):
    source = requests.get(show.url).text
    soup = BeautifulSoup(source, 'lxml')

    # All seasons and episodes
    area = soup.find('div', class_='play_video-area-aside play_video-area-aside--related-videos play_video-area-aside--related-videos--titlepage')
    # Iterate over direct child tags only; text nodes have no .get()
    for article in area.find_all(True, recursive=False):
        # Default to '' so children without an id are skipped
        if "season" in article.get('id', ''):
            season = article.h2.a.find('span', class_='play_accordion__section-title-inner').text
            print(season + " -- " + article.get('id'))

            # All content for the given season
            ul = article.find('ul')
            if ul is None:
                print("null!")  # This should not happen

Example output:

Season 1 -- section-season1-xxxx
Season 2 -- section-season2-xxxx
null!

https://www.svtplay.se/andra-aket (the URL used in the example)

Answer 1

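The data is not available as HTML for every season, only for season 1. However, the information is embedded in the page as JSON. You can parse this data with the re and json modules.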
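
The answer's original code is not preserved here, so the following is only a minimal sketch of the re + json approach. It assumes the page assigns a JSON object to a JavaScript variable inside a script tag. The variable name `window.__pageData`, the JSON layout, and the helper `extract_embedded_json` are all hypothetical; inspect the real page source to find the actual name and structure.

```python
import json
import re


def extract_embedded_json(html, marker):
    """Return the JSON value assigned to `marker` in `html`, or None."""
    # Find "<marker> = " and start parsing right after it.
    match = re.search(re.escape(marker) + r"\s*=\s*", html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it (the trailing ';', '</script>', ...).
        data, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return data


# Made-up page fragment: the variable name and JSON layout are hypothetical.
sample_html = """
<script>
window.__pageData = {"seasons": [
  {"name": "Season 1", "episodes": ["Episode 1", "Episode 2"]},
  {"name": "Season 2", "episodes": ["Episode 1"]}
]};
</script>
"""

data = extract_embedded_json(sample_html, "window.__pageData")
for season in data["seasons"]:
    print(season["name"], "--", len(season["episodes"]), "episode(s)")
```

For the real page, you would pass `requests.get(show.url).text` instead of `sample_html`. Using `raw_decode` rather than a regex that captures the whole object avoids having to guess where the JSON ends when it contains nested braces.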
