Using multiple find_all calls


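I have just started learning Python, and I need to scrape hundreds of Congressional bills from https://www.congress.gov/bill/112th-congress. For example, I need to reach H.R.6729 below. The structure of the HTML page leading to that text is: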

BILL 1. H.R.6729 - 112th Congress (2011-2012)

So it is nested inside an "li" and then inside a "span". This structure repeats for each of the 100 Congressional bills on the page.

The code I wrote is:

import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.congress.gov/bill/112th-congress', headers = {'User-agent': 'Chrome'})
soup = BeautifulSoup(res.text, 'html.parser')
bills = soup.find_all("li", {"class" : "expanded"})
len(bills) # this is 100 as there are 100 bills in the page
for bill in bills:
    bill_number = bill.find_all("span", {"class":"result-heading"})
len(bill_number) # this is giving me 1

I think the problem is with the second find_all. Why does the output contain only 1 element?

python beautifulsoup findall
1 Answer

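Each pass through the loop reassigns bill_number, so after the loop it holds only the result for the last bill, which is a single span. Initialize bill_number = [] before the loop and change

bill_number = bill.find_all("span", {"class":"result-heading"})

to

bill_number += bill.find_all("span", {"class":"result-heading"})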
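
To see the difference without hitting the live site, here is a minimal sketch that runs the same two find_all calls against a small inline HTML string. The markup is an assumption reconstructed from the question's description ("li" with class "expanded" containing a "span" with class "result-heading"), and the second bill number is made up for illustration; the real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: two bills instead of 100.
# The second bill number (H.R.0000) is invented for this example.
SAMPLE_HTML = """
<ol>
  <li class="expanded"><span class="result-heading">H.R.6729 - 112th Congress (2011-2012)</span></li>
  <li class="expanded"><span class="result-heading">H.R.0000 - 112th Congress (2011-2012)</span></li>
</ol>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
bills = soup.find_all("li", {"class": "expanded"})

# Overwriting: every pass replaces the previous result,
# so only the last bill's spans survive the loop.
for bill in bills:
    last_only = bill.find_all("span", {"class": "result-heading"})

# Accumulating: start from an empty list and extend it on every pass.
bill_numbers = []
for bill in bills:
    bill_numbers += bill.find_all("span", {"class": "result-heading"})

# Pull the visible text out of each matched span.
headings = [span.get_text(strip=True) for span in bill_numbers]
print(len(last_only), len(bill_numbers))
print(headings)
```

With this sample, last_only ends up with one element while bill_numbers has one per bill, which matches the behavior described in the question.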