BeautifulSoup: scraping tags that share the same class name

Question. Votes: 0. Answers: 2
python web-scraping beautifulsoup
2 Answers
0 votes

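You are searching in soup inside the loop, so every iteration looks at the whole document instead of the current article. Change the call to item.select_one: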

from bs4 import BeautifulSoup


html_doc = """
<article>
<p class='metadata'>Wed 1 Jan 2020 00:01 GMT</p>
<p class='metadata'>Category: <span>UK-News</span></p>
</article>

<article>
<p class='metadata'>Wed 2 Jan 2020 00:01 GMT</p>
<p class='metadata'>Category: <span>World-News</span></p>
</article>"""

soup = BeautifulSoup(html_doc, "html.parser")

articles = soup.find_all("article")
for item in articles:
    category = item.select_one("p.metadata span").text  # <-- use item.select_one, not soup.select_one
    print(category)

Prints:

UK-News
World-News
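
To make the difference concrete, here is a minimal sketch that runs both variants side by side on a shortened version of the markup above. The names from_soup and from_item are only illustrative and do not come from the original question.

```python
from bs4 import BeautifulSoup

html_doc = """
<article><p class='metadata'>Category: <span>UK-News</span></p></article>
<article><p class='metadata'>Category: <span>World-News</span></p></article>
"""

soup = BeautifulSoup(html_doc, "html.parser")
articles = soup.find_all("article")

# Searching the whole document: select_one always returns the first match,
# no matter which article the loop is currently on.
from_soup = [soup.select_one("p.metadata span").text for item in articles]

# Searching within each article: the match belongs to that article.
from_item = [item.select_one("p.metadata span").text for item in articles]

print(from_soup)  # ['UK-News', 'UK-News']
print(from_item)  # ['UK-News', 'World-News']
```

Calling select_one on soup repeats the first category for every article, which is the symptom the answer above describes.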

0 votes

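Try this:

for tag in articles:
    # select() returns a list of matches, so use select_one() to get a single tag
    case1 = tag.select_one("div.ID").text
    case2 = tag.select_one("div.Id").next_sibling.text
    print(case1, case2)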
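
One thing to watch in snippets like this: select() returns a list of matches, while select_one() returns a single tag (or None when nothing matches), so .text is only available on the latter. The sketch below illustrates this on hypothetical markup, since the question's real HTML is not shown here. The class names and values are assumptions, and find_next_sibling is used as one possible way to reach the following element.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only.
html_doc = "<article><div class='ID'>42</div><div class='other'>next</div></article>"

soup = BeautifulSoup(html_doc, "html.parser")
article = soup.find("article")

matches = article.select("div.ID")    # list of all matching tags
first = article.select_one("div.ID")  # first matching tag, or None

# find_next_sibling skips text nodes and returns the next matching tag.
following = first.find_next_sibling("div")

print(len(matches))    # 1
print(first.text)      # 42
print(following.text)  # next
```

If a selector might not match, checking the result of select_one for None before reading .text avoids an AttributeError.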
