How do I match elements containing a string in a BeautifulSoup list?

Question · Votes: 0 · Answers: 2

I have the input.html below:

input.html: https://jsfiddle.net/f86q7ubm/

I am trying to match every element in the list allList that has size 5, but when I run the following code, matching ends up empty.

from bs4 import BeautifulSoup

fp = open("file.html", "rb")                 
soup = BeautifulSoup(fp,"html5lib")

allList = soup.find_all(True)

matching = [s for s in allList if 'size="5"' in s]  

What am I doing wrong?

python beautifulsoup html-parsing
2 Answers
Votes: 1

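There may be (and probably should be) a better way, but you can use str(s). You are trying to match against a non-string object: each s is a Tag, and the in operator on a Tag looks through the tag's children rather than its markup, so the attribute text is never found.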

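from bs4 import BeautifulSoup

# Open the file in a with-block so it is closed after parsing
with open("file.html", "rb") as fp:
    soup = BeautifulSoup(fp, "html5lib")

allList = soup.find_all(True)

# str(s) serializes each Tag to markup, so the substring test works
matching = [s for s in allList if 'size="5"' in str(s)]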
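To make the difference concrete, here is a minimal sketch comparing the three checks. The markup is a made-up stand-in for input.html (the real file is only linked, not shown), and it uses html.parser rather than html5lib so that no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for input.html (assumption, not the real file).
html = '<p><font size="5">TEXT S 5 MORE TEXT</font><font size="3">other</font></p>'
soup = BeautifulSoup(html, "html.parser")

tags = soup.find_all(True)  # every Tag in the document

# `in` on a Tag looks through the tag's children, not its markup,
# so the attribute text is never found.
by_membership = [t for t in tags if 'size="5"' in t]

# Converting to str searches the serialized markup; this also matches
# ancestors whose markup contains the <font size="5"> tag.
by_markup = [t for t in tags if 'size="5"' in str(t)]

# Comparing the attribute value directly matches only the tags themselves.
by_attribute = [t for t in tags if t.get("size") == "5"]

print(len(by_membership), len(by_markup), len(by_attribute))
```

Note that the str(s) approach also picks up enclosing elements (here the p tag), which is one reason filtering on the attribute itself is usually the cleaner choice.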

I'm not sure whether this is what you want, but a better approach might be:

allList = soup.find_all("font", {"size": "5"}) # you already have the matching elements here
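As a self-contained sketch of that approach, the snippet below filters on the attribute with find_all and, as an alternative, with a CSS attribute selector via select. The markup is again a hypothetical stand-in for input.html.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for input.html (assumption, not the real file).
html = """
<font size="5">TEXT S 5 MORE TEXT</font>
<font size="3">smaller text</font>
<font size="5">TEXT S 5 MORE TEXT</font>
"""
soup = BeautifulSoup(html, "html.parser")

# Attribute filter passed as a dict: only <font> tags whose size is "5"
via_find_all = soup.find_all("font", {"size": "5"})

# The same filter written as a CSS attribute selector
via_select = soup.select('font[size="5"]')

texts = [tag.get_text() for tag in via_find_all]
print(texts)
```

Both calls return the matching Tag objects directly, so no second pass over the full element list is needed.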

Votes: 1
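# html is assumed to hold the contents of input.html as a string
soup = BeautifulSoup(html, 'html.parser')

# find_all is the current name for findAll; the attribute value is given as a string
for item in soup.find_all("font", {'size': '5'}):
    print(item.text)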

Output:

TEXT S 5 MORE TEXT
TEXT S 5 MORE TEXT
TEXT S 5 MORE TEXT