I have the input.html below:
input.html: https://jsfiddle.net/f86q7ubm/
I am trying to match every element in the list allList that has a size of 5, but when I run the following code, matching ends up empty.
from bs4 import BeautifulSoup
fp = open("file.html", "rb")
soup = BeautifulSoup(fp,"html5lib")
allList = soup.find_all(True)
matching = [s for s in allList if 'size="5"' in s]
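What am I doing wrong?
There may be (and probably should be) a better way, but you can use str(s). You are trying to match inside a non-string object: each s is a Tag, and the in operator on a Tag checks its direct children rather than its markup: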
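from bs4 import BeautifulSoup
# Parse the file; the with block closes it once parsing is done
with open("file.html", "rb") as fp:
    soup = BeautifulSoup(fp, "html5lib")
allList = soup.find_all(True)  # every tag in the document
# Search each tag's serialized markup instead of the Tag object itself
matching = [s for s in allList if 'size="5"' in str(s)]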
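To make the difference concrete, here is a minimal sketch that runs both checks side by side. The inline html string is a hypothetical stand-in for input.html (the real file is not reproduced here), and html.parser is used only so the sketch needs no extra parser installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for input.html
html = '<div><font size="5">TEXT S 5 MORE TEXT</font><font size="3">other</font></div>'
soup = BeautifulSoup(html, "html.parser")
all_tags = soup.find_all(True)

# `in` on a Tag looks through its direct children, so attribute text is never found
by_tag = [t for t in all_tags if 'size="5"' in t]

# str(t) serializes the tag together with everything nested inside it
by_markup = [t for t in all_tags if 'size="5"' in str(t)]

print(len(by_tag))
print([t.name for t in by_markup])
```

Note that the str() check also matches the enclosing div, because its serialized markup contains the nested font tag. That side effect is one reason an attribute-based search is usually the safer choice.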
Not sure if this is what you want, but a better way might be:
allList = soup.find_all("font", {"size": "5"}) # you already have the matching elements here
# html is assumed to hold the markup of input.html as a string
soup = BeautifulSoup(html, 'html.parser')
for item in soup.find_all("font", {'size': "5"}):
    print(item.text)
Output:
TEXT S 5 MORE TEXT
TEXT S 5 MORE TEXT
TEXT S 5 MORE TEXT
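As a rough sketch of the attribute-based approach, the snippet below filters on the parsed size attribute rather than on serialized markup. The sample markup is made up for illustration and is not the content of the linked fiddle; the span element is included only to show what happens when the tag name is not restricted.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not the real input.html
html = """
<p><font size="5">TEXT S 5 MORE TEXT</font></p>
<p><font size="3">smaller text</font></p>
<p><span size="5">not a font tag</span></p>
"""
soup = BeautifulSoup(html, "html.parser")

# Only <font> tags whose size attribute equals "5"
fonts = soup.find_all("font", attrs={"size": "5"})

# Any tag with size="5"; Tag.get returns None when the attribute is missing
any_tag = [t for t in soup.find_all(True) if t.get("size") == "5"]

print([t.get_text() for t in fonts])
print([t.name for t in any_tag])
```

Unlike the str() check, neither query picks up the enclosing p elements, since only each tag's own attributes are inspected.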