I can't remove the span from the h1. I only need the h1 text, without the text inside the span:
import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.bbc.co.uk/weather/524901")
soup = BeautifulSoup(page.content, 'html.parser')
weather_desc_today = soup.find(class_="wr-day__weather-type-description").get_text()
weather_location = soup.find('h1').text  # .text also includes the nested span's text
print(weather_location)
Output:
Moscow - Weather warnings issued
when I only need 'Moscow'.
Here is the HTML:
<h1 id="wr-location-name-id" tabindex="-1" class="wr-c-location__name gel-paragon">Moscow<span class="gs-u-vh wr-c-warnings-issued"> - <!-- -->Weather warnings issued</span></h1>
You can use find with text=True, which returns the first text node inside the h1:
weather_location = soup.find('h1').find(text=True)
Output: Moscow
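For comparison, here is a small offline sketch of three ways to get only the h1's own text. It parses the h1 markup quoted in the question instead of fetching the live page, so it assumes the page still has that structure. The variable names are just illustrative, and `string=True` is the spelling newer BeautifulSoup versions use for the `text=True` argument above.

```python
from bs4 import BeautifulSoup

# The <h1> markup quoted in the question, parsed offline (no network request).
html = (
    '<h1 id="wr-location-name-id" tabindex="-1" '
    'class="wr-c-location__name gel-paragon">Moscow'
    '<span class="gs-u-vh wr-c-warnings-issued"> - <!-- -->'
    'Weather warnings issued</span></h1>'
)
soup = BeautifulSoup(html, "html.parser")
h1 = soup.find("h1")

# Option 1: take the first text node found inside the <h1>.
first_text = h1.find(string=True)

# Option 2: join only the text nodes that are direct children of the <h1>,
# which skips anything nested inside the <span>.
direct_text = "".join(h1.find_all(string=True, recursive=False)).strip()

# Option 3: remove the <span> from the tree, then read the remaining text.
# Note that this modifies the parsed document.
h1.span.decompose()
remaining_text = h1.get_text(strip=True)

print(first_text, direct_text, remaining_text)
```

Option 1 relies on the wanted text coming first inside the h1. Option 2 does not depend on ordering, and Option 3 is probably the closest to "deleting the span" as the question is phrased.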