How to exclude certain information when scraping a web page with Beautiful Soup


I am using BeautifulSoup to scrape this page: https://lawyers.justia.com/lawyer/michael-paul-ehline-85006

I do not want the sponsored listings to appear in my output.

My code:

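# Assumes `soup` is a BeautifulSoup object already built from the page's HTML.
for o in soup.find_all('div', attrs={"class": "block-wrapper"}):
    for de in o.find_all("li"):
        # get_text() strips the markup; it stands in for the remove_tags()
        # helper, which the original snippet called but never defined.
        text = de.get_text(strip=True)
        if text:  # skip empty list items
            print(text)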
Python output: [output image]


python web beautifulsoup screen-scraping

1 Answer


You can remove content from the parsed HTML page. Once you have found the relevant elements with findAll('div', attrs={"class": "primary-sidebar-wrapper"}), you can do the following:
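
One way to do this, as a minimal sketch rather than the answerer's exact code: call decompose() on each matched element so that it and everything inside it is removed from the tree, then run the original loop. The HTML string and the extract_items() helper below are hypothetical stand-ins for illustration; only the class names block-wrapper and primary-sidebar-wrapper come from the question and the answer, and the sketch assumes the sponsored listings really do sit inside the primary-sidebar-wrapper element.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's HTML; only the two class names
# are taken from the question and the answer.
HTML = """
<div class="block-wrapper">
  <ul><li>Personal Injury</li><li>Car Accidents</li></ul>
</div>
<div class="primary-sidebar-wrapper">
  <div class="block-wrapper">
    <ul><li>Sponsored Lawyer A</li></ul>
  </div>
</div>
"""


def extract_items(html):
    """Return the text of every <li> in a block-wrapper, skipping the sidebar."""
    soup = BeautifulSoup(html, "html.parser")

    # Remove each sidebar element, and everything nested in it, from the tree.
    for sidebar in soup.find_all("div", attrs={"class": "primary-sidebar-wrapper"}):
        sidebar.decompose()

    items = []
    for block in soup.find_all("div", attrs={"class": "block-wrapper"}):
        for li in block.find_all("li"):
            text = li.get_text(strip=True)
            if text:  # skip empty list items
                items.append(text)
    return items


if __name__ == "__main__":
    for item in extract_items(HTML):
        print(item)
```

Because decompose() runs before the search for block-wrapper elements, any block-wrapper nested inside the sidebar is never visited. If the sponsored listings on the real page live under a different class, only the class name in the first find_all() call needs to change.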
