How do I get all direct children of a BeautifulSoup tag?

Question  Votes: 2  Answers: 1

How can I retrieve all children of a tag (non-recursively) using BeautifulSoup (bs4)?

<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>

I want to get blocks like this:

block1 : <span>A</span>
block2 : <span><span>B</span></span>
block3 : <span>C</span>

This is what I have tried:

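from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
soup = BeautifulSoup(html, "html.parser")
tags = []
for j in soup.find_all(True)[:1]:
    if isinstance(j, NavigableString):
        continue
    if isinstance(j, Tag):
        tags.append(j.name)
        # Get siblings
        for k in j.find_next_siblings():
            # k is a sibling of the first element
            tags.append(k.name)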

Is there a cleaner way?

python-3.x beautifulsoup siblings
1 answer
7
votes

To select only the direct children of a tag, set the recursive parameter of find_all() to False. Using the HTML example you provided:

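from bs4 import BeautifulSoup

html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
soup = BeautifulSoup(html, "lxml")  # needs the lxml package; "html.parser" also works
# recursive=False limits find_all() to the direct children of the div
for j in soup.div.find_all(recursive=False):
    print(j)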

Output:

<span>A</span>
<span><span>B</span></span>
<span>C</span>
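
For comparison, here is a minimal sketch of two other ways that should give the same direct children: filtering the tag's .children iterator, and using a CSS child combinator with select(). It assumes the built-in "html.parser" instead of lxml so that nothing extra needs installing; the variable names (via_children, via_select, blocks) are only illustrative and not part of the original answer.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="body")

# Option 1: .children yields direct children only; keep Tag objects
# so that any text nodes between tags are skipped.
via_children = [child for child in div.children if isinstance(child, Tag)]

# Option 2: the CSS child combinator ">" matches direct children only.
via_select = soup.select("div.body > *")

# Print the blocks in the format asked for in the question.
blocks = [str(tag) for tag in via_children]
for i, block in enumerate(blocks, start=1):
    print(f"block{i} : {block}")
```

The isinstance check matters mostly when the markup contains whitespace or text between the child tags, because .children returns those text nodes as well, whereas find_all(recursive=False) and select() return tags only.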