How do I split this string into individual characters?

Question

Given this HTML snippet in a BeautifulSoup object...

<span class="Example1" data-test-selector="RC1">
 507
        <b>
         3
        </b>
        <b>
         3
        </b>
        <b>
         2
        </b>
</span>

I am using this code to split it...

hList = []
for each in soup.find_all('span', {'class': 'Example1'}):
    hList.append(each.text.split())

print(hList)

I get the result...

['507', '3', '3', '2']

When what I really want is...

['5', '0', '7', '3', '3', '2']

I have tried various list comprehensions, nested approaches, and so on to separate '507'. I just can't figure this one out.

Answer 1

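Note: you probably get [['507', '3', '3', '2']] rather than ['507', '3', '3', '2'], because find_all finds only one element, and you then split its text and append the resulting list as a single item.

each.text.split() gives you a list of strings. A string is itself an iterable of one-character strings (its characters). By using .extend(..) instead and flattening the result of each.text.split(), we can add each character to the list separately: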

hList = []
for each in soup.find_all('span', {'class': 'Example1'}):
    hList.extend([c for cs in each.text.split() for c in cs])

print(hList)

Or we can turn it into a single list comprehension:

hList = [c for each in soup.find_all('span', {'class': 'Example1'})
           for cs in each.text.split()
           for c in cs]

print(hList)
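
The difference between append and extend described above can be sketched without BeautifulSoup by standing in a plain string for each.text. The name span_text and the use of itertools.chain.from_iterable are assumptions made for illustration; they are not part of the original answer.

```python
from itertools import chain

# Hypothetical stand-in for each.text taken from the question's <span>
span_text = "\n 507\n        3\n        3\n        2\n"

nested = []
# append() stores the whole split result as one inner list
nested.append(span_text.split())

flat = []
# chain.from_iterable() iterates each string, yielding its characters one by one
flat.extend(chain.from_iterable(span_text.split()))

print(nested)
print(flat)
```

chain.from_iterable does the same job as the nested comprehension here; which one reads better is mostly a matter of taste.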

Answer 2

Join the strings in the list into a single string, then call list() on that string:

>>> hList = ['507', '3', '3', '2']
>>> list(''.join(hList))
['5', '0', '7', '3', '3', '2']

Your code actually builds a list of lists, so you need to flatten it before applying str.join(). You can do that by creating hList with a list comprehension:

>>> hList = [s for each in soup.find_all('span', {'class': 'Example1'})
...          for s in each.text.split()]
>>> list(''.join(hList))
['5', '0', '7', '3', '3', '2']
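
If the nested list has already been built, a rough sketch of flattening it before the join might look like the following. The variable nested is a hypothetical stand-in for what the question's loop produces with append().

```python
# Hypothetical nested result, shaped like what the question's loop builds with append()
nested = [['507', '3', '3', '2']]

# Flatten one level and join everything into a single string
joined = ''.join(s for sub in nested for s in sub)

# list() on a string yields its individual characters
chars = list(joined)

print(joined)
print(chars)
```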

Answer 3

Another way could look like this:

from bs4 import BeautifulSoup

content='''
<span class="Example1" data-test-selector="RC1">
    507
    <b>
     3
    </b>
    <b>
     3
    </b>
    <b>
     2
    </b>
</span>
'''
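soup = BeautifulSoup(content, 'lxml')  # the 'lxml' parser requires the lxml package
for items in soup.select('.Example1'):
    # Put a space between every character, then split on whitespace,
    # so only the non-whitespace characters remain
    data = ' '.join(items.text)
    print(data.split())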

Output:

['5', '0', '7', '3', '3', '2']
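
The join-then-split step amounts to dropping every whitespace character from the text. A minimal sketch of that equivalence, assuming a plain string named text in place of items.text (a hypothetical stand-in, so the snippet runs without BeautifulSoup):

```python
# Hypothetical stand-in for items.text from the snippet above
text = "\n    507\n    \n     3\n    \n    \n     3\n    \n    \n     2\n    \n"

# Approach from the answer: space out every character, then split on whitespace
via_join = ' '.join(text).split()

# Direct filter: keep only the characters that are not whitespace
via_filter = [c for c in text if not c.isspace()]

print(via_join)
print(via_filter)
```

Both lists hold the same characters; the filter version simply states the intent more directly.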
