Python BeautifulSoup4: finding attributes

Question

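I'm building a web scraper that pulls the actual href link out of an a tag, then goes on to create a file from all the values I've scraped.

I only want the attribute value "/groups/1234123" and the id name ("InsertNameHere"), but nothing I've tried works.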

from bs4 import BeautifulSoup

htmltext = ''' <div class="sidenav">
         <div class="sidenav-head" id="InsertNameHere">   
          <a href="/groups/1234123/">
           InsertNameHere
          </a>
         </div>
        </div>'''
 
soup = BeautifulSoup(htmltext, 'html.parser')

s = soup.find_all('a')
link= s.find('href')

print(link)

I get

"AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?"

I tried changing

link = s.find('href')

to

link = s.attrs

and then got a different AttributeError. I also need to keep the

s.find_all()

call, because I need to get multiple ids.

Answer 1

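As the error message says, you should use

s = soup.find('a')

instead of find_all, which returns a ResultSet (similar to a list of tags). Also, to read the attribute you need .get('href'), or even simply ['href'], rather than .find, which searches for child tags, not attributes; see the documentation for more details.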
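
Since the question needs several ids, one option is to keep find_all and loop over the ResultSet, reading the attributes from each tag in turn. The following is only a minimal sketch, assuming every link sits inside a div that carries the id, as in the sample HTML. The results list and the trimming of the trailing slash are illustrative choices, not part of the original code.

```python
from bs4 import BeautifulSoup

htmltext = '''<div class="sidenav">
  <div class="sidenav-head" id="InsertNameHere">
    <a href="/groups/1234123/">
      InsertNameHere
    </a>
  </div>
</div>'''

soup = BeautifulSoup(htmltext, 'html.parser')

results = []  # hypothetical name: collects (id, href) pairs

# find_all returns a ResultSet, so iterate over it
# instead of calling .find or .attrs on the ResultSet itself.
for div in soup.find_all('div', id=True):   # only divs that have an id attribute
    a = div.find('a', href=True)            # first <a> with an href inside, or None
    if a is None:
        continue
    # rstrip('/') turns "/groups/1234123/" into "/groups/1234123"
    results.append((div['id'], a['href'].rstrip('/')))

for name, link in results:
    print(name, link)
```

Each pair in results can then be written to a file in whatever format is needed. Using .get('href') in place of ['href'] returns None for a missing attribute rather than raising KeyError, which may be preferable on less regular pages.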
