Iterating over a list with BeautifulSoup

Question  Votes: 0  Answers: 1

I have the following code:

import requests
from bs4 import BeautifulSoup

players = ['a', 'b', 'c']  # etc.
b = []  # collected salary values
for player in players:
    html = 'https://hoopshype.com/player/' + player + '/salary/'
    webpage = requests.get(html)
    content = webpage.content
    soup = BeautifulSoup(content, "html.parser")
    table = soup.find('table', {'class': 'player-payroll-1'})
    for row in table.find_all('tr'):
        for item in row.find_all('td', {'class': 'table-value'}):
            a = item.text
            c = a.replace("\n", "").replace("\t", "")
            b.append(c)

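I'm trying to loop over a large number of players. I know the code itself works, because I have checked it against a few individual players and it succeeded.

But when I loop over the entire list, the for loop stops with the error: 'NoneType' object has no attribute 'find_all'

I'm looking for a way to run the for loop so that:

a) I can find out exactly which items in the list cause the error, and b) the loop keeps going through the rest of the list despite the errors

Is there a way to do this?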

beautifulsoup
1 Answer
2
votes

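To get past the error you can use try/except. However, it doesn't seem necessary to visit every player's page just to get salaries that are already in the main table. Is there any particular reason you need to request EACH player page?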
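As a rough sketch of the try/except idea (not necessarily how you would want to structure it in the end): the error appears because soup.find() returns None when a page has no matching table, so the loop can check for that, record the player, and move on. The helper names parse_salaries, fetch_page and scrape_all, the timeout value, and the dict/list return shape are my own assumptions; the URL pattern and the class names come from the question.

```python
import requests
from bs4 import BeautifulSoup


def parse_salaries(html):
    """Return the cleaned salary cells, or None if the table is missing."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "player-payroll-1"})
    if table is None:  # soup.find() returns None when nothing matches
        return None
    values = []
    for row in table.find_all("tr"):
        for item in row.find_all("td", {"class": "table-value"}):
            values.append(item.text.replace("\n", "").replace("\t", ""))
    return values


def fetch_page(player):
    """Download one player's salary page (URL pattern from the question)."""
    url = "https://hoopshype.com/player/" + player + "/salary/"
    response = requests.get(url, timeout=10)  # timeout is an assumed value
    response.raise_for_status()
    return response.text


def scrape_all(players, fetch=fetch_page):
    """Scrape every player; record failures instead of stopping."""
    salaries = {}
    failed = []  # (player, reason) pairs: which items caused trouble
    for player in players:
        try:
            values = parse_salaries(fetch(player))
        except requests.RequestException as exc:
            failed.append((player, repr(exc)))
            continue  # keep going with the next player
        if values is None:
            failed.append((player, "salary table not found"))
            continue
        salaries[player] = values
    return salaries, failed
```

Calling salaries, failed = scrape_all(players) and then printing failed should show which players had no table or could not be downloaded. Wrapping the original inner loop in try ... except AttributeError would have a similar effect; the explicit None check just makes the reason easier to see.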

Here is the whole table:

import pandas as pd

df = pd.read_html('https://hoopshype.com/salaries/players/')[0]

Output of the first 10 rows:

print (df.head(10).to_string())
   Unnamed: 0             Player      2019/20      2020/21      2021/22      2022/23 2023/24 2024/25
0         1.0      Stephen Curry  $40,231,758  $43,006,362  $45,780,966           $0      $0      $0
1         2.0  Russell Westbrook  $38,506,482  $41,358,814  $44,211,146  $47,063,478      $0      $0
2         2.0         Chris Paul  $38,506,482  $41,358,814  $44,211,146           $0      $0      $0
3         4.0       James Harden  $38,199,000  $41,254,920  $44,310,840  $47,366,760      $0      $0
4         4.0          John Wall  $38,199,000  $41,254,920  $44,310,840  $47,366,760      $0      $0
5         6.0       LeBron James  $37,436,858  $39,219,566  $41,002,274           $0      $0      $0
6         7.0       Kevin Durant  $37,199,000  $39,058,950  $40,918,900  $42,778,850      $0      $0
7         8.0      Blake Griffin  $34,449,964  $36,810,996  $38,957,028           $0      $0      $0
8         9.0         Kyle Lowry  $33,296,296  $30,500,000           $0           $0      $0      $0
9        10.0        Paul George  $33,005,556  $35,450,412  $37,895,268           $0      $0      $0
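If you only need some of the players, a table like the one above can be filtered instead of looping over pages. This is only a sketch under a few assumptions: the helper name salaries_for is made up, the column names ("Player", "2019/20") are taken from the output shown here and may differ on the live page, and the names must match the table's display names (e.g. "Stephen Curry"), not the URL slugs used in the question.

```python
import pandas as pd


def salaries_for(df, players, season):
    """Return ({player: salary as int}, [players not in the table])."""
    subset = df[df["Player"].isin(players)]
    # Strip "$" and "," so "$40,231,758" becomes 40231758
    amounts = subset[season].str.replace(r"[$,]", "", regex=True).astype(int)
    found = {name: int(value) for name, value in zip(subset["Player"], amounts)}
    missing = [p for p in players if p not in found]
    return found, missing
```

With df loaded as above, found, missing = salaries_for(df, ["Stephen Curry", "Chris Paul"], "2019/20") would give the salaries for that season, and missing lists any names that were not in the table.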