Web scraping: script returns AttributeError: 'NoneType'


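Hey Stack Overflow community!

I am trying to write code that scrapes financial data from a website and logs it to an Excel file. To do that I first need to learn scraping, so I am following FreeCodeCamp's scraping tutorial.

The problem I am running into is a NoneType error.

Here is my code: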

# import libraries
import urllib2
from bs4 import BeautifulSoup

# specify the URL
quote_page = "https://www.bloomberg.com/quote/RY:CN"

# query the website and return the html to the variable "page"
page = urllib2.urlopen(quote_page)

# parse html using beautiful soup and store in variable "soup"
soup = BeautifulSoup(page, "html.parser")

# find the <h1> that holds the company name
name_box = soup.find("h1", attrs={"class": "companyName_99a4824b"})

# strip() removes leading and trailing whitespace
name = name_box.text.strip()
print name

The error shown is as follows:

Traceback (most recent call last):
  File "rbc.py", line 15, in <module>
    name_box = soup.find("h1", attrs={"class": "companyName_99a4824b"}).text
AttributeError: 'NoneType' object has no attribute 'text'

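I believe the problem is that the Python code cannot find "companyName_99a4824b" when it opens the link, but I have confirmed that "companyName_99a4824b" is the name used for that element in the page's HTML.

Thanks for your help!

P.S. "companyName_99a4824b" is the unique name associated with the company name, and it can be found when inspecting the page. The link to the page is https://www.bloomberg.com/quote/RY:CN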

python screen-scraping nonetype
1 Answer

This line:

name_box = soup.find("h1", attrs={"class": "companyName_99a4824b"})

finds no match, so it returns None. In effect, this is what then runs:

None.text.strip()

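Hence the 'NoneType' object has no attribute 'text' error. Modify your code so that find actually finds what you are looking for, which evidently is not present in the page that was parsed.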
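
One way to keep the script from crashing while you work out why the element is missing is to check the result of find before touching .text. The following is only a minimal sketch, written for Python 3 (the question's code is Python 2). The helper names safe_text and company_name are made up for illustration, and the HTML is assumed to be passed in as a string rather than fetched from Bloomberg.

```python
def safe_text(node, default=None):
    """Return node.text stripped of whitespace, or `default` if node is None.

    soup.find() returns None when nothing matches, so this check avoids
    the "'NoneType' object has no attribute 'text'" error.
    """
    if node is None:
        return default
    return node.text.strip()


def company_name(html, css_class):
    """Hypothetical helper: parse `html` and return the text of the first
    <h1> carrying `css_class`, or None when no such element exists."""
    # Imported here so safe_text() stays usable without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    name_box = soup.find("h1", attrs={"class": css_class})
    return safe_text(name_box)
```

With a made-up snippet such as '<h1 class="companyName_99a4824b"> Example Corp </h1>', company_name returns "Example Corp". If it returns None for the real page, the class name is not in the HTML your script received, which is worth confirming by printing or searching the downloaded HTML itself.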
