Beautiful Soup returns "[]"


I am trying to extract company information from the Bloomberg company profile page with the following code:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.bloomberg.com/profile/company/AAPL:US'

source = requests.get(URL)

soup = BeautifulSoup(source.content, 'lxml')

company_name = soup.findAll('h1', class_= 'companyName__9bd88132')

company_description = soup.findAll('div', class_ = 'description__ce057c5c')

print(company_name)
print(company_description)

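However, all I get is two empty lists ("[]"). Answers to similar questions say this happens because the wrong div is being selected, but I don't think that is the case here. Does anyone know why it isn't working? Edit: here is the part of the HTML I am trying to pull from: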

<section class="companyProfileOverview__aa874298 up__e13cf193"><section class="info__d075c560"><h1 class="companyName__9bd88132">Apple Inc</h1><div class="description__ce057c5c">Apple Inc. designs, manufactures, and markets personal computers and related personal computing and mobile communication devices along with a variety of related software, services, peripherals, and networking solutions. Apple sells its products worldwide through its online stores, its retail stores, its direct sales force, third-party wholesalers, and resellers.</div></section><section class="currentPriceContainer"><p class="currentPriceLabel__f1524605">CURRENT PRICE</p><div><div class="inlineRow__7728fc34"><span class="tickerText__d2e1ee30">AAPL:US</span><span class="priceText__0feeaba3">343.99</span><span class="currency__bef924de">USD</span></div><span class="triangle__73a7d8b2 up__a3b61807"></span><div class="inlineRow__7728fc34"><span class="priceChange__5e691975">+10.53</span><span class="percentChange__3c14f7c4">+3.16%</span></div><div class="time__245ca7bb "><span>As of 08:00 PM EDT 06/09/2020 </span></div><a class="quoteLink__d3ac120b" href="/quote/AAPL:US">SEE QUOTE</a></div></section><div class="infoTable__96162ad6"><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">SECTOR</h2><div class="infoTableItemValue__e188b0cb">Technology</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">INDUSTRY</h2><div class="infoTableItemValue__e188b0cb">Hardware</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">SUB-INDUSTRY</h2><div class="infoTableItemValue__e188b0cb">Communications Equipment</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">FOUNDED</h2><div class="infoTableItemValue__e188b0cb">01/03/1977</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">ADDRESS</h2><div 
class="infoTableItemValue__e188b0cb">1 Infinite Loop
Cupertino, CA 95014
United States</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">PHONE</h2><div class="infoTableItemValue__e188b0cb">1-408-996-1010</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">WEBSITE</h2><div class="infoTableItemValue__e188b0cb">www.apple.com</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">NO. OF EMPLOYEES</h2><div class="infoTableItemValue__e188b0cb">100000</div></section></div></section>

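I am trying to extract the company name (companyName__9bd88132) and the company description (description__ce057c5c). Eventually, I would also like to get the industry information.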
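
For the industry fields, here is a minimal sketch, assuming the page HTML has actually been retrieved and has the same structure as the fragment above. It matches classes by prefix (for example `companyName__`) on the assumption that the hashed suffixes may change between site builds. `SAMPLE_HTML`, `parse_profile` and `_text` are hypothetical names introduced for illustration, and the sample is a shortened copy of the fragment.

```python
import re
from bs4 import BeautifulSoup

# Shortened copy of the fragment quoted above (description truncated for brevity).
SAMPLE_HTML = """
<section class="companyProfileOverview__aa874298 up__e13cf193">
<section class="info__d075c560"><h1 class="companyName__9bd88132">Apple Inc</h1>
<div class="description__ce057c5c">Apple Inc. designs, manufactures, and markets personal computers.</div></section>
<div class="infoTable__96162ad6">
<section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">SECTOR</h2><div class="infoTableItemValue__e188b0cb">Technology</div></section>
<section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">INDUSTRY</h2><div class="infoTableItemValue__e188b0cb">Hardware</div></section>
</div></section>
"""


def _text(tag):
    # Collapse internal whitespace and newlines; return None if the tag was not found.
    return " ".join(tag.get_text().split()) if tag else None


def parse_profile(html):
    soup = BeautifulSoup(html, "html.parser")
    # Match on the stable class prefix rather than the full hashed class name.
    name = soup.find("h1", class_=re.compile(r"^companyName__"))
    description = soup.find("div", class_=re.compile(r"^description__"))
    info = {}
    for item in soup.find_all("section", class_=re.compile(r"^infoTableItem__")):
        label, value = item.find("h2"), item.find("div")
        if label and value:
            info[_text(label)] = _text(value)
    return {"name": _text(name), "description": _text(description), "info": info}


profile = parse_profile(SAMPLE_HTML)
print(profile["name"])
print(profile["info"])
```

Each `infoTableItem__` section holds one label/value pair, so the result is a plain dictionary keyed by the labels shown on the page (SECTOR, INDUSTRY, and so on).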

python beautifulsoup python-requests findall
1 Answer

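The empty lists most likely mean the server is not returning the profile page to the default python-requests client. Send browser-like headers with the request: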

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

URL = 'https://www.bloomberg.com/profile/company/AAPL:US'

# Send browser-like headers; ua.random returns a randomly chosen browser User-Agent string.
ua = UserAgent()
hdr = {'User-Agent': ua.random,
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
       'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.8',
       'Connection': 'keep-alive'}
source = requests.get(URL, headers=hdr)

soup = BeautifulSoup(source.content, features='html.parser')

company_name = soup.find_all('h1', class_='companyName__9bd88132')
company_description = soup.find_all('div', class_='description__ce057c5c')

print(company_name)
print(company_description)
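
If the lists are still empty, it may help to check what the server actually sent back before changing the selectors. The helper below is a rough sketch, not a definitive test. `diagnose` is a hypothetical name, and the "captcha" and "robot" keywords are assumptions about what a bot-check page might contain.

```python
def diagnose(status_code, html, marker="companyName__"):
    """Return a short guess at why a scrape came back empty."""
    if status_code != 200:
        return f"request failed or was refused (HTTP {status_code})"
    if marker in html:
        # The expected class prefix is in the response, so the selector is the suspect.
        return "markup present: check the tag name and class passed to find_all"
    lowered = html.lower()
    # Assumed keywords; a real bot-check page may be worded differently.
    if "captcha" in lowered or "robot" in lowered:
        return "server returned a bot-check page instead of the profile"
    return "markup absent: content may be rendered by JavaScript or the class names changed"


print(diagnose(403, ""))
print(diagnose(200, '<h1 class="companyName__9bd88132">Apple Inc</h1>'))
```

With the code above it could be called as `diagnose(source.status_code, source.text)`.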