Python web scraping: find_all("a") within a specific class


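I am new to web scraping and I am working on a small project of my own.

The task is to get the names of the cameras, along with their prices and quick specs

(from: https://www.dpreview.com/products/cameras/all?page=1).

The last of these is on the page I am taken to when I click on a camera.

When I inspect the page, I can see that I have to get the URL from there, but with just: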

for link in soup.find_all("a"):
    print(link.get("href"))

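I get every link on the page (including extras such as login, social media, and so on).

What I want is to get only the links from the specific class mentioned earlier.

Can you help me? (Or at least point me to a tutorial that covers this?)

I am using BeautifulSoup with Python 3.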

python web-scraping beautifulsoup findall
1 Answer

You can find the information you need by anchoring the search to the product listing, that is, to the td elements with class="product":

import typing

import requests
from bs4 import BeautifulSoup

class Camera(typing.NamedTuple):
    info: typing.List[str]  # [camera name, product page URL]
    quicklook: str
    price: str

def text_or_na(tag):
    # find() returns None when a cell lacks the div, so fall back to 'N/A'
    return getattr(tag, 'text', 'N/A')

url = 'https://www.dpreview.com/products/cameras/all?page=1'
page = BeautifulSoup(requests.get(url).text, 'html.parser')
final_result = []
# Every camera is listed in its own <td class="product">, so only those cells are searched
for cell in page.find_all('td', {'class': 'product'}):
    name, specs, price = (text_or_na(cell.find('div', {'class': c})) for c in ('name', 'specs', 'prices'))
    final_result.append(Camera([name, cell.a['href']], specs, price))

Output:

[Camera(info=['Fujifilm X-T3', 'https://www.dpreview.com/products/fujifilm/slrs/fujifilm_xt3'], quicklook='26 megapixels | 3″ screen | APS-C sensor', price='$1,499.00 - $2,898.00'), Camera(info=['Canon EOS R', 'https://www.dpreview.com/products/canon/slrs/canon_eos_r'], quicklook='30 megapixels | 3.2″ screen | Full frame sensor', price='Check prices'), Camera(info=['Sony Cyber-shot DSC-HX99', 'https://www.dpreview.com/products/sony/compacts/sony_dschx99'], quicklook='18 megapixels | 3″ screen | 24 – 720 mm (30×)', price='Check prices'), Camera(info=['Sony Cyber-shot DSC-HX95', 'https://www.dpreview.com/products/sony/compacts/sony_dschx95'], quicklook='18 megapixels | 3″ screen | 24 – 720 mm (30×)', price='Check prices'), Camera(info=['Nikon D3500', 'https://www.dpreview.com/products/nikon/slrs/nikon_d3500'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$496.95 - $596.95'), Camera(info=['Nikon Z6', 'https://www.dpreview.com/products/nikon/slrs/nikon_z6'], quicklook='25 megapixels | 3.2″ screen | Full frame sensor', price='$1,996.95 - $3,443.90'), Camera(info=['Nikon Z7', 'https://www.dpreview.com/products/nikon/slrs/nikon_z7'], quicklook='46 megapixels | 3.2″ screen | Full frame sensor', price='$3,396.95 - $4,843.90'), Camera(info=['Panasonic Lumix DC-LX100 II', 'https://www.dpreview.com/products/panasonic/compacts/panasonic_dclx100ii'], quicklook='17 megapixels | 3″ screen | 24 – 75 mm (3.1×)', price='$997.99 - $1,095.98'), Camera(info=['Leica M10-P', 'https://www.dpreview.com/products/leica/slrs/leica_m10_p'], quicklook='24 megapixels | 3″ screen | Full frame sensor', price='Check prices'), Camera(info=['Canon PowerShot SX740 HS', 'https://www.dpreview.com/products/canon/compacts/canon_sx740hs'], quicklook='21 megapixels | 3″ screen | 24 – 960 mm (40×)', price='$399.00'), Camera(info=['Fujifilm XF10', 'https://www.dpreview.com/products/fujifilm/compacts/fujifilm_xf10'], quicklook='24 megapixels | 3″ screen', price='$499.00'), Camera(info=['Sony 
Cyber-shot DSC-RX100 V(A)', 'https://www.dpreview.com/products/sony/compacts/sony_dscrx100m5a'], quicklook='20 megapixels | 3″ screen | 24 – 70 mm (2.9×)', price='$898.00 - $1,096.00'), Camera(info=['Nikon Coolpix P1000', 'https://www.dpreview.com/products/nikon/compacts/nikon_cpp1000'], quicklook='16 megapixels | 3.2″ screen | 24 – 3000 mm (125×)', price='$996.95 - $1,041.90'), Camera(info=['Fujifilm instax mini 90 NEO CLASSIC', 'https://www.dpreview.com/products/fujifilm/compacts/fujifilm_instax_mini_90'], quicklook='N/A', price='$112.00 - $121.30'), Camera(info=['Leica C-Lux', 'https://www.dpreview.com/products/leica/compacts/leica_c-lux_2018'], quicklook='20 megapixels | 3″ screen | 24 – 360 mm (15×)', price='Check prices'), Camera(info=['Sony Cyber-shot DSC-RX100 VI', 'https://www.dpreview.com/products/sony/compacts/sony_dscrx100m6'], quicklook='20 megapixels | 3″ screen | 24 – 200 mm (8.3×)', price='$1,198.00 - $1,265.16'), Camera(info=['Fujifilm X-T100', 'https://www.dpreview.com/products/fujifilm/slrs/fujifilm_xt100'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$599.00 - $899.00'), Camera(info=['Panasonic Lumix DC-TS7 (Lumix DC-FT7)', 'https://www.dpreview.com/products/panasonic/compacts/panasonic_dcts7'], quicklook='20 megapixels | 3″ screen | 28 – 128 mm (4.6×)', price='$447.99'), Camera(info=['GoPro Hero (2018)', 'https://www.dpreview.com/products/gopro/actioncams/gopro_hero_2018'], quicklook='10 megapixels | Compact sensor', price='$189.90 - $233.39'), Camera(info=['Sony Alpha a7 III', 'https://www.dpreview.com/products/sony/slrs/sony_a7iii'], quicklook='24 megapixels | 3″ screen | Full frame sensor', price='$1,998.00 - $4,396.00'), Camera(info=['Canon EOS M50 (EOS Kiss M)', 'https://www.dpreview.com/products/canon/slrs/canon_eosm50'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$629.00 - $948.48'), Camera(info=['Canon EOS Rebel T7 (EOS 2000D)', 'https://www.dpreview.com/products/canon/slrs/canon_eos2000d'], 
quicklook='24 megapixels | 3″ screen | APS-C sensor', price='Check prices'), Camera(info=['Canon EOS 4000D', 'https://www.dpreview.com/products/canon/slrs/canon_eos4000d'], quicklook='18 megapixels | 2.7″ screen | APS-C sensor', price='Check prices'), Camera(info=['Pentax K-1 Mark II', 'https://www.dpreview.com/products/pentax/slrs/pentax_k1ii'], quicklook='36 megapixels | 3.2″ screen | Full frame sensor', price='$1,896.95 - $2,296.95'), Camera(info=['Fujifilm X-H1', 'https://www.dpreview.com/products/fujifilm/slrs/fujifilm_xh1'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$2,648.00 - $3,548.00'), Camera(info=['Panasonic Lumix DC-GX9', 'https://www.dpreview.com/products/panasonic/slrs/panasonic_dcgx9'], quicklook='20 megapixels | 3″ screen | Four Thirds sensor', price='Check prices'), Camera(info=['Panasonic Lumix DC-ZS200 (Lumix DC-TZ200)', 'https://www.dpreview.com/products/panasonic/compacts/panasonic_dczs200'], quicklook='20 megapixels | 3″ screen | 24 – 360 mm (15×)', price='Check prices'), Camera(info=['Olympus PEN E-PL9', 'https://www.dpreview.com/products/olympus/slrs/olympus_epl9'], quicklook='16 megapixels | 3″ screen | Four Thirds sensor', price='$599.00 - $699.00'), Camera(info=['Panasonic Lumix DC-GF10 (GF90)', 'https://www.dpreview.com/products/panasonic/slrs/panasonic_dcgf10'], quicklook='16 megapixels | 3″ screen | Four Thirds sensor', price='Check prices'), Camera(info=['Fujifilm X-A5', 'https://www.dpreview.com/products/fujifilm/slrs/fujifilm_xa5'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$599.00 - $799.00'), Camera(info=['Fujifilm FinePix XP130', 'https://www.dpreview.com/products/fujifilm/compacts/fujifilm_xp130'], quicklook='16 megapixels | 3″ screen | 28 – 140 mm (5×)', price='$169.00 - $179.00'), Camera(info=['Panasonic Lumix DC-GH5S', 'https://www.dpreview.com/products/panasonic/slrs/panasonic_dcgh5s'], quicklook='10 megapixels | 3.2″ screen | Four Thirds sensor', price='$2,297.99 - $3,395.98'), 
Camera(info=['Leica CL', 'https://www.dpreview.com/products/leica/slrs/leica_cl'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$3,995.00'), Camera(info=['Panasonic Lumix DC-G9', 'https://www.dpreview.com/products/panasonic/slrs/panasonic_dcg9'], quicklook='20 megapixels | 3″ screen | Four Thirds sensor', price='$1,497.99 - $3,995.98'), Camera(info=['Rylo Camera', 'https://www.dpreview.com/products/rylo/actioncams/rylo_camera'], quicklook='N/A', price='$497.85 - $499.00'), Camera(info=['Xiaomi Mi Sphere 3.5K', 'https://www.dpreview.com/products/xiaomi/actioncams/xiaomi_mi_sphere_3p5k'], quicklook='16 megapixels | 2 lens(es) | Compact sensor', price='Check prices'), Camera(info=['Sony Alpha a7R III', 'https://www.dpreview.com/products/sony/slrs/sony_a7riii'], quicklook='42 megapixels | 3″ screen | Full frame sensor', price='$3,213.00 - $3,997.00'), Camera(info=['Canon PowerShot G1 X Mark III', 'https://www.dpreview.com/products/canon/compacts/canon_g1xiii'], quicklook='24 megapixels | 3″ screen | 24 – 72 mm (3×)', price='$1,099.00'), Camera(info=['GoPro Hero6 Black', 'https://www.dpreview.com/products/gopro/actioncams/gopro_hero6_black'], quicklook='12 megapixels', price='$401.97 - $412.98'), Camera(info=['Sony Cyber-shot DSC-RX10 IV', 'https://www.dpreview.com/products/sony/compacts/sony_dscrx10iv'], quicklook='20 megapixels | 3″ screen | 24 – 600 mm (25×)', price='$1,698.00 - $1,788.93'), Camera(info=['Fujifilm X-E3', 'https://www.dpreview.com/products/fujifilm/slrs/fujifilm_xe3'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$899.00 - $1,449.23'), Camera(info=['Ricoh Theta V', 'https://www.dpreview.com/products/ricoh/actioncams/ricoh_theta_v'], quicklook='12 megapixels | 2 lens(es) | Compact sensor', price='$396.99 - $616.98'), Camera(info=['Olympus OM-D E-M10 III', 'https://www.dpreview.com/products/olympus/slrs/oly_em10iii'], quicklook='16 megapixels | 3″ screen | Four Thirds sensor', price='$549.00 - $799.00'), 
Camera(info=['Sony DSC-RX0', 'https://www.dpreview.com/products/sony/compacts/sony_dscrx0'], quicklook='15 megapixels | 1.5″ screen', price='$598.00 - $646.00'), Camera(info=['Canon EOS M100', 'https://www.dpreview.com/products/canon/slrs/canon_eosm100'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$449.00 - $679.00'), Camera(info=['Nikon D850', 'https://www.dpreview.com/products/nikon/slrs/nikon_d850'], quicklook='45 megapixels | 3.2″ screen | Full frame sensor', price='$3,296.95 - $6,896.90'), Camera(info=['Leica TL2', 'https://www.dpreview.com/products/leica/slrs/leica_tl2'], quicklook='24 megapixels | 3.7″ screen | APS-C sensor', price='$2,195.00'), Camera(info=['Canon EOS 6D Mark II', 'https://www.dpreview.com/products/canon/slrs/canon_eos6dmkii'], quicklook='26 megapixels | 3″ screen | Full frame sensor', price='$1,599.00 - $2,848.00'), Camera(info=['Canon EOS Rebel SL2 (EOS 200D / Kiss X9)', 'https://www.dpreview.com/products/canon/slrs/canon_eos200d'], quicklook='24 megapixels | 3″ screen | APS-C sensor', price='$549.00 - $948.00'), Camera(info=['Nikon Coolpix W300', 'https://www.dpreview.com/products/nikon/compacts/nikon_cpw300'], quicklook='16 megapixels | 3″ screen | 24 – 120 mm (5×)', price='$386.95')]
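
To see the scoping idea on its own, without hitting the network, here is a minimal sketch on a hand-written HTML fragment. The markup is an assumption modeled on the class names used in the answer above (td with class "product", div with class "name"); it is not a copy of the real page, and the example.com URLs and the helper names product_links and product_links_css are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: class names mirror the answer above, URLs are made up.
SAMPLE_HTML = """
<a href="/login">Log in</a>
<table><tr>
  <td class="product">
    <div class="name"><a href="https://example.com/cam-a">Camera A</a></div>
    <div class="specs">24 megapixels</div>
  </td>
  <td class="product">
    <div class="name"><a href="https://example.com/cam-b">Camera B</a></div>
  </td>
</tr></table>
<a href="/twitter">Twitter</a>
"""


def product_links(html):
    """Return the hrefs of <a> tags located inside <td class="product"> cells."""
    page = BeautifulSoup(html, "html.parser")
    links = []
    for cell in page.find_all("td", class_="product"):
        # Calling find_all on a tag searches only that tag's descendants,
        # so the login and social media links outside the cells are skipped.
        for anchor in cell.find_all("a", href=True):
            links.append(anchor["href"])
    return links


def product_links_css(html):
    """Same result expressed as a single CSS selector."""
    page = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in page.select("td.product a[href]")]


print(product_links(SAMPLE_HTML))
```

Both helpers return only the two camera links and leave out /login and /twitter. Which form to use is mostly a matter of taste: the nested find_all keeps each cell available for further lookups (specs, prices), while the selector is shorter when only the links are needed.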