Problem scraping prices from a website with Python

Question (votes: 0, answers: 2)

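I have been following a guide, but I still cannot select the price from the website. I want to select both the product name and the product price.

I can select the name, and it appears in the console. The price, however, comes back as None (a NoneType problem), and I cannot see what I am doing wrong.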

page = requests.get('https://www.wickes.co.uk/search?text=brick')
soup = BeautifulSoup(page.content, 'html.parser')
all_bricks = soup.find(class_='products-list products-list-v2')

items = all_bricks.find(class_='card product-card')
items_name = all_bricks.find(class_='product-card__title product-card__title-v2')

price_box = items.find("div", attrs={"class": "product-card__price-value "})
price = price_box
print (price)
python web screen-scraping
2 Answers
1
vote

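OK, there are two problems here:

  1. You added an extra space to the class name. BeautifulSoup splits the class attribute on whitespace when it parses the HTML, so a search string with a trailing space matches nothing and find() returns None.
  2. You are not using .text to get the price out of the tag.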
from bs4 import BeautifulSoup
import requests

page = requests.get('https://www.wickes.co.uk/search?text=brick')
soup = BeautifulSoup(page.content, 'html.parser')
all_bricks = soup.find(class_='products-list products-list-v2')

items = all_bricks.find(class_='card product-card')
items_name = all_bricks.find(class_='product-card__title product-card__title-v2')

price_box = items.find("div", attrs={"class": "product-card__price-value"})  # extra space removed
price = price_box.text  # .text returns the tag's text instead of the tag itself
print(price)
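
The effect of the trailing space can be checked offline, without hitting the site. The following is a minimal sketch: the HTML snippet and the price text are made up for illustration (only the class names come from the question), and it assumes the real page writes the class attribute in a similar way.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; only the class names come from the question.
html = """
<div class="card product-card">
  <a class="product-card__title product-card__title-v2">Sample Brick</a>
  <div class="product-card__price-value ">0.75</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching with the trailing space: no class token equals this string.
with_space = soup.find("div", attrs={"class": "product-card__price-value "})

# Searching without it: matches the parsed class token.
without_space = soup.find("div", attrs={"class": "product-card__price-value"})

# The class attribute is stored as a list of whitespace-separated tokens.
stored_classes = without_space["class"]

print(with_space)
print(stored_classes)
print(without_space.get_text(strip=True))
```

Even though the markup itself contains the trailing space, the parsed tag holds only the bare class token, which is why the search string has to be written without it.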

0
votes

To get all the names and prices, you can search for them directly.

from bs4 import BeautifulSoup
import requests

page = requests.get('https://www.wickes.co.uk/search?text=brick')
soup = BeautifulSoup(page.content, 'lxml')  # requires the lxml package
names = [x.text.strip() for x in soup.find_all('a', {'class': 'product-card__title product-card__title-v2'})]
# No trailing space in the class name, otherwise nothing matches
prices = [x.text.strip() for x in soup.find_all('div', {'class': 'product-card__price-value'})]
print(names[0], prices[0])
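
Two separate lists can drift out of step if one product has no price element. A possible alternative, sketched below, is to walk the product cards one at a time and read the name and price from each. The helper name `extract_products` and the `sample_html` snippet are hypothetical; the CSS selectors assume the class names shown in the question, and the live page may differ.

```python
from bs4 import BeautifulSoup


def extract_products(html):
    """Return (name, price) pairs, one per product card; price is None when absent."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("div.product-card"):
        name_tag = card.select_one(".product-card__title")
        price_tag = card.select_one(".product-card__price-value")
        if name_tag is None:
            continue  # skip cards that have no title at all
        price = price_tag.get_text(strip=True) if price_tag is not None else None
        products.append((name_tag.get_text(strip=True), price))
    return products


# Hypothetical markup for illustration; the second card has no price.
sample_html = """
<div class="products-list products-list-v2">
  <div class="card product-card">
    <a class="product-card__title product-card__title-v2">Brick A</a>
    <div class="product-card__price-value ">0.75</div>
  </div>
  <div class="card product-card">
    <a class="product-card__title product-card__title-v2">Brick B</a>
  </div>
</div>
"""

products = extract_products(sample_html)
print(products)
```

With the live site, the same helper could be called as `extract_products(page.content)`. CSS selectors match individual class tokens, so whitespace in the attribute is not an issue here.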