Scraping with bs4 gives two values for one variable, how can I use just one?


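I want to get the price from different Amazon pages. The problem is that when a product is an "Amazon's Choice" item, the price is in a different "div". How can I check whether the price is in one tag or the other, and then save that value as the price? After that, I want to convert the price to an int so it is ready to be saved to a cell in an xlsx file. The code below runs without errors, but I can't get rid of the None in the printed result. If I try replace I get an error, and if I use get_text I get an error. I don't know what else to try.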

import requests
from bs4 import BeautifulSoup as soup

URL = 'https://www.amazon.com.mx/dp/B07MJP47M5'

headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'}

page = requests.get(URL, headers=headers)
scrap = soup(page.content, 'html.parser')

price_1 = scrap.find(id='priceblock_ourprice')
if price_1 is not None:
    price_1 = price_1.get_text()

price_2 = scrap.find(id='priceblock_saleprice')
if price_2 is not None:
    price_2 = price_2.get_text()

# One of the two entries is None when the page only contains the other element
price = (price_1, price_2)
print(price)

Tags: python-3.x, web-scraping, beautifulsoup
1 Answer

I have solved this problem.

import requests
from bs4 import BeautifulSoup as soup

URL = 'https://www.amazon.com.mx/dp/B07T514NTL'

headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'}

page = requests.get(URL, headers=headers)
scrap = soup(page.content, 'html.parser')

# Extract the price and format it: remove the comma, the decimals, and the currency symbol

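# Try the regular price element first, then fall back to the sale price element
price_tag = scrap.find(id='priceblock_ourprice')
if price_tag is None:
    price_tag = scrap.find(id='priceblock_saleprice')

price_text = price_tag.get_text()

# Skip the leading currency symbol and keep at most nine characters
trimmed_price = price_text[1:10]
# Remove the thousands separator
price_without_commas = trimmed_price.replace(',', '')
# Keep only the part before the decimal point
sep = '.'
converted_price = price_without_commas.split(sep, 1)[0]

print(converted_price)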
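
The same fallback idea can be wrapped in two small helpers, which also covers the int conversion asked for in the question. This is only a sketch under some assumptions: the names first_price_text, price_to_int and PRICE_IDS are hypothetical, the two ids are the ones used above, and the price text is assumed to use a comma as the thousands separator and a period before the decimals (for example '$1,299.00').

```python
import re

# Hypothetical constant: the two element ids tried in the answer above.
PRICE_IDS = ('priceblock_ourprice', 'priceblock_saleprice')


def first_price_text(scrap, ids=PRICE_IDS):
    """Return the text of the first element found among ids, or None."""
    for element_id in ids:
        tag = scrap.find(id=element_id)
        if tag is not None:
            return tag.get_text()
    return None


def price_to_int(text):
    """Convert a string such as '$1,299.00' to an int, dropping the decimals.

    Assumes ',' is the thousands separator and '.' starts the decimals.
    """
    match = re.search(r'\d[\d,]*', text)
    if match is None:
        raise ValueError(f'no price found in {text!r}')
    return int(match.group().replace(',', ''))
```

With the scrap object from the code above, first_price_text(scrap) returns the price text or None when neither element exists, so the None case can be handled before calling price_to_int. The resulting int can then be written to a spreadsheet cell.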