Parsing HTML returns [] as output

Question   Votes: 2   Answers: 1

Good morning, I am trying to extract the name and price of each jar of jam from the website https://shop.fattoriaterranova.it/it/14-marmellate.

Here is my code:

#import modules
import urllib.request, urllib.parse, urllib.error
from urllib import request
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

#BeautifulSoup & url
url = 'https://shop.fattoriaterranova.it/it/14-marmellate'
html = request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html,"html.parser")

results = soup.find(id='product_list')
products = results.find_all('ul', class_='product_list grid row')

print(products)

for product in products:
    price_elem = product.find('span', class_='price product-price')
    prod_elem = product.find('a', class_='product-name')
    if None in (price_elem, prod_elm):
        continue
    print(price_elem.strip())
    print(prod_elem.strip())
    print(results.strip())

The output I get is:

[ ]

What am I doing wrong?

Thanks

python-3.x parsing html-parsing
1 Answer
1 vote

You are searching for an element with a given class inside the element that already has that class.

results = soup.find(id='product_list')
products = results.find_all('ul', class_='product_list grid row')

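results is already the product_list element, and find_all only searches its descendants, so looking for product_list again inside it returns nothing. Instead, search for the li elements that contain the price and product information: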

#import modules
import urllib.request, urllib.parse, urllib.error
from urllib import request
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

#BeautifulSoup & url
url = 'https://shop.fattoriaterranova.it/it/14-marmellate'
html = request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html,"html.parser")
results = soup.find(id='product_list')
products = results.find_all('li', class_='ajax_block_product')

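for product in products:
    price_elem = product.find('span', class_='price product-price')
    prod_elem = product.find('a', class_='product-name')
    # Skip items missing either element; calling .string on None would raise AttributeError
    if price_elem is None or prod_elem is None:
        continue
    price, name = price_elem.string, prod_elem.string
    if price and name:
        print(f"{name}: {price}")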

Output:

Marmellata di Limoni 212 ml :  3,50 € 
Marmellata di Limoni 314 ml :  4,00 € 
Marmellata di Arance 212 ml :  3,50 € 
Marmellata di Mandarini :  4,50 € 
Marmellata di Arance 106 ml :  2,70 € 
Marmellata di Arance 314 ml :  4,00 € 
Marmellata di Pesche 314 ml :  4,50 € 
Marmellata di Arance Amare 212 ml :  3,50 € 
Marmellata di Albicocche 314 ml :  4,50 € 
Marmellata di Amarene 212 ml :  6,00 € 
Marmellata di Limoni 106 ml :  2,70 € 
Marmellata di Fichi 212 ml :  5,00 € 
Confettura di Peperoncini piccanti :  4,00 € 
Marmellata di Limoni & Zenzero 212 ml :  4,00 € 
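
The empty list can also be reproduced offline, which may make the cause easier to see. The sketch below is not the real page: SAMPLE_HTML is a hypothetical snippet that assumes the ul itself carries id="product_list" and the classes being searched for, as the answer describes, and extract is an illustrative helper. It shows that find_all looks only at descendants of the element it is called on, and it uses get_text(strip=True) to trim whitespace, since Tag objects do not provide a strip() method.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the structure described in the answer:
# the <ul> itself has id="product_list" and the classes being searched for.
SAMPLE_HTML = """
<ul id="product_list" class="product_list grid row">
  <li class="ajax_block_product">
    <a class="product-name" href="#"> Marmellata di Limoni 212 ml </a>
    <span class="price product-price"> 3,50 € </span>
  </li>
  <li class="ajax_block_product">
    <a class="product-name" href="#"> Marmellata di Arance 212 ml </a>
    <span class="price product-price"> 3,50 € </span>
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
results = soup.find(id="product_list")

# find_all() searches the descendants of `results`, never `results` itself,
# so asking for the same <ul> again finds nothing.
same_class = results.find_all("ul", class_="product_list grid row")
print(same_class)

# Searching for the <li> children returns one Tag per product.
items = results.find_all("li", class_="ajax_block_product")


def extract(item):
    """Return (name, price) for one product <li>, or None if either is missing."""
    name = item.find("a", class_="product-name")
    price = item.find("span", class_="price product-price")
    if name is None or price is None:
        return None
    # get_text(strip=True) returns the text without surrounding whitespace.
    return name.get_text(strip=True), price.get_text(strip=True)


rows = [row for row in map(extract, items) if row is not None]
for name, price in rows:
    print(f"{name}: {price}")
```

If the live page is structured differently from this assumption, the class names and the id would need to be adjusted after inspecting the actual HTML.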