A BeautifulSoup solution that gets all selling prices and seller names from the View All Offers tab could look like this:
from bs4 import BeautifulSoup
from requests import get

url = 'https://www.noon.com/uae-en/iphone-11-with-facetime-black-128gb-4g-lte-international-specs/N29884715A/p?o=eaf72ceb0dd3bc9f'
resp = get(url).text  # raw HTML of the product page
soup = BeautifulSoup(resp, 'lxml')  # the 'lxml' parser requires the lxml package

# Each offer is an <li class="item">; print its selling price, then its seller name.
for offer in soup.find_all("li", class_="item"):
    print(offer.find("span", class_="sellingPrice").find("span", class_="value").text)
    print(offer.find("div", class_="sellerDetails").strong.text)
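The loop above raises an AttributeError as soon as one offer lacks a price or seller element, because find() returns None in that case. A minimal sketch of a more defensive variant follows. It assumes the same class names as above and runs against a small hypothetical HTML snippet (SAMPLE_HTML and extract_offers are illustrative names, and the markup is not copied from the live page), so it can be tried without a network request:

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the class names used above; not taken from the live page.
SAMPLE_HTML = """
<ul class="offersList">
  <li class="item">
    <span class="sellingPrice"><span class="value">2999.00</span></span>
    <div class="sellerDetails">Sold by <strong>Seller A</strong></div>
  </li>
  <li class="item">
    <span class="sellingPrice"><span class="value">3049.00</span></span>
  </li>
</ul>
"""


def extract_offers(html):
    """Return one dict per offer; a missing field becomes None instead of raising."""
    soup = BeautifulSoup(html, "html.parser")  # built-in parser, no lxml needed
    offers = []
    for item in soup.find_all("li", class_="item"):
        price = item.select_one("span.sellingPrice span.value")
        seller = item.select_one("div.sellerDetails strong")
        offers.append({
            "sellingPrice": price.get_text(strip=True) if price else None,
            "seller": seller.get_text(strip=True) if seller else None,
        })
    return offers


offers = extract_offers(SAMPLE_HTML)
for offer in offers:
    print(offer)
```

Returning a list of dicts rather than printing directly keeps each price attached to its seller, which makes it easier to write the result to CSV or JSON later.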
A Scrapy solution could look like this:
import scrapy


class noonSpider(scrapy.Spider):
    name = "noon"
    start_urls = ['https://www.noon.com/uae-en/iphone-11-with-facetime-black-128gb-4g-lte-international-specs/N29884715A/p?o=eaf72ceb0dd3bc9f']

    def parse(self, response):
        # Yields a single item holding two parallel lists: all prices and all seller names.
        yield {
            'sellingPrice': response.css('.offersList .sellingPrice .value::text').getall(),
            'seller': response.css('.offersList .sellerDetails strong::text').getall(),
        }
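
The spider yields one item containing two parallel lists. If one record per offer is preferred, the lists can be paired up afterwards. The sketch below is plain Python and assumes both lists have the same order and length; pair_offers is a hypothetical helper name, and the sample values stand in for the two .getall() results:

```python
def pair_offers(prices, sellers):
    """Combine two parallel lists into one dict per offer, stripping whitespace."""
    # zip() stops at the shorter list, so an unmatched trailing value is dropped.
    return [
        {"sellingPrice": price.strip(), "seller": seller.strip()}
        for price, seller in zip(prices, sellers)
    ]


# Hypothetical values standing in for the two .getall() results.
records = pair_offers([" 2999.00 ", "3049.00"], ["Seller A", "Seller B "])
print(records)
```

Inside parse, the same helper could be used as `yield from pair_offers(prices, sellers)` so that Scrapy exports one row per offer. This only works if every offer on the page has both a price and a seller.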