Scraping stops partway through

Question

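I am trying to scrape a product list with BeautifulSoup. The site lists 80 products. My code works, but it stops at the 32nd product. How can I scrape all of the products?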

import requests
from bs4 import BeautifulSoup

from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.dbsparta

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get('https://www.stories.com/kr_krw/top-sellers/top-sellers.html', headers=headers)

soup = BeautifulSoup(data.text, 'html.parser')
#image = #category-list > div:nth-child(1) > a > div.product-image > div > img.a-image.default-image -> src attr.
#name = #category-list > div:nth-child(1) > a > div.description > div.product-title > label -> text
#price = #category-list > div:nth-child(1) > a > div.description > div.m-product-price > label -> text

products = soup.select('#category-list > div.o-product')

for product in products:
    image = product.select_one('div.product-image > div > img.a-image.default-image')['src']
    name = product.select_one('div.description > div.product-title > label').text
    price = product.select_one('div.description > div.m-product-price > label').text
    print(image,name,price)
python web-scraping beautifulsoup web-crawler
1 Answer

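The remaining products are loaded dynamically via JavaScript, so they are not in the HTML of the initial page. You can simulate the requests that the page makes by using the requests module.

For example: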

import requests
from bs4 import BeautifulSoup

url = 'https://www.stories.com/kr_krw/top-sellers/top-sellers.html'
ajax_url = 'https://www.stories.com/kr_krw/dpa/aosCtgrItemAddList.html'

# Load the listing page once to read the category codes embedded in it.
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

dispLcatCd = soup.select_one('#dispLcatCd')['value']
dispMcatCd = soup.select_one('#dispMcatCd')['value']

# Form data sent with each AJAX request, starting at page 1.
data = {
    'sect_id': dispMcatCd,
    'dispLcatCd': dispLcatCd,
    'dispMcatCd': dispMcatCd,
    'pageNum': 1,
    'viewCnt': 32,
    }

while True:
    print('Processing page {}...'.format(data['pageNum']))
    soup = BeautifulSoup(requests.post(ajax_url, data=data).content, 'html.parser')

    # Stop when a page comes back with no products.
    if not soup.select('.o-product'):
        break

    for title, img, price in zip(soup.select('.product-title'),
                                 soup.select('.default-image'),
                                 soup.select('.price')):
        print('{:<50} {:<10} {}'.format(title.get_text(strip=True), price.get_text(strip=True), img['src']))

    # Move on to the next page.
    data['pageNum'] += 1

Prints:

Processing page 1...
버튼 맥시 스트랩 드레스                                      129,000    https://image.thehyundai.com/static/4/4/1/14/A1/hnm40A1141441_01_0864704_001_002_568.jpg
하프 문 스트로 크로스바디 백                                   69,000     https://image.thehyundai.com/static/9/8/0/07/A1/hnm40A1070896_02_0838559_001_001_568.jpg
러플 코튼 도비 미디 드레스                                    119,000    https://image.thehyundai.com/static/4/6/7/87/A0/hnm40A0877646_01_0727841_001_001_568.jpg
스트라이프 스트랩 레더 샌들                                    89,000     https://image.thehyundai.com/static/9/1/3/14/A1/hnm40A1143195_03_0852209_001_001_568.jpg
플로럴 미디 랩 드레스                                       110,000    https://image.thehyundai.com/static/4/6/7/04/A1/hnm40A1047640_01_0680108_003_001_568.jpg
오가닉 펄 오픈 후프 이어링                                    35,000     https://image.thehyundai.com/static/3/3/0/99/A0/hnm40A0990337_02_0846451_001_001_568.jpg
플리츠 미디 스커트                                         89,000     https://image.thehyundai.com/static/6/5/7/13/A1/hnm40A1137566_01_0883732_001_002_568.jpg
플로럴 프린트 맥시 드레스                                     129,000    https://image.thehyundai.com/static/7/7/6/10/A1/hnm40A1106773_01_0493476_006_001_568.jpg
깅엄 시어서커 스윔수트                                       79,000     https://image.thehyundai.com/static/4/5/3/14/A1/hnm40A1143548_02_0882631_001_001_568.jpg
리넨 쇼츠                                              79,000     https://image.thehyundai.com/static/9/3/6/13/A1/hnm40A1136396_01_0883866_001_001_568.jpg
패디드 레더 샌들                                          110,000    https://image.thehyundai.com/static/7/4/4/13/A1/hnm40A1134475_02_0851451_001_001_568.jpg
리본 브림 우븐 스트로 햇                                     39,000     https://image.thehyundai.com/static/4/2/8/02/A1/hnm40A1028243_02_0848837_001_001_568.jpg
프릴 퍼프 슬리브 니트 탑                                     69,000     https://image.thehyundai.com/static/6/5/3/14/A1/hnm40A1143560_01_0886586_002_001_568.jpg
레더 스트래피 레이스 업 힐 샌들                                 119,000    https://image.thehyundai.com/static/0/3/8/88/A0/hnm40A0888301_02_0731706_003_001_568.jpg
프릴 크레이프 시폰 미디 드레스                                  129,000    https://image.thehyundai.com/static/1/7/2/12/A1/hnm40A1122717_01_0866370_001_001_568.jpg
슬리브리스 프릴 블라우스                                      59,000     https://image.thehyundai.com/static/9/7/8/13/A1/hnm40A1138796_0864574001_202001_LB_0020_Q8_L_1120x868_srgb_568.jpg
캔버스 토프 블러셔                                         15,000     https://image.thehyundai.com/static/9/3/2/41/A0/hnm40A0412390_02_0148486_010_001_568.jpg
스캘럽 헴 리넨 쇼츠                                        89,000     https://image.thehyundai.com/static/7/5/7/13/A1/hnm40A1137570_01_0902708_001_001_568.jpg
트윌 슬링백 버클 샌들                                       79,000     https://image.thehyundai.com/static/9/1/3/14/A1/hnm40A1143199_02_0888324_001_001_568.jpg
에이시메트릭 랩 미디 드레스                                    110,000    https://image.thehyundai.com/static/6/0/5/08/A1/hnm40A1085069_0853055001_202001_LB_0982_Q8_L_1120x868_srgb_568.jpg
피티드 스모크 스커트                                        79,000     https://image.thehyundai.com/static/9/3/6/13/A1/hnm40A1136395_01_0784826_011_001_568.jpg
캔버스 토트 백                                           110,000    https://image.thehyundai.com/static/9/4/7/09/A1/hnm40A1097499_02_0838566_002_001_568.jpg
마이크로 플로럴 랩 미니 드레스                                  79,000     https://image.thehyundai.com/static/1/7/2/12/A1/hnm40A1122713_01_0751883_004_001_568.jpg
피티드 스모크 셔츠                                         110,000    https://image.thehyundai.com/static/9/3/6/13/A1/hnm40A1136394_01_0859052_001_002_568.jpg
오픈 백 점프수트                                          110,000    https://image.thehyundai.com/static/1/6/0/13/A1/hnm40A1130610_01_0917714_001_001_568.jpg
벨티드 퍼프 슬리브 미디 드레스                                  119,000    https://image.thehyundai.com/static/9/7/8/13/A1/hnm40A1138797_0874015002_202001_LB_0175_Q8_L_1120x868_srgb_568.jpg
리넨 퍼프 슬리브 미니 드레스                                   119,000    https://image.thehyundai.com/static/5/5/3/14/A1/hnm40A1143553_01_0900126_001_001_568.jpg
자카드 랩 맥시 드레스                                       129,000    https://image.thehyundai.com/static/4/9/6/12/A1/hnm40A1126946_0887633001_202001_LB_0862_Q8_L_1120x868_srgb_568.jpg
오버사이즈 벨티드 리넨 점프수트                                  119,000    https://image.thehyundai.com/static/6/5/7/13/A1/hnm40A1137564_01_0871142_001_001_568.jpg
오버사이즈 버튼 셔츠 드레스                                    79,000     https://image.thehyundai.com/static/0/4/6/13/A1/hnm40A1136405_0880475001_202002_LB_1106_Q8_L_1120x868_srgb_568.jpg
패디드 레더 슬링백 샌들                                      119,000    https://image.thehyundai.com/static/7/4/4/13/A1/hnm40A1134476_02_0876166_002_001_568.jpg
듀오 톤 레더 크로스바디 백                                    225,000    https://image.thehyundai.com/static/5/7/3/99/A0/hnm40A0993757_02_0775965_001_001_568.jpg
Processing page 2...
리넨 블렌드 블레이저                                        119,000    https://image.thehyundai.com/static/1/3/5/12/A1/hnm40A1125312_01_0852710_001_001_568.jpg
A라인 러플 미니 드레스                                      89,000     https://image.thehyundai.com/static/0/6/2/11/A1/hnm40A1112606_01_0887613_001_001_568.jpg
릴렉스드 버튼 미디 드레스                                     79,000     https://image.thehyundai.com/static/6/5/7/13/A1/hnm40A1137565_01_0864561_001_001_568.jpg
리넨 퍼프 슬리브 미디 드레스                                   129,000    https://image.thehyundai.com/static/8/7/0/13/A1/hnm40A1130783_0881161003_202002_LB_0491_Q8_L_1120x868_srgb_568.jpg

... and so on.
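
If you want the products in a list rather than printed, the same "keep requesting pages until one comes back empty" loop can be pulled into a small helper. This is only a sketch: collect_all_pages, fake_fetch_page and max_pages are hypothetical names, not part of the site or of requests. fake_fetch_page stands in for the real POST request so the example runs offline, and it assumes 80 products served 32 per page, as in the question.

```python
from typing import Callable, Dict, List


def collect_all_pages(fetch_page: Callable[[int], List[Dict[str, str]]],
                      first_page: int = 1,
                      max_pages: int = 100) -> List[Dict[str, str]]:
    """Call fetch_page(n) for consecutive page numbers until a page is empty.

    max_pages is a safety cap (an assumption of this sketch) so the loop
    cannot run forever if the server never returns an empty page.
    """
    items: List[Dict[str, str]] = []
    for page_num in range(first_page, first_page + max_pages):
        page_items = fetch_page(page_num)
        if not page_items:  # an empty page means there is nothing left to load
            break
        items.extend(page_items)
    return items


def fake_fetch_page(page_num: int, per_page: int = 32, total: int = 80) -> List[Dict[str, str]]:
    """Offline stand-in for the AJAX request: returns fake products for one page."""
    start = (page_num - 1) * per_page
    stop = min(start + per_page, total)
    return [{'name': 'product {}'.format(i)} for i in range(start, stop)]


products = collect_all_pages(fake_fetch_page)
print(len(products))
```

In real use, fetch_page would wrap the requests.post call and the BeautifulSoup parsing from the answer above and return one dict per product.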