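I'm new to Python programming. I'm trying to write a script that automatically parses product cards. I managed to handle one page. How can I make the script move on to the next page automatically? I have seen several answers saying Selenium helps, but I couldn't figure it out :( Here is the code: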
import random
import string
import csv
import requests
from bs4 import BeautifulSoup

url = "https://game29.ru/products?category=926"
response = requests.get(url)
html = response.text
# Product cards are the "row" divs that carry exactly this inline style
multi_class = {'class': ['row'], 'style': 'border: 2px solid #898989;border-radius: 7px;padding: 2px;margin-top: -2px;'}
soup = BeautifulSoup(html, "html.parser")
products = soup.find_all("div", {"class": "row"})

# Fields that are the same for every ad
ad_status = "Free"
category = "Игры, приставки и программы"
goods_type = "Игры для приставок"
ad_type = "Продаю своё"
adress = ""
discription = ""
condition = "Новое"
data_begin = "2024-04-03"
data_end = "2024-05-03"
allow_email = "Нет"
contact_phone = ""
contact_method = "По телефону и в сообщениях"

all_products = []
for product in products:
    if product.attrs == multi_class:
        # Generate a fresh 32-character id per product (one shared id would repeat in every row)
        identifaer = "".join(random.choice(string.ascii_letters + string.digits) for n in range(32))
        image = "https://www.game29.ru" + product.find("img")["src"]
        # Skip cards that only show the placeholder image
        if not image.endswith("/zaglushka.png"):
            title = product.find("div", {"class": "cart-item-name"}).text
            price = product.find("div", {"class": "cart-item-price"}).text.strip().replace("руб.", "").strip()
            all_products.append([identifaer, ad_status, category, goods_type, ad_type, adress, title, discription, condition, price, data_begin, data_end, allow_email, contact_phone, image, contact_method])

# names = ["Id", "AdStatus", "Category", "GoodsType", "Adtype", "Adress", "Title", "Discription", "Condition", "Price", "DataBegin", "DataEnd", "AllowEmail", "ContactPhone","ImageUrls", "ContactMethod"]
# "csv.file" would overwrite an attribute of the csv module, so use a plain variable name
with open("data.csv", "a", newline='', encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    # writer.writerow(names)
    for product in all_products:
        writer.writerow(product)
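I really think Selenium would help me. I believe the answer lies there, but unfortunately I don't understand it yet, and I don't have much time. Dear experts, I would be glad if you could help me.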
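If you look at the site, you will see that clicking any page number changes the URL to include a page= parameter. For example, page 2 is available at https://game29.ru/products?page=2&category=926. So you should create a function that processes a single page, then call that function from a loop that increments the page number. For example: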
def parser(url):
    # add the Beautiful Soup and parsing code here
    # return True or False to indicate whether the page was processed
    return False  # placeholder so the skeleton runs; replace with the real result

# The main loop is something like
page_number = 1
while True:
    url = f'https://game29.ru/products?page={page_number}&category=926'
    if not parser(url):
        break  # stop processing
    page_number += 1  # move on to the next page
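One way to make the stopping rule concrete is to have the page function return the rows it found and end the loop on the first page that yields none. The sketch below uses only the standard library and takes the parsing function as an argument, so the requests and Beautiful Soup code from the question can be dropped in. The names build_page_url, scrape_all_pages and max_pages are made up for illustration. It also assumes, without having checked the site, that a page number past the last one returns a page with no product cards.

```python
from urllib.parse import urlencode

BASE_URL = "https://game29.ru/products"  # listing URL taken from the question


def build_page_url(page_number, category=926):
    """Return the listing URL for one page of a category."""
    query = urlencode({"page": page_number, "category": category})
    return f"{BASE_URL}?{query}"


def scrape_all_pages(parse_page, max_pages=50):
    """Call parse_page(url) for pages 1, 2, 3, ... and collect the rows.

    parse_page must return a list of rows for the given URL. An empty
    list is treated as "ran past the last page" (an assumption about
    how the site behaves). max_pages is an arbitrary safety cap so the
    loop cannot run forever if that assumption is wrong.
    """
    all_rows = []
    for page_number in range(1, max_pages + 1):
        rows = parse_page(build_page_url(page_number))
        if not rows:
            break  # no products found: stop processing
        all_rows.extend(rows)
    return all_rows
```

With the question's code wrapped in a function that takes a URL and returns the list built for all_products, the call would be all_products = scrape_all_pages(parser), followed by the same CSV-writing block as before.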