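I built a bot to scrape sneaker prices, but I'm having trouble getting the price. Someone here helped me find the right URL to request so that the scraping works.
Original product URL: https://www.vans.com.br/tenis-ultrarange-rapidweld-black-white/p/1003500430051U?gad_source=1
URL modified for scraping: .com.br/arezzocoocc/v2/vans/products/1003500430051U/dynamic-product-fields?fields=DYNAMIC_FIELDS_PDP
New URL I want to modify: https://www.nike.com.br/tenis-nike-pegasus-40-masculino-025803.html?cor=ID
This is my code for looking up the price of this sneaker: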
import requests
import smtplib
import email.message
import ssl
# Browser-like User-Agent sent with every request
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0"
}
def get_product_data(url, number):
    """Return the color option whose code matches `number`, or None."""
    try:
        # A timeout keeps the bot from hanging on an unresponsive server
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        data = response.json()
        return next((c for c in data.get("colorOptions", []) if c["code"] == number), None)
    except requests.RequestException as e:
        print(f"Error fetching product data for {number}: {e}")
        return None
def send_email(product_name, product_price, receiver_email):
    """Send a plain-text price alert to `receiver_email` through Gmail."""
    subject = "Price Drop Alert!"
    body_msg = f'''The price of {product_name} has dropped to {product_price}.'''
    message = f"Subject: {subject}\n\n{body_msg}"
    sender = '<REDACTED>'
    password = '<REDACTED>'
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL('smtp.gmail.com', 465, context=context) as server:
        server.login(sender, password)
        # Use the address passed by the caller instead of a hard-coded one
        server.sendmail(sender, receiver_email, message.encode('utf-8'))
number1 = "1003500430051U"
number2 = "1002001070011U"
url1 = f"https://www.vans.com.br/arezzocoocc/v2/vans/products/{number1}/dynamic-product-fields?fields=DYNAMIC_FIELDS_PDP"
url2 = f"https://www.vans.com.br/arezzocoocc/v2/vans/products/{number2}/dynamic-product-fields?fields=DYNAMIC_FIELDS_PDP"
product1 = get_product_data(url1, number1)
product2 = get_product_data(url2, number2)
# Default to None so the later price check cannot hit an undefined name
productprice1 = None
if product1 and "price" in product1:
    productprice1 = product1["price"]["value"]
    print(product1["name"], productprice1)
if product2 and "price" in product2:
    print(product2["name"], product2["price"]["value"])
# Fetch both products again and read the price from the raw JSON
try:
    data1 = requests.get(url1, headers=headers).json()
    data2 = requests.get(url2, headers=headers).json()
except requests.RequestException as e:
    print(f"Error fetching product data: {e}")
    data1, data2 = {}, {}  # empty fallbacks so the loops below are skipped
for c in data1.get("colorOptions", []):
    if c["code"] == number1:
        productprice1 = data1["price"]["value"]
        print(c["name"], productprice1)
        break
for c in data2.get("colorOptions", []):
    if c["code"] == number2:
        print(c["name"], data2["price"]["value"])
        break
# Send the alert only when a product and a price below the threshold were found
if product1 and productprice1 and productprice1 < 600:
    send_email(product1["name"], productprice1, '[email protected]')
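But it doesn't work. If someone could help me work out how to get this data from any URL, it would be very helpful for my scraper, so that I know how to look up any product when I need to.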
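For reference, the lookup that the code above performs can be separated from the network call, which makes it easier to check against other sites once their response format is known. This is only a minimal sketch: it assumes the response JSON has the shape the code implies (a `colorOptions` list whose entries carry `code`, `name` and `price.value`), the helper name `find_price` is hypothetical, and the sample payload is made up for illustration rather than real API output.

```python
from typing import Optional, Tuple


def find_price(data: dict, code: str) -> Optional[Tuple[str, float]]:
    """Return (name, price) for the color option matching `code`, or None."""
    for option in data.get("colorOptions", []):
        if option.get("code") == code:
            price = option.get("price", {}).get("value")
            if price is not None:
                return option.get("name", ""), price
    return None


# Made-up payload mirroring the assumed response shape (not real API output)
sample = {
    "colorOptions": [
        {"code": "AAA", "name": "Shoe A", "price": {"value": 549.99}},
        {"code": "BBB", "name": "Shoe B", "price": {"value": 399.99}},
    ]
}

print(find_price(sample, "BBB"))  # matching code
print(find_price(sample, "ZZZ"))  # unknown code gives None
```

Keeping the parsing in its own function means only the URL and the field names have to change when a different site returns a different JSON layout.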
Since this is a different website (Nike, not Vans), don't you need authorization to use that site?
number1 = "1003500430051U"
number2 = "1002001070011U"
url1 = f"https://www.vans.com.br/arezzocoocc/v2/vans/products/{number1}/dynamic-product-fields?fields=DYNAMIC_FIELDS_PDP"
url2 = f"https://www.vans.com.br/arezzocoocc/v2/vans/products/{number2}/dynamic-product-fields?fields=DYNAMIC_FIELDS_PDP"
You also pass a parameter (the product code) into the Vans URLs, but not into the Nike one.
number3 = ?
url3 = f"https://www.nike.com.br/tenis-nike-pegasus-40-masculino-025803.html?cor=ID"
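To make the difference concrete, here is a rough sketch of how the Vans URLs in the question are derived from a product code. It assumes the code is always the last path segment of the Vans product page (true for the one URL shown in the question, not verified beyond that); the names `VANS_API_TEMPLATE`, `product_code_from_page_url` and `vans_api_url` are made up for this example. Nothing in this thread shows an equivalent endpoint or code for Nike, so the sketch covers Vans only.

```python
from urllib.parse import urlparse

# Endpoint pattern copied from the question; {code} is the product code
VANS_API_TEMPLATE = (
    "https://www.vans.com.br/arezzocoocc/v2/vans/products/"
    "{code}/dynamic-product-fields?fields=DYNAMIC_FIELDS_PDP"
)


def product_code_from_page_url(page_url: str) -> str:
    """Return the last path segment of the page URL (assumed to be the code)."""
    path = urlparse(page_url).path  # drops the query string such as ?gad_source=1
    return path.rstrip("/").rsplit("/", 1)[-1]


def vans_api_url(code: str) -> str:
    """Fill the product code into the endpoint pattern."""
    return VANS_API_TEMPLATE.format(code=code)


page = "https://www.vans.com.br/tenis-ultrarange-rapidweld-black-white/p/1003500430051U?gad_source=1"
code = product_code_from_page_url(page)
print(code)
print(vans_api_url(code))
```

Applying the same extraction to the Nike page URL would return `tenis-nike-pegasus-40-masculino-025803.html`, which is a page name rather than a product code, so a value for `number3` has to come from somewhere else.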
Changing these lines:
product1 = get_product_data(url1, number1)
product2 = get_product_data(url3, number2)
with
url3 = f"https://www.nike.com.br/tenis-nike-pegasus-40-masculino-025803.html?cor=ID"
produces this error:
Error fetching product data for 1002001070011U: 403 Client Error: Forbidden for url: https://www.nike.com.br/tenis-nike-pegasus-40-masculino-025803.html?cor=ID
Tennis Ultrarange Rapidweld Black White 549.99
Tennis Old Skool Black White 399.99
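This means it is an authorization error: a 403 Forbidden response says the server understood the request but refused to serve it. You may need an API key or some other form of authorization from Nike.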
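If it helps, the bot could log a more specific hint depending on the status code instead of one generic message. The following is only a sketch using the standard library; the function name `describe_failure` and the wording of the hints are my own, and it does not tell you what Nike actually requires.

```python
from http import HTTPStatus


def describe_failure(status_code: int) -> str:
    """Map an HTTP status code to a short hint for the scraper's log."""
    try:
        phrase = HTTPStatus(status_code).phrase  # e.g. "Forbidden" for 403
    except ValueError:
        phrase = "Unknown status"  # code not defined in http.HTTPStatus
    if status_code == HTTPStatus.UNAUTHORIZED:
        hint = "credentials are missing or invalid"
    elif status_code == HTTPStatus.FORBIDDEN:
        hint = "the server refused the request; authorization may be required"
    elif status_code == HTTPStatus.NOT_FOUND:
        hint = "the URL or product code is probably wrong"
    else:
        hint = "no specific hint"
    return f"{status_code} {phrase}: {hint}"


print(describe_failure(403))
print(describe_failure(404))
```

In the question's `get_product_data`, the status code should be available as `e.response.status_code` when `raise_for_status()` raises `requests.HTTPError`, so it could be passed to a helper like this one.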