Check whether size "40" is in stock on this website and send a Discord message

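I'm trying to get a bot to send a Discord message when size 40 is in stock. I have used this code for similar tasks before, but now I don't know where in the HTML I can check whether the shoe is in stock in size 40.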

 stock_element = soup.find('li', {'class': 'unselectable'})

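I really don't understand how to make this part of the code find the specific element that indicates whether the size is in stock.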

https://www.courir.com/fr/p/ugg-tasman-1499533.html

This is the code I tried:

import discord
from bs4 import BeautifulSoup
import requests
import aiohttp
import asyncio

webhook_url = 'DISCORD API'

async def send_webhook_message(content):
    async with aiohttp.ClientSession() as session:
        async with session.post(webhook_url, json={"content": content}) as response:
            if response.status == 204:
                print("Discord message sent successfully.")
            else:
                print(f"Error sending Discord message. Status code: {response.status}")

async def check_stock(url, size):
    try:
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        stock_element = soup.find('li', {'class': 'unselectable'})

        # Check if the stock_element is found and is not disabled
        return stock_element is not None and 'Size Variations' not in stock_element.attrs

    except requests.RequestException as e:
        print(f"Error making request to the website: {e}")
        return False

product_data = [
    {'url': 'https://www.courir.com/fr/p/ugg-tasman-1499533.html', 'size': 'tasman'},
]
async def main():
    previous_stock_status = {}  # Dictionary to store previous stock status for each product

    while True:
        result_message = ""

        for product_info in product_data:
            url = product_info['url']
            size = product_info['size']
            current_stock_status = await check_stock(url, size)

            # Check if the product was out of stock in the previous check and is now in stock
            if not previous_stock_status.get(url, {}).get(size, False) and current_stock_status:
                result_message += f"({size}) is now in stock\n"

            previous_stock_status.setdefault(url, {})[size] = current_stock_status

        # Check if the result message is not equal to the specific message you want to avoid
        if result_message.strip() != "(NAVY S) is now in stock\n(NAVY M) is now in stock\n(NAVY L) is now in stock!\n(BEIGE S) is now in stock!\n(BEIGE M) is now in stock!\n(BEIGE L) is now in stock!":
            # Print the result message
            print(result_message)

            # Send Discord message only if there are updates
            if result_message:
                await send_webhook_message(result_message)
                print("Sent Discord message.")

        # Sleep for a certain interval (e.g., 10 minutes) before checking again
        print("10 min before next check")
        await asyncio.sleep(600)  # sleep for 10 minutes (600 seconds)

# Run the main function asynchronously
asyncio.run(main())
python beautifulsoup python-requests discord
2 Answers

0 votes

Checking for 'class': 'unselectable' only tells you that a size is out of stock. When the size is available, the class becomes 'selectable'.

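Use the inspector in Firefox or Chrome (CTRL-SHIFT-I) to find out which attributes to search for. In this case, look for an anchor with the class "swatchanchor" whose title contains "40". If its parent has class='selectable', the size is in stock. Normally, something like this might work.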

# True if a selectable <li> contains a swatch link whose title includes "40"
size_40_available = soup.select_one("li.selectable a.swatchanchor[title*='40']") is not None

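The problem, however, is that the elements we are looking for are not in the page source. They are generated dynamically by JavaScript, so conventional scraping tools such as Beautiful Soup or Scrapy will not find them in the initial response. Instead, we may need a browser automation tool such as Puppeteer or Selenium, driving a headless browser, to load the page and extract the elements.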
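Once the rendered HTML is available (for example from Selenium's `driver.page_source`), the check itself could look roughly like the sketch below. The `SAMPLE_HTML` markup and the `is_size_available` helper are hypothetical: they only mirror the structure described above, and the real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the structure described in this answer;
# the "Taille: NN" titles are an assumption, not copied from the real page.
SAMPLE_HTML = """
<ul class="swatches size">
  <li class="unselectable"><a class="swatchanchor" title="Taille: 40">40</a></li>
  <li class="selectable"><a class="swatchanchor" title="Taille: 41">41</a></li>
</ul>
"""


def is_size_available(html: str, size: str) -> bool:
    """Return True if a selectable swatch whose link text equals `size` exists."""
    soup = BeautifulSoup(html, "html.parser")
    # "li.selectable" matches the whole class token, so "unselectable" is skipped.
    for anchor in soup.select("li.selectable a.swatchanchor"):
        if anchor.get_text(strip=True) == size:
            return True
    return False


print(is_size_available(SAMPLE_HTML, "40"))  # False
print(is_size_available(SAMPLE_HTML, "41"))  # True
```

Comparing the link text for equality, rather than testing whether the title contains "40", avoids accidental matches on labels such as "40.5" or "140".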


0 votes

...but now I don't know where in the HTML I can check whether the shoe is in stock in size 40.

It looks like this part is loaded by JavaScript:

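import re

import requests
from bs4 import BeautifulSoup

url = "https://www.courir.com/fr/p/ugg-tasman-1499533.html"

headers = {
    # Adjacent string literals are concatenated, so the first one must end with a space.
    "User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"
}

payload = {
    "frz-smartcache-fragment": "true",
    "frz-timeout": "5000",
    "frz-smartcache-v": "2",
    "frz-smartcache-placeholders-number": "8"
}

r = requests.get(url, headers=headers, params=payload)

# Pull the escaped HTML fragment out of the response and drop the backslashes.
soup = BeautifulSoup(re.search(r"(<.*>)", r.text)
            .group().replace("\\", ""), "html.parser")

sizes = {}

# The "ends with" selector matches both "selectable" and "unselectable".
for tag in soup.select("[class$='selectable']"):
    # Removing the backslashes above leaves a bare "n" where an escaped
    # newline used to be, which strip("n") trims from the label.
    sizes.setdefault(tag["class"][0], []).append(
        tag.select_one("a").get_text().strip("n"))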

Output:

print(sizes)

# {'unselectable': ['40'], 'selectable': ['41', '42', '43', '44', '45', '46']}
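To illustrate the extraction step, here is a minimal sketch run on a made-up response body. The `raw` string is hypothetical: it assumes the HTML arrives embedded in a quoted string with quotes, slashes and newlines backslash-escaped, which is what the `re.search(...)` and `.replace("\\", "")` calls appear to handle. The real payload may look different.

```python
import re

# Hypothetical response body (not captured from the real site): an HTML
# fragment inside a quoted string, with backslash-escaped characters.
raw = r'{"html": "<ul><li class=\"unselectable\"><a href=\"#\">\n40\n<\/a><\/li><\/ul>"}'

# Greedy match from the first "<" to the last ">" in the text.
fragment = re.search(r"(<.*>)", raw).group()

# Dropping every backslash turns \" into " and <\/a> into </a>,
# but it also turns the two-character sequence \n into a bare "n".
html = fragment.replace("\\", "")

print(html)
# <ul><li class="unselectable"><a href="#">n40n</a></li></ul>

# The leftover "n" characters are why the label is cleaned with strip("n").
print("n40n".strip("n"))  # 40
```

Stripping "n" is safe here only because the size labels are numeric; a label that itself began or ended with the letter "n" would be truncated.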

If you only want to trigger the bot when size 40 is available, you can do this:

available_sizes = [
    tag["title"].split(": ")[-1]
    for tag in soup.select(".selectable a")
] # ['41', '42', '43', '44', '45', '46']

if "40" in available_sizes:
    ...
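One way to connect this check to the polling loop from the question is to notify only when a size changes from unavailable to available, as the original `previous_stock_status` logic intends. The sketch below uses only the standard library; `newly_available` and the sample size lists are hypothetical names and data for illustration, and the `print` call stands in for the webhook call.

```python
def newly_available(previous: set[str], current: set[str], wanted: set[str]) -> set[str]:
    """Sizes from `wanted` that are in stock now but were not at the last check."""
    return (current - previous) & wanted


previous: set[str] = set()

# Three simulated polling rounds; real code would rebuild available_sizes
# from a fresh request on every iteration.
for available_sizes in (["41", "42"], ["40", "41"], ["40", "41"]):
    current = set(available_sizes)
    for size in sorted(newly_available(previous, current, {"40"})):
        print(f"Size {size} is now in stock")  # the webhook call would go here
    previous = current
```

With these sample rounds the message is printed once, on the second round, and is not repeated while size 40 stays in stock.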