Regular expression gives the wrong output

import re
import urllib3
url = 'https://bazaartracker.com/search?query=rough+ruby'
def extract_dynamic_numbers_from_url(url):
    http = urllib3.PoolManager()
    response = http.request('GET', url)
    content = response.data.decode('utf-8')

    pattern = r'<span class="text-subtle-100">([\d.]+)\s*coins<\/span>'
    dynamic_numbers = re.findall(pattern, content)

    return dynamic_numbers
print(extract_dynamic_numbers_from_url(url))

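When I run it, it prints empty square brackets ([]).

I changed the pattern several times to narrow it down, but each attempt either printed empty brackets or found nothing. I use urllib3 to fetch the HTML into a string and then search that string with a regular expression.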

python web-scraping
1 Answer
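An empty list from re.findall means the pattern matched nothing in the text that was downloaded. A likely cause, assuming the site fills in its prices with JavaScript after the page loads, is that the HTML urllib3 receives never contains the span elements you see in the browser. The sketch below illustrates the effect with two made-up HTML strings (rendered_html and raw_html are hypothetical, not copied from the site):

```python
import re

# Same pattern as in the question (the backslash before "/" is not needed).
PATTERN = r'<span class="text-subtle-100">([\d.]+)\s*coins</span>'

# Hypothetical markup as a browser might show it after JavaScript has run.
rendered_html = '<div><span class="text-subtle-100">2.3 coins</span></div>'

# Hypothetical markup as a plain HTTP client might receive it: an empty shell.
raw_html = '<div id="app"></div><script src="/bundle.js"></script>'


def extract_prices(html):
    """Return every number the pattern captures in the given HTML string."""
    return re.findall(PATTERN, html)


print(extract_prices(rendered_html))  # ['2.3']
print(extract_prices(raw_html))       # []
```

If that is what is happening here, printing the downloaded content and searching it for "coins" should confirm it. Requesting the data from the JSON API sidesteps the problem:
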
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Accept-Language': 'en-US,en;q=0.5',
    'Origin': 'https://bazaartracker.com',
}

def get_prices(item):
    params = {
        'query': item,
    }

    response = requests.get('https://api.bazaartracker.com/search', params=params, headers=headers)

    item = response.json()[0]

    return {
        "name": item["item"]["name"],
        "buy": item["product"]["buyprice"],
        "sell": item["product"]["sellprice"],
    }

Example results:

print(get_prices("rough ruby"))
print(get_prices("enchanted carrot on a stick"))
print(get_prices("super compactor 3000"))
{'name': 'Rough Ruby Gem', 'buy': 2.3, 'sell': 1.2}
{'name': 'Enchanted Carrot on a Stick', 'buy': 47373.1666666666, 'sell': 5390.97647058823}
{'name': 'Super Compactor 3000', 'buy': 304646.47142857144, 'sell': 271344.9625}
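get_prices indexes the first element of the response, so a search with no hits would raise an IndexError. Below is a minimal sketch of a more defensive parsing step, assuming the API returns a list of objects shaped like the ones read above. parse_first_result is a hypothetical helper, and the sample payload is hand-built from the first output line rather than captured from the API:

```python
def parse_first_result(payload):
    """Return name/buy/sell for the first search hit, or None if there is none."""
    if not payload:
        return None
    first = payload[0]
    return {
        "name": first["item"]["name"],
        "buy": first["product"]["buyprice"],
        "sell": first["product"]["sellprice"],
    }


# Hand-built sample mirroring the keys that get_prices reads (an assumption).
sample = [
    {
        "item": {"name": "Rough Ruby Gem"},
        "product": {"buyprice": 2.3, "sellprice": 1.2},
    }
]

print(parse_first_result(sample))  # {'name': 'Rough Ruby Gem', 'buy': 2.3, 'sell': 1.2}
print(parse_first_result([]))      # None
```

Inside get_prices this could be used as parse_first_result(response.json()), ideally after calling response.raise_for_status() so that HTTP errors surface before parsing.
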