Thanks for your attention, and sorry for my bad English. I have been trying to get the HTML from https://www.skiddle.com/festivals/dates.html, without success. I know that some parts of the page are loaded by a JS script, but I don't know how to get them. I also tried using a `Session`, with the same result. Please tell me what I need to use in my code, or what I should look into.
Thanks in advance!
Here is my code:
import requests
from bs4 import BeautifulSoup
import lxml
from selenium import webdriver
import time
import undetected_chromedriver
import json

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0'
}
proxies = {
    'https': 'http://146.247.105.71:4827'
}

def get_location(url):
    response = requests.get(url, headers=headers, proxies=proxies)
    soup = BeautifulSoup(response.text, 'lxml')
    print(soup, '\n\n\nlox\n\n\n')
    # options = undetected_chromedriver.ChromeOptions()
    # options.add_argument('--proxy-server=146.247.105.71:4827')
    # driver = undetected_chromedriver.Chrome(options=options)
    # driver.get(url)
    # time.sleep(5)
    # response = driver.page_source
    # driver.close()
    # driver.quit()
    # print(response)

def main():
    get_location(url='https://www.skiddle.com/festivals/dates.html')

if __name__ == '__main__':
    main()
I need the links to each festival's page.
Here is an example of how to print festival name + URL:
import requests
from bs4 import BeautifulSoup

url = "https://www.skiddle.com/festivals/dates.html"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

for a in soup.select("li.margin-bottom-10 a"):
    print(f'{a.text:<50} {a["href"]}')
Prints:
...
Levitation '24 at Bedford Esquires /whats-on/Bedford/Bedford-Esquires/Levitation-24/37157298/
Day at Historic Centreville Park /whats-on/united-states/Historic-Centreville-Park/Day/36718089/
When We Were Young at Las Vegas USA https://www.skiddle.com/festivals/when-we-were-young/
When We Were Young at Las Vegas USA https://www.skiddle.com/festivals/when-we-were-young/
Damnation Festival 2024 at BEC Arena https://www.skiddle.com/festivals/damnation/
Hard Rock Hell at Vauxhall Holiday Park https://www.skiddle.com/festivals/hard-rock-hell/
...
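Note that some of the hrefs above are site-relative paths (`/whats-on/...`) while others are already absolute URLs. If you need full URLs for every festival page, the standard-library `urljoin` handles both cases. A minimal sketch, using two hypothetical sample hrefs copied from the output shapes above:

```python
from urllib.parse import urljoin

BASE = "https://www.skiddle.com/festivals/dates.html"

# Two hypothetical hrefs, one in each shape seen in the output:
# a site-relative path and an already-absolute URL.
hrefs = [
    "/whats-on/Bedford/Bedford-Esquires/Levitation-24/37157298/",
    "https://www.skiddle.com/festivals/when-we-were-young/",
]

# urljoin resolves relative paths against BASE and
# returns absolute URLs unchanged.
for href in hrefs:
    print(urljoin(BASE, href))
```

In the scraping loop above, you would apply the same call to each `a["href"]` before saving or requesting it.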