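I am trying to scrape a radio station's website to get the current charts (https://www.energy.de/programm/energy-euro-hot-30 and then https://music.apple.com/de/playlist/energy-euro-hot-30/pl.9b672a18307c4cd7ba1ece0106891868). I am using Python and the Requests-HTML module. When I inspect the HTML returned by the request, the elements I want to parse are not in it. However, when I inspect the page as displayed in the browser, I can find the data I need. I ran into a similar problem earlier this week, and a user (https://stackoverflow.com/users/10035985/andrej-kesely) helped me. He used Chrome DevTools and its Network tab to find the right link for accessing the data. I have now tried this approach myself for my current problem, but I am completely overwhelmed by the sheer number of connections. Maybe someone can point me in the right direction...
I tried using Chrome DevTools and its Network tab to find the right link for the data I need, without success.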
You don't see anything in the Network tab because the data is stored in the page itself, inside a <script> element. Here is an example of how to parse it:
import json

import requests
from bs4 import BeautifulSoup


def find_tracks(o):
    # Walk the nested JSON and yield the item list of each "trackLockup" section.
    if isinstance(o, dict):
        if o.get("itemKind") == "trackLockup":
            yield o["items"]
            return
        for v in o.values():
            yield from find_tracks(v)
    elif isinstance(o, list):
        for v in o:
            yield from find_tracks(v)


url = "https://music.apple.com/de/playlist/energy-euro-hot-30/pl.9b672a18307c4cd7ba1ece0106891868"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# The playlist data is embedded as JSON in the element with id "serialized-server-data".
data = json.loads(soup.select_one("#serialized-server-data").text)
tracks = next(find_tracks(data))

# print(json.dumps(tracks, indent=4))

for track in tracks:
    print(f'{track["title"]:<55} {track["artistName"]}')
Prints:
Overdrive (feat. Norma Jean Martine)                    Ofenbach
Houdini                                                 Dua Lipa
Strangers                                               Kenya Grace
When We Were Young (The Logical Song)                   David Guetta & Kim Petras
greedy                                                  Tate McRae
Gimme Love                                              Sia
Lose Control                                            Teddy Swims
Cynical                                                 twocolors, Safri Duo & Chris de Sarandy
Lovin On Me                                             Jack Harlow
Si No Estás                                             Iñigo Quintero
Paint The Town Red                                      Doja Cat
Water                                                   Tyla
On My Love                                              Zara Larsson & David Guetta
Is It Love                                              Loreen
I'll Be There                                           Robin Schulz, Rita Ora & Tiago PZK
Dreaming                                                Marshmello, P!nk & Sting
American Town                                           Ed Sheeran
Is It Over Now? (Taylor's Version) [From The Vault]     Taylor Swift
Better Me                                               Michael Schulte & R3HAB
Mwaki                                                   ZERB
Substitution (feat. Julian Perretta)                    Purple Disco Machine & Kungs
RUNAWAY                                                 OneRepublic
Blindside                                               James Arthur
Dive                                                    Lost Frequencies & Tom Gregory
Tattoo                                                  Loreen
LOVE'n'TENDRESSE                                        Eddy de Pretto
Prada                                                   cassö, RAYE & D-Block Europe
Never Give Up                                           Puggy
Used To Be Young                                        Miley Cyrus
Seasons                                                 Thirty Seconds to Mars
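
If you want to check for yourself whether a page embeds its data like this before digging through the Network tab, one possible approach is to search the downloaded HTML for a value you can see in the browser (a song title, for example) and list the <script> elements that contain it. The following is only a rough sketch: the helper name find_scripts_containing and the sample markup are made up for illustration and are not Apple's real HTML. It uses only the standard library so it runs without extra installs.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Collect (id, type) of every <script> whose text contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.hits = []
        self._attrs = None  # attributes of the <script> currently open, if any
        self._chunks = []   # text pieces seen inside that <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._attrs = dict(attrs)
            self._chunks = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._attrs is not None:
            if self.needle in "".join(self._chunks):
                self.hits.append((self._attrs.get("id"), self._attrs.get("type")))
            self._attrs = None


def find_scripts_containing(html, needle):
    # Hypothetical helper: returns a list of (id, type) tuples.
    finder = ScriptFinder(needle)
    finder.feed(html)
    finder.close()
    return finder.hits


# Tiny stand-in for a downloaded page (assumed markup, for illustration only).
sample_html = """
<html><body>
<script id="analytics">window.tracking = true;</script>
<script type="application/json" id="serialized-server-data">
[{"itemKind": "trackLockup", "items": [{"title": "Houdini", "artistName": "Dua Lipa"}]}]
</script>
</body></html>
"""

print(find_scripts_containing(sample_html, "Houdini"))
# → [('serialized-server-data', 'application/json')]
```

With a real page you would pass requests.get(url).text instead of sample_html. If the list comes back empty, the value is probably not in the initial HTML, and the data is more likely fetched by a separate request that should show up in the Network tab.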