Scraping track links from a YouTube playlist with Beautiful Soup


I am trying to scrape the links to all the tracks in my playlist.

Here is my code:

from selenium import webdriver 
from time import sleep
from bs4 import BeautifulSoup
from urllib.request import urlopen
import re

playlist = 'minimal_house'

url = 'https://www.youtube.com/channel/UCt2GxiTBN_RiE-cbP0cmk5Q/playlists'
html = urlopen(url)
soup = BeautifulSoup(html , 'html.parser')
tracks = soup.find(title = playlist).get('href')

print(tracks)

url = url + tracks
print(url)

html = urlopen(url)

soup = BeautifulSoup(html, 'html.parser')

links = soup.find_all('a',attrs={'class':'yt-simple-endpoint style-scope ytd-playlist-panel-video-renderer'})

print(links)

I can't scrape the 'a' elements, neither by id nor by class name.

example of one track from playlist

python web-scraping beautifulsoup
1 Answer

Here is my quick-and-dirty code; it works for me.

from urllib.request import urlopen

from bs4 import BeautifulSoup

playlist = 'minimal_house'

# Find the playlist's link on the channel's playlists page by its title attribute.
url = 'https://www.youtube.com/channel/UCt2GxiTBN_RiE-cbP0cmk5Q/playlists'
html = urlopen(url)
soup = BeautifulSoup(html, 'html.parser')
tracks = soup.find('a', attrs={'title': playlist}).get('href')

print(tracks)

# Prepend the site root to the href instead of appending it to the channel URL.
url = 'https://www.youtube.com' + str(tracks)
print(url)

html = urlopen(url)

soup = BeautifulSoup(html, 'html.parser')

# Take every <a>, skip anchors without an href, and keep only the "watch" links.
links = soup.find_all('a')
links = set(link.get('href') for link in links if link.get('href') and 'watch' in link.get('href'))

print(links)
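
The link filter at the end of the code above keeps any href that contains "watch". A slightly stricter variant can be sketched as a small helper that skips missing hrefs, turns relative paths into absolute URLs, and drops duplicates while keeping the page order. The names `watch_links` and `BASE_URL` and the sample hrefs are made up for illustration, and the check for a `/watch` path with a `v` parameter is an assumption about how the video links look.

```python
from urllib.parse import parse_qs, urljoin, urlparse

# Assumed site root used to resolve relative hrefs (hypothetical constant name).
BASE_URL = "https://www.youtube.com"


def watch_links(hrefs, base=BASE_URL):
    """Return the unique absolute video URLs among hrefs, in first-seen order."""
    seen = {}
    for href in hrefs:
        if not href:
            # <a> tags without an href yield None; skip them.
            continue
        absolute = urljoin(base, href)
        parsed = urlparse(absolute)
        # Keep only links whose path is /watch and that carry a video id.
        if parsed.path == "/watch" and "v" in parse_qs(parsed.query):
            seen.setdefault(absolute, None)
    return list(seen)


# Made-up hrefs, shaped like what link.get('href') might return.
sample = [
    "/watch?v=abc123&list=PL1",
    None,
    "/playlist?list=PL1",
    "/watch?v=abc123&list=PL1",
    "https://www.youtube.com/watch?v=def456",
]
print(watch_links(sample))
```

With a parsed page, the helper could be fed directly from Beautiful Soup, for example `watch_links(a.get('href') for a in soup.find_all('a'))`.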

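Because the class names change depending on the device that makes the request, it is better in this case to fetch all the links and filter them. To get the whole list, you will need to use Selenium to scroll down the page.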
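
A rough sketch of the scrolling step, not a tested recipe for YouTube: it assumes the page grows as you scroll and that `document.documentElement.scrollHeight` reflects that growth. The function names, the pause length, and the round limit are all hypothetical, and it assumes a local Chrome that Selenium can drive.

```python
from time import sleep


def scroll_to_bottom(driver, pause=1.0, max_rounds=50):
    """Scroll until the page height stops growing.

    Returns the number of scrolls that made the page taller.
    """
    get_height = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(get_height)
    for round_number in range(max_rounds):
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        sleep(pause)  # give lazily loaded items time to appear
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            return round_number
        last_height = new_height
    return max_rounds


def fetch_full_playlist_html(url):
    """Open url in Chrome, scroll to the end, and return the rendered HTML."""
    # Imported here so scroll_to_bottom can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        scroll_to_bottom(driver)
        return driver.page_source
    finally:
        driver.quit()
```

The returned HTML can then be passed to `BeautifulSoup(html, 'html.parser')` and filtered for "watch" links as before.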
