Parsing the HTML of a YouTube playlist

Question  Votes: 2  Answers: 4

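I am having trouble parsing the HTML of a YouTube playlist. For example, when I inspect the tags on "https://www.youtube.com/playlist?list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_", I see the class name "yt-simple-endpoint.style-scope.ytd-playlist-video-renderer". But when I use bs4 to select elements with it, it does not work. However, I found another piece of working code online that selects the class "pl-video-title-link". I cannot find this class on the web page, and none of the tags has it. The working code is attached. Any help would be appreciated.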

from bs4 import BeautifulSoup as bs
import requests

# Download the playlist page (the static HTML, before any JavaScript runs)
r = requests.get('https://www.youtube.com/playlist?list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_')
page = r.text
soup = bs(page, 'html.parser')

# Select the anchors that carry the video links
res = soup.find_all('a', {'class': 'pl-video-title-link'})
for l in res:
    print(l.get("href"))
python scripting beautifulsoup
4 Answers
1
vote

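This page uses JavaScript to change its structure, but you can print the soup right after downloading it and see where the video links originally are. In this case, they are in <tr> tags with the class pl-video: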

from bs4 import BeautifulSoup
import requests

url = 'https://www.youtube.com/playlist?list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')

for i, tr in enumerate(soup.select('tr.pl-video')):
    print('{}. {}'.format(i + 1, tr['data-title']))
    print('https://www.youtube.com' + tr.a['href'])
    print('-' * 80)

Prints:

1. Shell Scripting Tutorial for Beginners 1 -  Introduction
https://www.youtube.com/watch?v=cQepf9fY6cE&list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_&index=2&t=0s
--------------------------------------------------------------------------------
2. Shell Scripting Tutorial for Beginners 2 - using Variables and Comments
https://www.youtube.com/watch?v=vQv4W-JfrmQ&list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_&index=3&t=0s
--------------------------------------------------------------------------------
3. Shell Scripting Tutorial for Beginners 3 - Read User Input
https://www.youtube.com/watch?v=AcSkkNAsGCY&list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_&index=4&t=0s
--------------------------------------------------------------------------------

... all the way to:

32. How Install VirtualBox Guest Additions on Ubuntu 18.04 Guest / virtual machine
https://www.youtube.com/watch?v=qNecdUsuTPw&list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_&index=33&t=0s
--------------------------------------------------------------------------------
33. How to install Java JDK 10 on Ubuntu 18.04 LTS (Debian Linux)
https://www.youtube.com/watch?v=4RJ60fqeTN4&list=PLS1QulWo1RIYmaxcEqw5JhK3b-6rgdWO_&index=34&t=0s
--------------------------------------------------------------------------------
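
To see which class names are actually present in the HTML that requests receives (as opposed to the DOM the browser inspector shows after JavaScript has run), the raw markup can be scanned for class attributes. The following is a minimal sketch of that idea. It uses the standard-library html.parser instead of BeautifulSoup so it runs without extra packages, and SAMPLE_HTML is a hand-written stand-in for page.text; the sample markup and the names ClassCounter and class_counts are assumptions for illustration, not YouTube's real output.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stand-in for the downloaded page text; real markup will differ.
SAMPLE_HTML = """
<table>
  <tr class="pl-video" data-title="Video one">
    <td><a class="pl-video-title-link" href="/watch?v=aaa">Video one</a></td>
  </tr>
  <tr class="pl-video" data-title="Video two">
    <td><a class="pl-video-title-link" href="/watch?v=bbb">Video two</a></td>
  </tr>
</table>
"""


class ClassCounter(HTMLParser):
    """Count how often each (tag, class) pair occurs in the parsed markup."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'class' and value:
                # A class attribute may hold several space-separated names.
                for cls in value.split():
                    self.counts[(tag, cls)] += 1


def class_counts(html):
    """Return a Counter keyed by (tag, class) for the given HTML string."""
    parser = ClassCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


counts = class_counts(SAMPLE_HTML)
# A class that is in the static HTML has a non-zero count ...
print(counts[('a', 'pl-video-title-link')])
# ... while a class that only JavaScript would add is absent.
print(counts[('a', 'ytd-playlist-video-renderer')])
```

Running the same function on the real page text would show whether a class seen in the browser inspector exists before JavaScript runs. A count of zero suggests the selector has to target the original markup instead.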

0
votes

Try this:


-1
votes

-2
votes
© www.soinside.com 2019 - 2024. All rights reserved.