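I'm doing a school project and I'm using lxml and its .xpath function to try to get the titles of the top videos for a YouTube search of your choosing. My problem is that when it loops over the top 5 and is supposed to return each video's title, I can't seem to get the actual title no matter what I do. I've tried appending /text() or /string or /title/text(), since the text I'm after is in the title, but everything I try just returns an empty list [].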
Here is my Python code:
from lxml import html
import requests
string = input("Enter what you want to search up on Youtube: \n")
string.replace(" ", "+")
page = requests.get('https://www.youtube.com/results?search_query=' + string)
tree = html.fromstring(page.content)
for x in range(5):
    v = tree.xpath('/html/body/ytd-app/div/ytd-page-manager/ytd-search/div[1]/ytd-two-column-search-results-renderer/div/ytd-section-list-renderer/div[2]/ytd-item-section-renderer/div[3]/ytd-video-renderer[1]/div[1]/div/div[' + str(x) + ']/div/h3/a')
    print(v)
This is what I get back:
Enter what you want to search up on Youtube:
rainbow
[]
[]
[]
[]
[]
This is the HTML I'm trying to pull the title text from:
<a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" title="Hide and Seek in Rainbow Six Siege... Let's Go!!" href="/watch?v=g8MM_RS7zmw" aria-label="Hide and Seek in Rainbow Six Siege... Let's Go!! by Get_Flanked 8 hours ago 21 minutes 54,654 views">
Hide and Seek in Rainbow Six Siege... Let's Go!!
</a>
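This is my first time building one of these and I'm just a student, so please go easy on me if I've formatted something incorrectly or done something wrong. Thanks for your help!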
Consider using the YouTube Data API; there is a Python client library for it.
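As a rough illustration of the API route, here is a minimal sketch that calls the Data API v3 `search.list` REST endpoint directly with the standard library instead of the official client library. The helper names (`build_search_url`, `extract_titles`, `top_titles`) are made up for this example, and it assumes you have created your own API key in the Google Cloud console.

```python
import json
import urllib.parse
import urllib.request

# REST endpoint for search.list in the YouTube Data API v3
API_URL = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, max_results=5):
    """Build the request URL for a video search (hypothetical helper)."""
    params = {
        "part": "snippet",      # the snippet part carries the title
        "q": query,
        "type": "video",        # skip channels and playlists
        "maxResults": max_results,
        "key": api_key,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_titles(payload):
    """Pull the video titles out of a decoded search.list response."""
    return [item["snippet"]["title"] for item in payload.get("items", [])]


def top_titles(query, api_key, max_results=5):
    """Call the API and return the titles (needs network and a valid key)."""
    url = build_search_url(query, api_key, max_results)
    with urllib.request.urlopen(url) as resp:
        return extract_titles(json.load(resp))
```

Something like `top_titles("rainbow", "YOUR_API_KEY")` should then return up to five titles without any HTML parsing, subject to the API's quota limits.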
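Otherwise, if you want to use some kind of scraper, you need one that can execute JavaScript. requests only downloads the HTML text and does not run JavaScript, and the YouTube results list is built by JavaScript in the browser, so the elements your XPath looks for are not in the downloaded page. Selenium, for example, can do this.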
from selenium import webdriver
from selenium.webdriver.common.by import By
# Run Firefox headless so that no browser window opens
options = webdriver.FirefoxOptions()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.get('https://www.youtube.com/results?search_query=montypython')
# Two equivalent ways to locate the title links (Selenium 4 locator API)
elements = driver.find_elements(By.XPATH, '//*[@id="video-title"]')
elements = driver.find_elements(By.ID, 'video-title')
print([x.text for x in elements])
# List everything the driver object offers
print(dir(driver))
# How to read an HTML attribute, for example href
print([x.get_attribute("href") for x in elements])
>>> [x.get_attribute('title') for x in driver.find_elements(By.ID, 'video-title')]
['Monty Python And The Holy Grail 1975 HD', 'Monty Python and the Holy Grail', "Monty Python's - The Funniest Joke in the World (la blague qui tue)", 'Argument', 'Monty Python - The Black Knight - Tis But A Scratch', 'Monty Python- Cheese Shop', 'Monty Python: The Parrot Sketch & The Lumberjack Song movie versions HQ', 'Biggus Dickus - Monty Python, Life of Brian.', 'Monty Python - Bridge of Death', 'Life of Brian 1979 (sub indo)', 'John Cleese - How To Irritate People 1968', 'Monty Python and The Holy Grail - Black Knight HD', 'Eric Idle - "Always Look On The Bright Side Of Life" - STEREO HQ', 'Monty pythons, Mr creosote, Full version,', 'Monty Python Ministry of Silly Walks NL', 'Monty Python - careers advice', 'Monty Python and the Holy Grail - Bunny Attack Scene (HD)', 'Monty Python Society For Putting Things On Top of Other Things', 'Monty Python - Constitutional Peasants Scene (HD)']
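A side note on the question's code: `string.replace(" ", "+")` returns a new string and leaves `string` unchanged, so the spaces never actually get replaced in the URL. A small sketch of one way to build the search URL, using `urllib.parse.quote_plus` from the standard library (the helper name `build_results_url` is made up for this example):

```python
from urllib.parse import quote_plus

SEARCH_URL = "https://www.youtube.com/results?search_query="


def build_results_url(query):
    """Return the YouTube results URL for a free-text query."""
    # quote_plus turns spaces into "+" and percent-encodes characters
    # such as "&" or "#" that would otherwise break the query string.
    return SEARCH_URL + quote_plus(query)


print(build_results_url("rainbow six siege"))
```

The resulting URL can be passed to `driver.get(...)` in place of the hard-coded one above.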
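See also: https://stackoverflow.com/help/how-to-ask and https://stackoverflow.com/tour
As long as your question shows some effort and is clear, it may or may not get an answer, depending on whether others can understand what is being asked and have the time to respond.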