How do I get the latest tag response in Selenium?

Question (votes: 0, answers: 2)

So this is my bot: https://www.pandorabots.com/pandora/talk?botid=b3a17e933e345861

I am trying to get the current Human and thanos responses, so I tried:

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
driver.get('https://www.pandorabots.com/pandora/talk?botid=b3a17e933e345861')
# Locate the chat form's text input by its absolute XPath
ask = driver.find_element(By.XPATH, '/html/body/form/table/tbody/tr[1]/td[1]/input')
inpu_1 = 'ask thanos '
ask.send_keys(inpu_1)
time.sleep(2)

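But now I am stuck. I cannot find a way to fetch the current Human and thanos responses, because the page contains many tags, and if I try to use an XPath it looks like this:

/html/body/b[2]

So, if I do this: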

print(" thanos: {} ".format(driver.find_element_by_css_selector("b:contains('thanos:')")))

then it gives me nothing and returns blank.

How can I get the latest reply from thanos?

python selenium xpath web-scraping beautifulsoup
2 Answers
1
vote

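If you inspect the HTML DOM, the latest Human response is always at the top, immediately followed by the latest thanos response. So, to answer your question about finding a way to fetch the current Human and thanos responses, you can use the following code block: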

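from selenium.webdriver.common.by import By

# Assumes `driver` is the WebDriver from the question, after the message was submitted
full_text = driver.find_element(By.XPATH, "//body").get_attribute("innerHTML")
# The newest exchange comes first, so take the text after the first "Human:" marker
one_set_conversation = full_text.split("Human:")
human_thanos = one_set_conversation[1].split("thanos:")
print("Last Human Reply :")
print(human_thanos[0])
print("Last Thanos Reply :")
print(human_thanos[1])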
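
The same split-based idea can be wrapped in a small helper and tried without a browser. This is only a minimal sketch: it assumes the page source has the shape described above (a `Human:` marker followed by a `thanos:` marker, newest exchange first). The function name `split_last_exchange` and the sample HTML string are hypothetical, made up for illustration, and the real page markup may differ.

```python
import re


def split_last_exchange(page_html, human_marker="Human:", bot_marker="thanos:"):
    """Return (human_text, bot_text) for the first exchange found in page_html.

    Hypothetical helper: assumes the newest exchange appears first in the page.
    """
    def strip_tags(fragment):
        # Remove leftover HTML tags such as <b> or <br>, then trim whitespace.
        return re.sub(r"<[^>]+>", "", fragment).strip()

    # Keep everything after the first "Human:" marker.
    _, _, after_human = page_html.partition(human_marker)
    # Cut at the next "Human:" marker so older exchanges are ignored.
    first_exchange = after_human.split(human_marker)[0]
    # Split the remaining text into the human part and the bot part.
    human_part, _, bot_part = first_exchange.partition(bot_marker)
    return strip_tags(human_part), strip_tags(bot_part)


# Made-up sample markup, not copied from the real page.
sample = (
    "<b>Human:</b> ask thanos<br><b>thanos:</b> I will ask them later.<br>"
    "<b>Human:</b> hi<br><b>thanos:</b> Hello."
)
print(split_last_exchange(sample))
```

With Selenium, the `innerHTML` string retrieved above would be passed in place of `sample`. If either marker is missing, the helper returns empty strings instead of raising an `IndexError`.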

0
votes

You should simply retrieve the body content and work with its text:

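from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
driver.get('https://www.pandorabots.com/pandora/talk?botid=b3a17e933e345861')
# Locate the chat form's text input by its absolute XPath
ask = driver.find_element(By.XPATH, '/html/body/form/table/tbody/tr[1]/td[1]/input')
inpu_1 = 'ask thanos '
ask.send_keys(inpu_1)
ask.submit()
# Give the page a moment to show the reply before reading it
time.sleep(2)
# Read the visible text of the whole page, one list entry per line
content = driver.find_element(By.CSS_SELECTOR, 'body').text.split("\n")
print(content)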

This code prints:

['Tell thanos:', '  Powered by Pandorabots.', '', 'Human: ask thanos', 'thanos: They are not available right now, but I will ask them later.']

So the latest response should be the 5th element of that list (index 4).
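
A fixed position in the list stops being correct as soon as the page shows more than one exchange, so it may be more robust to select lines by their speaker label. The sketch below is an illustration only: `replies_by` is a hypothetical helper name, and `content` is the list printed above, copied in so the snippet runs without a browser.

```python
def replies_by(lines, speaker="thanos:"):
    """Collect the text of every line that starts with the given speaker label."""
    replies = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(speaker):
            # Drop the label itself and keep only the message text.
            replies.append(stripped[len(speaker):].strip())
    return replies


# The list printed by the code above.
content = ['Tell thanos:', '  Powered by Pandorabots.', '', 'Human: ask thanos',
           'thanos: They are not available right now, but I will ask them later.']

thanos_replies = replies_by(content)
# Print the first matching reply, or None if there is no reply yet.
print(thanos_replies[0] if thanos_replies else None)
```

Whether the newest reply is the first or the last match depends on the order in which the page lists exchanges. The other answer reports that the newest exchange is at the top, in which case the first match is the one to use.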
