Selenium cannot find an element by XPath in headless mode on Ubuntu 18.04

Question (votes: 1, answers: 1)

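I want to deploy my Python script on an Ubuntu server and run it from cron. On my local Windows machine I tried headless mode and it worked fine; it could even take screenshots. On the server, however, the script fails with errors such as "element not found". Can someone tell me what is going on?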

Error traceback:

File "DomainScraper.py", line 30, in <module>
    login = driver.find_element_by_xpath('//a[@href="'+login_url+'"]').click()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 978, in find_element
    'value': value})['value']
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//a[@href="https://account.domaintools.com/log-in/?r=https%3A%2F%2Freversewhois.domaintools.com%2F%3Frefine"]"}
  (Session info: headless chrome=81.0.4044.17)

My code, as deployed on the server:

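import json
import time

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys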

options = Options()
options.add_argument('--incognito')
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-extensions')
options.add_argument('--disable-infobars')
options.add_argument('--allow-running-insecure-content')
driver = webdriver.Chrome(options=options)
driver.delete_all_cookies()

driver.implicitly_wait(3)
url = "https://reversewhois.domaintools.com/?refine#q=%5B%5B%5B%22whois%22%2C%222%22%2C%22VerifiedID%40SG-Mandatory%22%5D%5D%5D"
driver.get(url)
driver.save_screenshot("sample.png")
login_url = 'https://account.domaintools.com/log-in/?r=https%3A%2F%2Freversewhois.domaintools.com%2F%3Frefine'
login = driver.find_element_by_xpath('//a[@href="'+login_url+'"]').click()

username = driver.find_element_by_id("username")
password = driver.find_element_by_id("password")
username.send_keys("**********************")
password.send_keys("***************")
# time.sleep(5)
driver.find_element_by_id("password").send_keys(Keys.ENTER)

pageNumber = 0
while True:
    driver.implicitly_wait(3)
    driver.get('https://reversewhois.domaintools.com/?ajax=mReverseWhois&call=ajaxGetPreviewPage&q=%5B%5B%5B%22whois%22%2C%222%22%2C%22VerifiedID%40SG-Mandatory%22%5D%5D%5D&o='+str(pageNumber))
    time.sleep(3)
    pre = driver.find_element_by_tag_name("pre").text
    data = json.loads(pre)
    if data['body']:
        table = data['body']
        tables = pd.read_html(table,skiprows=1)
        df = tables[-1]
        df.to_csv('Domains.csv', mode='a', sep=',',index=False)
        print(df.to_string(index=False))
        pageNumber += 1
        # print(pageNumber)
        continue
    else:
        break

Update:

I tried installing and using two packages:

sudo apt install -y xvfb
pip install pyvirtualdisplay

and added this before starting Chrome:

from pyvirtualdisplay import Display

display = Display(visible=0, size=(800, 600))
display.start()

It does not seem to work at all. I took a screenshot and got the following output:

[screenshot] When I do not use xvfb, I only get a white screen.

I think Selenium is unable to open the URL. What should I do?
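One way to check whether the page was actually loaded is to log what the browser reports right after `driver.get(url)`. The sketch below is only an illustration: `describe_page` is a hypothetical helper name, and it assumes nothing more than that the driver exposes the standard `current_url`, `title` and `page_source` attributes.

```python
def describe_page(driver, max_chars=500):
    """Summarize what the browser actually loaded.

    `driver` is assumed to be a Selenium WebDriver; only its
    `current_url`, `title` and `page_source` attributes are read.
    """
    source = driver.page_source or ""
    return {
        "url": driver.current_url,          # final URL after any redirects
        "title": driver.title,              # empty if nothing was rendered
        "source_length": len(source),       # a very small value suggests a blank page
        "source_head": source[:max_chars],  # first part of the HTML, for the log
    }


# Hypothetical usage inside the script, right after loading the page:
# driver.get(url)
# print(describe_page(driver))
```

If the title is empty and the source is only a few dozen characters long, the page itself did not load (or was blocked), which is a different problem from an element that exists but cannot be located.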

python selenium selenium-webdriver web-scraping selenium-chromedriver
1 answer

Votes: 1

Are you sure the element is visible? I have noticed that the default window size in headless mode differs from the one in normal mode.

You can try changing the window size with:

options.add_argument('--window-size=1200,1040')
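
As a rough sketch of how this could fit into the script from the question, the snippet below collects the headless switches in one place, using Chrome's comma-separated `--window-size=width,height` form. The helper names `headless_chrome_arguments` and `build_driver` are made up for illustration, and the default size is simply the one suggested above, not a value known to fix this particular site.

```python
def headless_chrome_arguments(width=1200, height=1040):
    """Return Chrome switches for a headless run with a fixed window size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [
        "--headless",
        "--no-sandbox",
        # Chrome expects "width,height" separated by a comma.
        f"--window-size={width},{height}",
    ]


def build_driver(width=1200, height=1040):
    """Create a Chrome driver using the switches above (assumes Selenium and chromedriver are installed)."""
    # Imported lazily so the helper above can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in headless_chrome_arguments(width, height):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


# Hypothetical usage:
# driver = build_driver()
# driver.get(url)
# driver.save_screenshot("sample.png")  # compare with the Windows screenshot
```

Comparing the screenshot taken on the server with the one from Windows should show whether the login link is rendered at all at this window size.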