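I'm a beginner at web scraping with Selenium. I'm trying to open a specific Google Chrome profile (because all my sites are already logged in there). The good news is that the code does open a window for the right profile (the profile is Default). However, it fails to open the site in that window and start scraping. Here is the code: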
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time
# Installed chromedriver.exe
# Changing some arguments to set Chrome profile
options = webdriver.ChromeOptions()
# options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
# Location where Chrome stores profiles
# options.add_arguments = {"user-data-dir": r"C:\Users\Kavipriyan\AppData\Local\Google\Chrome\User Data\Default"}
options.add_argument(r"--user-data-dir=C:\Users\Kavipriyan\AppData\Local\Google\Chrome\User Data")
# Profile name
options.add_argument(r"--profile-directory=Default")
service = Service(executable_path="chromedriver.exe")
driver = webdriver.Chrome(options=options, service=service)
driver.get("https://twitter.com/home")
time.sleep(3)
driver.close()
The code also prints this huge error in the console (I'm not sure which parts are actually needed, so I've pasted everything it printed):
Opening in existing browser session.
Traceback (most recent call last):
  File "d:\COMPUTER FILES\My Stuff\Code Programs - Python\Twitterscraping 2.0\Twitterscraper 3.2.py", line 23, in <module>
    driver = webdriver.Chrome(options=options, service=service)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Kavipriyan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\chrome\webdriver.py", line 45, in __init__
    super().__init__(
  File "C:\Users\Kavipriyan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\chromium\webdriver.py", line 61, in __init__
    super().__init__(command_executor=executor, options=options)
  File "C:\Users\Kavipriyan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\remote\webdriver.py", line 208, in __init__
    self.start_session(capabilities)
  File "C:\Users\Kavipriyan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\remote\webdriver.py", line 292, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Kavipriyan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\remote\webdriver.py", line 347, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Kavipriyan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\selenium\webdriver\remote\errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location C:\Program Files\Google\Chrome\Application\chrome.exe is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
GetHandleVerifier [0x00007FF667BC7062+63090]
(No symbol) [0x00007FF667B32CB2]
(No symbol) [0x00007FF6679CEC65]
(No symbol) [0x00007FF667A00777]
(No symbol) [0x00007FF6679FB2F4]
(No symbol) [0x00007FF667A40BFB]
(No symbol) [0x00007FF667A40830]
(No symbol) [0x00007FF667A36D83]
(No symbol) [0x00007FF667A083A8]
(No symbol) [0x00007FF667A09441]
GetHandleVerifier [0x00007FF667FC25CD+4238301]
GetHandleVerifier [0x00007FF667FFF72D+4488509]
GetHandleVerifier [0x00007FF667FF7A0F+4456479]
GetHandleVerifier [0x00007FF667CA05A6+953270]
(No symbol) [0x00007FF667B3E57F]
(No symbol) [0x00007FF667B39254]
(No symbol) [0x00007FF667B3938B]
(No symbol) [0x00007FF667B29BC4]
BaseThreadInitThunk [0x00007FFDF5047344+20]
RtlUserThreadStart [0x00007FFDF56426B1+33]
Thanks in advance!
The error indicates that Chrome failed to start properly.
This can have several causes, such as a version mismatch between Chrome, ChromeDriver, and Selenium.
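To rule out a version mismatch, it can help to compare the major version numbers reported by Chrome and ChromeDriver, since ChromeDriver is generally expected to match the browser's major version. Below is a minimal sketch, assuming you paste the two version strings in by hand; `major_version`, `versions_compatible`, and the sample version strings are hypothetical and not taken from the question:

```python
import re


def major_version(version_output: str) -> int:
    """Return the major version number found in a `--version` style string."""
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_compatible(chrome_version: str, driver_version: str) -> bool:
    """Treat Chrome and ChromeDriver as compatible when their major versions match."""
    return major_version(chrome_version) == major_version(driver_version)


# Hypothetical sample strings, e.g. copied from chrome://version
# and from running `chromedriver.exe --version`.
chrome = "Google Chrome 120.0.6099.130"
driver = "ChromeDriver 120.0.6099.109 (hash)"
print(versions_compatible(chrome, driver))
```

If the two major versions differ, downloading the ChromeDriver build that matches the installed Chrome is the usual next step.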
Try this:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
import time
# Specify the path to your ChromeDriver executable
chrome_driver_path = "chromedriver.exe"
# Set Chrome options
options = Options()
# Specify the path to the user data directory
options.add_argument(r"--user-data-dir=C:\Users\Kavipriyan\AppData\Local\Google\Chrome\User Data")
# Specify the profile directory
options.add_argument(r"--profile-directory=Default")
# Add other optional arguments if needed
# options.add_argument("--headless") # Uncomment if you want to run Chrome in headless mode
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
# Initialize the Chrome service and driver
service = Service(executable_path=chrome_driver_path)
driver = webdriver.Chrome(service=service, options=options)
# Load the desired URL
driver.get("https://twitter.com/home")
time.sleep(3)
# Do your scraping here...
# Close the browser session
driver.quit()
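The first line of the output in the question, `Opening in existing browser session.`, hints at another possible cause: Chrome was probably already running with the same user data directory, so the newly launched process handed off to the existing one and exited. Below is a minimal sketch, assuming Windows and the standard `tasklist` command, that checks for a running `chrome.exe` before the driver is started; the helper names `chrome_running` and `chrome_running_on_windows` are hypothetical:

```python
import csv
import io
import subprocess


def chrome_running(tasklist_csv: str, image_name: str = "chrome.exe") -> bool:
    """Return True if `image_name` appears in `tasklist /FO CSV /NH` output."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    # The first CSV column of each row is the process image name.
    return any(row and row[0].lower() == image_name for row in rows)


def chrome_running_on_windows() -> bool:
    """Run `tasklist` (Windows only) and look for chrome.exe in its output."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return chrome_running(result.stdout)


# Intended use, before creating the driver (Windows only):
# if chrome_running_on_windows():
#     raise SystemExit("Close all Chrome windows before running the scraper.")
```

If the check reports a running Chrome, closing every Chrome window (including background processes) and running the script again is worth trying before changing anything else.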