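I'm trying to convert a Python project (which uses Selenium to scrape tweets from Twitter instead of going through the limited Twitter API) to R. It works fine in Python, but I want to recreate it in R. I'm new to R, though I have some MATLAB experience if that helps.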
install.packages("RSelenium") # install RSelenium 1.7.1
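As far as I know, the package has been updated, so I need to use something other than startServer(). But from all my research I've found slightly conflicting answers, and none of them work: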
require(RSelenium) #used require() and library()
remDr <- remoteDriver(browserName = "chrome")
remDr$open()
I get this error:
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4444: Connection refused
I also tried:
require(RSelenium)
remDr <- rsDriver(browser = c("chrome"))
I get:
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking chromedriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking geckodriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking phantomjs versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
[1] "Connecting to remote server"
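The Chrome browser (61.0.3163.100) launches, but the call hangs at that last line, so I can't run the next line of code. The browser stays open for about half a minute before closing itself, and I get this error: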
Selenium message:unknown error: unable to discover open pages
(Driver info: chromedriver=2.33.506120 (e3e53437346286c0bc2d2dc9aa4915ba81d9023f),platform=Windows NT 6.1.7601 SP1 x86_64) (WARNING: The server did not provide any stacktrace information)
Command duration or timeout: 60.44 seconds
Build info: version: '3.6.0', revision: '6fbf3ec767', time: '2017-09-27T16:15:40.131Z'
System info: host: 'RENTEC-THINK', ip: '192.168.56.1', os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.8.0_144'
Driver info: driver.version: unknown
Error: Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
Further Details: run errorDetails method
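I've tried a number of different things, including downloading chromedriver (v2.33 should support Chrome v60-62, https://sites.google.com/a/chromium.org/chromedriver/downloads) and either passing its path to remoteDriver or adding the path as a system variable.
It feels like nothing I do works, as if the RSelenium update broke everything. Am I doing something stupid?
After all the inconsistent answers I've seen online, I've reached the point where I'm trying different combinations of different lines of code, mixing everything together, and hoping to get there by trial and error alone.
My next attempt will be to find out where R installed RSelenium and then look at what's in the code :(
I've also considered Docker, but I'd rather not install a separate application just to get my code working.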
Try:
remDr <- remoteDriver(browserName = "chrome")
Sys.sleep(5)
remDr$open()
Sometimes the driver tries to open too quickly and you get the "Failed to connect to localhost port 4444: Connection refused" error.
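A fixed Sys.sleep(5) is a guess at how long the server needs. As a rough alternative, the sketch below polls the port until something accepts connections and only then opens the driver. Both wait_for_port and open_when_ready are hypothetical helper names, not part of RSelenium, and the sketch assumes a Selenium server is already being started by some other means. remoteDriver() does not start a server itself, so if nothing ever listens on the port the wait simply times out.

```r
# Hypothetical helper (not an RSelenium function): poll until something
# accepts TCP connections on host:port, or give up after `timeout` seconds.
wait_for_port <- function(port, host = "localhost", timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    is_open <- tryCatch({
      con <- suppressWarnings(
        socketConnection(host = host, port = port, blocking = TRUE,
                         open = "r+", timeout = 1)
      )
      close(con)
      TRUE
    }, error = function(e) FALSE)  # connection refused: server not ready yet
    if (is_open) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Hypothetical wrapper: open the remote driver only once the port answers.
# Assumes the RSelenium package is installed.
open_when_ready <- function(port = 4444L, browser = "chrome") {
  if (!wait_for_port(port)) {
    stop("No Selenium server is listening on port ", port)
  }
  remDr <- RSelenium::remoteDriver(browserName = browser, port = port)
  remDr$open()
  remDr
}
```

With a server starting up in the background, remDr <- open_when_ready(4444L) would replace the sleep-then-open pair above.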
The following worked for me. Pay attention to the browser, Selenium, and driver versions...
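library(RSelenium)
library(magrittr)  # provides %>%

# retcommand = TRUE returns the server launch command instead of running it;
# system() then starts it in the background (invisible only has an effect on Windows)
wdman::selenium(port = 4444L, geckover = "0.24.0",
                version = "3.141.59", check = FALSE, retcommand = TRUE) %>%
  system(wait = FALSE, invisible = FALSE)

rmDrv <- remoteDriver(extraCapabilities = list(marionette = TRUE),
                      browserName = "firefox", port = 4444L)
rmDrv$open()
rmDrv$navigate("https://www.google.com")
rmDrv$close()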
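The same version pinning can probably be done through rsDriver(), which starts the server and opens the browser in one call and returns a list with `client` and `server` elements. The sketch below is only an illustration: start_pinned_session is a hypothetical wrapper name, the version strings are copied from the example above and assumed to be already downloaded (hence check = FALSE), and port 4567 is rsDriver's default.

```r
# Hypothetical wrapper around RSelenium::rsDriver() with explicit versions.
# The version strings are assumptions; use ones that are installed locally
# (binman::list_versions("geckodriver") lists what is available).
start_pinned_session <- function(browser = "firefox",
                                 selenium_version = "3.141.59",
                                 gecko_version = "0.24.0",
                                 port = 4567L) {
  RSelenium::rsDriver(browser = browser, port = port,
                      version = selenium_version,
                      geckover = gecko_version,
                      check = FALSE,    # skip the download/version check
                      verbose = FALSE)
}

# Usage sketch (needs a local Firefox and the pinned binaries):
# rD <- start_pinned_session()
# remDr <- rD$client            # an already-opened remoteDriver
# remDr$navigate("https://www.google.com")
# remDr$close()
# rD$server$stop()              # shut the Selenium server down as well
```

Note that the returned object is not the driver itself; the browser is controlled through `rD$client`.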