RSelenium: clicking links within other links

Question (votes: 0, answers: 1)

I have this RSelenium script:

library(tidyverse)
library(RSelenium) # running through docker
library(rvest)
library(httr)

remDr <- remoteDriver(port = 4445L, browserName = "chrome")
remDr$open()


remDr$navigate("https://books.google.com/")
books <- remDr$findElement(using = "css", "[name = 'q']")

books$sendKeysToElement(list("NHL teams", key = "enter"))

bookElem <- remDr$findElements(using = "xpath",
                               "//h3[@class = 'LC20lb']//parent::a")

links <- sapply(bookElem, function(bookElem){
  bookElem$getElementAttribute("href")
})

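The code above collects every link returned by the Google search (there are 10 per page). Once you follow a link, most of the books I am searching for have a preview. If there is a preview, there is a small "About this book" link to click, which takes you to the publication information.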

I want to open the first link and then, if there is a preview, click "About this book". I have the following, but I just get the error Error: object of type 'closure' is not subsettable:

for(link in links) {

  # Navigate to each link
  remDr$navigate(link)

  # If statement to get past book previews
  if (str_detect(link, "frontcover")) {

   link2 <- remDr$findElement(using = 'xpath', 
                               '//*[@id="sidebar-atblink"]//parent::a')
   link2 <- as.list(link2)
   print(class(link2))
   link2_about <- sapply(link2, function(ugh){
      ugh$getElementAttribute('href')
    })

  } else {
    print("nice going, dumbass")
  }
}

Alternatively, when I try a for loop instead of sapply, I get Error: $ operator is invalid for atomic vectors:

for(link in links) {

  # Navigate to each link
  remDr$navigate(link)

  # If statement to get past book previews
  if (str_detect(link, "frontcover")) {

    link2 <- remDr$findElement(using = 'xpath',
       '//a[@id="sidebar-atb-link" and span[.="About this book"]]')

     for(i in length(link2)){
      i$getElementAttribute('href')
     }

    } else {
     print("dumbass")
   }
}
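
The second error can be reproduced without Selenium. The sketch below uses a plain list as a hypothetical stand-in for a web element (fake_elem is not an RSelenium object and only mimics the `$method()` call shape). It suggests the cause: `for (i in length(x))` loops over a single integer rather than over the elements, so `$` ends up applied to an atomic vector.

```r
# Hypothetical stand-in for a web element: a plain list with one method.
# This is NOT an RSelenium object; it only mimics the `$method()` call shape.
fake_elem <- list(
  getElementAttribute = function(attr) list("https://example.com/about")
)

elems <- list(fake_elem)

# length(elems) is the single integer 1, so the loop variable `i` is a number
# and `$` cannot be applied to it.
err <- tryCatch(
  {
    for (i in length(elems)) i$getElementAttribute("href")
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Iterating over the elements themselves works.
hrefs <- vapply(
  elems,
  function(el) el$getElementAttribute("href")[[1]],
  character(1)
)
print(hrefs)
```

Looping over the list itself (or over `seq_along(elems)` with `elems[[i]]`) avoids the problem.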

How can I successfully click the second link, depending on whether a preview exists? Thanks!

r selenium web-scraping dplyr rselenium
1 answer
1
vote

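Just update the following lines. findElements (plural) returns a list of web elements that sapply can iterate over, and an empty list when nothing matches.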

aboutLinks <- remDr$findElements(using = 'xpath', 
                           '//a[@id="sidebar-atb-link" and span[.="About this book"]]')
links2 <- sapply(aboutLinks, function(about_link){
  about_link$getElementAttribute('href')
})
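
Putting this together with the loop from the question, a minimal sketch might look like the following. The names collect_about_links and make_fake_driver are hypothetical and not part of RSelenium. The fake driver is only there so the sketch runs without a Selenium server, and it assumes preview pages can be recognised by "frontcover" in the URL, as in the question. With a real session you would pass the remDr object instead.

```r
# Sketch: visit each link and collect the "About this book" href when present.
# `remDr` is expected to behave like an RSelenium remoteDriver, i.e. to expose
# navigate() and findElements(). `collect_about_links` is a hypothetical name.
collect_about_links <- function(remDr, links) {
  about_xpath <- '//a[@id="sidebar-atb-link" and span[.="About this book"]]'
  out <- list()
  for (link in links) {
    remDr$navigate(link)
    # findElements() returns a list, which is empty when nothing matches,
    # so books without a preview are skipped instead of raising an error.
    about <- remDr$findElements(using = "xpath", value = about_xpath)
    if (length(about) > 0) {
      out[[link]] <- unlist(about[[1]]$getElementAttribute("href"))
    }
  }
  out
}

# Hypothetical fake driver (an assumption for illustration only): it pretends
# that pages whose URL contains "frontcover" have an "About this book" link.
make_fake_driver <- function() {
  current <- NULL
  list(
    navigate = function(url) current <<- url,
    findElements = function(using, value) {
      url <- current
      if (grepl("frontcover", url, fixed = TRUE)) {
        list(list(
          getElementAttribute = function(attr) list(paste0(url, "#about"))
        ))
      } else {
        list()
      }
    }
  )
}

demo_links <- list(
  "https://example.com/a?printsec=frontcover",
  "https://example.com/b"
)
about_links <- collect_about_links(make_fake_driver(), demo_links)
print(about_links)
```

In a live session, calling about[[1]]$clickElement() inside the if branch should click the link directly instead of collecting its href.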