Scraping the GitHub commit author element

Question · Votes: 0 · Answers: 1

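Can any HTML expert extract the text of an element on this page? https://github.com/tidyverse/ggplot2

The text I need is that of the commit author element, i.e. the username.

I am currently using rvest in R and have tried XPath, CSS selectors and so on, but I just cannot extract the username. If necessary, I would be happy to extract a link that contains the username and clean up the text with a regex.

Any help is much appreciated.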

html r github web-scraping rvest

1 Answer

1 vote

library(rvest)

# Select the commit author element by its CSS class and return its text
read_html("https://github.com/tidyverse/ggplot2") %>%
  html_nodes(".user-mention") %>%
  html_text()

# [1] "thomasp85"
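Since the question mentions falling back to a link plus a regex, here is a minimal sketch of that route. The HTML fragment, the `commit-author` class and the `?author=` query parameter are assumptions made for illustration; the markup GitHub actually serves may differ and can change at any time.

```r
library(rvest)

# Hypothetical fragment standing in for the repository page (assumed markup).
html <- '<div><a class="commit-author" href="/tidyverse/ggplot2/commits?author=thomasp85">thomasp85</a></div>'

# Collect the href attribute of every matching link.
hrefs <- read_html(html) %>%
  html_nodes("a.commit-author") %>%
  html_attr("href")

# Keep only the value of the (assumed) author query parameter.
users <- sub(".*[?&]author=([^&]+).*", "\\1", hrefs)
users
```

The regex only strips everything around the `author` value, so it would need adjusting if the link encodes the username differently.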

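That said, if you want to collect this information from multiple repositories, consider using the official GitHub REST API, or a lightweight R client package for it, instead of scraping the HTML.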
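As a rough sketch of the REST API route, the following uses jsonlite to read the "list commits" endpoint and pull out each author's login. The helper names `commits_url`, `author_logins` and `latest_author` are made up for this example, and it assumes unauthenticated access is enough, which GitHub rate-limits fairly tightly.

```r
library(jsonlite)

# Build the "list commits" URL for a repository (hypothetical helper).
commits_url <- function(owner, repo, per_page = 1) {
  sprintf("https://api.github.com/repos/%s/%s/commits?per_page=%d",
          owner, repo, as.integer(per_page))
}

# Extract the GitHub login of each commit author from the parsed response.
# The API returns a null author when a commit is not linked to an account,
# so those entries become NA.
author_logins <- function(commits) {
  vapply(commits, function(x) {
    login <- x$author$login
    if (is.null(login)) NA_character_ else login
  }, character(1))
}

# Fetch the most recent commit(s) and return the author login(s).
# This performs a network request when called.
latest_author <- function(owner, repo, per_page = 1) {
  commits <- fromJSON(commits_url(owner, repo, per_page),
                      simplifyVector = FALSE)
  author_logins(commits)
}

# Example call (not run here):
# latest_author("tidyverse", "ggplot2")
```

To cover several repositories, the same function can be applied over a vector of names, e.g. with `sapply()`, keeping the request count within the rate limit.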
