Navigating to a new link with rvest

Question (votes: 0, answers: 1)

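I am trying to follow a link on a web page to get to the next page. I want to extract information about all Planned Parenthood health centers in TN, starting from the web page below.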


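I would like to know how to start from this webpage and navigate to the page for the Knoxville Health Center. I tried using the rvest package with the following...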

library(rvest)
library(dplyr)
URL <- paste0("https://www.plannedparenthood.org/health-center/tn")
Webpage <- read_html(URL)
Webpage %>% html_nodes("p")  

Which gives me...

{xml_nodeset (6)}
[1] <p itemprop="name" data-facility-id="2610" data-affiliate-name="Planned Parenthood of Tennessee ...
[2] <p itemprop="name" data-facility-id="3348" data-affiliate-name="Planned Parenthood of Tennessee ...
[3] <p itemprop="name" data-facility-id="4247" data-affiliate-name="Planned Parenthood of Tennessee ...
[4] <p itemprop="name" data-facility-id="2716" data-affiliate-name="Planned Parenthood of Tennessee ...
[5] <p>Planned Parenthood delivers vital reproductive health care, sex education, and information t ...
[6] <p class="site-footer-legal">\n            <small>\n              © 2020 Planned Parenthood Fed ...

I'm not sure how to get beyond this point. Any help would be appreciated.

r web-scraping rvest
1 answer
1 vote

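You can get the links to the individual health center pages by selecting the <a> nodes inside the <p> nodes, extracting their href attribute, and prepending the site's base URL: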

library(rvest)
URL <- "https://www.plannedparenthood.org/health-center/tn"
Webpage <- read_html(URL)

all_links <- Webpage %>% 
               html_nodes("p a") %>%
               html_attr('href') %>%
               paste0('https://www.plannedparenthood.org', .)
all_links
#[1] "https://www.plannedparenthood.org/health-center/tennessee/knoxville/37914/knoxville-health-center-2610-91550"                 
#[2] "https://www.plannedparenthood.org/health-center/tennessee/memphis/38112/memphis-health-center-midtown-3348-91550"             
#[3] "https://www.plannedparenthood.org/health-center/tennessee/memphis/38122/memphis-health-center-near-summer-and-i240-4247-91550"
#[4] "https://www.plannedparenthood.org/health-center/tennessee/nashville/37203/nashville-health-center-2716-91550" 

You can now use these individual links to navigate further, for example by passing any one of them to read_html().
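As a minimal sketch of that next step, the code below picks the Knoxville link out of the vector and defines a small helper that reads a health center page. The helper name get_paragraphs is hypothetical (it is not part of rvest), and the "p" selector for the health center page is an assumption carried over from the question. The layout of that page is not shown here and may need a different selector.

```r
library(rvest)

# Links copied from the output above.
all_links <- c(
  "https://www.plannedparenthood.org/health-center/tennessee/knoxville/37914/knoxville-health-center-2610-91550",
  "https://www.plannedparenthood.org/health-center/tennessee/memphis/38112/memphis-health-center-midtown-3348-91550",
  "https://www.plannedparenthood.org/health-center/tennessee/memphis/38122/memphis-health-center-near-summer-and-i240-4247-91550",
  "https://www.plannedparenthood.org/health-center/tennessee/nashville/37203/nashville-health-center-2716-91550"
)

# Keep only the link whose URL contains "knoxville".
knoxville_link <- all_links[grepl("knoxville", all_links, fixed = TRUE)]

# Hypothetical helper (not an rvest function): fetch one page and
# return the trimmed text of its <p> nodes. The "p" selector is an
# assumption about the health center page.
get_paragraphs <- function(link) {
  read_html(link) %>%
    html_nodes("p") %>%
    html_text(trim = TRUE)
}

# Uncomment to fetch the pages (requires network access):
# knoxville_text <- get_paragraphs(knoxville_link)
# all_text <- lapply(all_links, get_paragraphs)
```

Wrapping the download in a function keeps the selection step separate from the network request, so the same helper can be applied to one link or mapped over all four with lapply().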
