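I'm practicing pulling tables from websites into R. It feels like every website needs its own strategy for this. I have a few working, but this one has me stumped: https://www.cbssports.com/fantasy/football/rankings/ppr/top200/
How would you get one (or all four) of these tables into R, presumably using the rvest package?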
library(rvest)
library(dplyr)    # tibble(), mutate(), n()
library(tidyr)    # separate_wider_delim()
library(stringr)  # str_remove()

url <- "https://www.cbssports.com/fantasy/football/rankings/ppr/top200/"

# get the html
html <- read_html(url)

# One label per table, in page order. Only "Consensus" appears in the output
# below; the other three are placeholders to replace with the real names.
authors <- c("Consensus", "Author 2", "Author 3", "Author 4")

# create the data frame
tibble(
  rank = html_elements(html, ".rank") |> html_text2(),
  name = html_elements(html, ".player-name") |> html_text2(),
  team_position_cost = html_elements(html, ".team") |> html_text2(),
  bye = html_elements(html, ".player-stats") |> html_text2()
) |>
  # add the authors column: each table contributes an equal share of the rows
  mutate(authors = rep(authors, each = n() / length(authors))) |>
  # split team_position_cost into separate columns
  separate_wider_delim(team_position_cost, delim = " ", names = c("team", "position", "cost")) |>
  mutate(
    position = as.factor(position),
    team = as.factor(team),
    cost = as.integer(str_remove(cost, "\\$")),
    bye = as.integer(bye)
  )
Output:
# A tibble: 800 × 7
   rank  name         team  position  cost   bye authors
   <chr> <chr>        <fct> <fct>    <int> <int> <chr>
 1 1     J. Jefferson MIN   WR          34    13 Consensus
 2 2     J. Chase     CIN   WR          33     7 Consensus
 3 3     C. McCaffrey SF    RB          34     9 Consensus
 4 4     A. Ekeler    LAC   RB          31     5 Consensus
 5 5     T. Hill      MIA   WR          30    10 Consensus
 6 6     C. Kupp      LAR   WR          30    10 Consensus
 7 7     B. Robinson  ATL   RB          27    11 Consensus
 8 8     S. Diggs     BUF   WR          26    13 Consensus
 9 9     T. Kelce     KC    TE          23    10 Consensus
10 10    S. Barkley   NYG   RB          26    13 Consensus
# ℹ 790 more rows
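
If you want to see how the labelling and splitting steps behave without hitting the site, here is a small offline sketch. Everything about the markup is an assumption: it is a made-up fragment built with rvest's minimal_html() that reuses the same class names as the selectors above, with two pretend "tables" of two players each, and "Author 2" is a placeholder label. The real page may be structured differently.

```r
library(rvest)
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical markup (not copied from the real page): two "tables" of two
# players each, using the class names the selectors above rely on.
page <- minimal_html('
  <div><span class="rank">1</span><span class="player-name">J. Jefferson</span>
       <span class="team">MIN WR $34</span><span class="player-stats">13</span></div>
  <div><span class="rank">2</span><span class="player-name">J. Chase</span>
       <span class="team">CIN WR $33</span><span class="player-stats">7</span></div>
  <div><span class="rank">1</span><span class="player-name">C. McCaffrey</span>
       <span class="team">SF RB $34</span><span class="player-stats">9</span></div>
  <div><span class="rank">2</span><span class="player-name">A. Ekeler</span>
       <span class="team">LAC RB $31</span><span class="player-stats">5</span></div>
')

# One label per table, in page order ("Author 2" is a placeholder).
authors <- c("Consensus", "Author 2")

players <- tibble(
  rank = html_elements(page, ".rank") |> html_text2(),
  name = html_elements(page, ".player-name") |> html_text2(),
  team_position_cost = html_elements(page, ".team") |> html_text2(),
  bye = html_elements(page, ".player-stats") |> html_text2()
) |>
  # rep(each = ...) repeats each label for one table's worth of rows
  mutate(authors = rep(authors, each = n() / length(authors))) |>
  # "MIN WR $34" becomes three columns
  separate_wider_delim(team_position_cost, delim = " ", names = c("team", "position", "cost")) |>
  mutate(
    cost = as.integer(str_remove(cost, "\\$")),
    bye = as.integer(bye)
  )

players
```

Note that the rep(each = ...) trick only works when every table has the same number of rows. If one table is shorter, the labels drift onto the wrong players, so it is worth checking that the row count divides evenly by the number of tables before trusting the authors column.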