R function to download XLS files from a URL

Question

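After receiving an answer from Montgomery Clift in another post (see here), I tried writing a function that loops over the days in a one-month span to collect data from Baseball Prospectus (example page here). The code successfully downloads each day's file, but then fails with the following error: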

Error in list_to_dataframe(res, attr(.data, "split_labels"), .id, 
id_as_factor) : Results must be all atomic, or all data frames

Here is the function, followed by the call that tries to collect all of the data:

fetch_adjusted <- function(day) {
    fname <- paste0("standings201909", day, ".html")
    download.file(url = paste0("https://legacy.baseballprospectus.com/standings/index.php?odate=2019-09-", day),
                  destfile = fname)
    doc0 <- htmlParse(file = fname, encoding = "UTF-8")
    doc1 <- xmlRoot(doc0)
    doc2 <- getNodeSet(doc1, "//table[@id='content']")
    standings <- readHTMLTable(doc2[[1]], header = TRUE, skip.rows = 1,
                               stringsAsFactors = FALSE)
    standings <- standings[[1]]
    standings$day <- day
    standings
}

Sept <- ldply(1:29, fetch_adjusted, .progress="text")

Can anyone help me figure out how to adjust my current code so that it runs without this error? Thanks!

UPDATE:

I am now able to do the following, which successfully downloads the xls file for each of several dates in a range:

dates <- seq(as.Date("2019-09-01"), as.Date("2019-09-30"), by=1)

fetch_adjusted <- function(dates) {
  url <- paste0("https://legacy.baseballprospectus.com/standings/index.php?odate=",
                dates, "&otype=xls")
  destfile <- "test.xls"
  download.file(url, destfile, mode = "wb")
}

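But now, regardless of which mode I use ("w", "wb", "a"), it does not append the files, so all I end up with is the last file (in this case 2019-09-30), which is an empty spreadsheet. My thinking is that each download simply overwrites the previous file. Is there a solution?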

Answer 1

Per Karthik's comment above, the following does the trick:

library(plyr)  # provides ldply()

dates <- seq(as.Date("2019-09-01"), as.Date("2019-09-30"), by=1)

fetch_adjusted <- function(dates) {
  url <- paste0("https://legacy.baseballprospectus.com/standings/index.php?odate=",
                dates, "&otype=xls")
  # Name each file after its date so downloads no longer overwrite one another
  destfile <- paste0("/Desktop/Test/", dates, ".xls")
  download.file(url, destfile, mode = "wb")
}

Sept <- ldply(dates, fetch_adjusted, .progress = "text")
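
As a variation on the same idea, here is a minimal sketch in base R that separates building the per-date URLs and file names from the actual download, so the naming can be checked before anything is fetched. The helper names `build_targets` and `fetch_all` and the `out_dir` argument are hypothetical and only for illustration; the URL pattern and the `otype=xls` parameter are taken from the code above. It assumes the site still serves one spreadsheet per date.

```r
# Build one URL and one destination path per date (no network access needed).
# `build_targets` and `out_dir` are illustrative names, not part of any package.
build_targets <- function(dates, out_dir) {
  base_url <- "https://legacy.baseballprospectus.com/standings/index.php"
  day <- format(dates, "%Y-%m-%d")
  data.frame(
    date = dates,
    url = paste0(base_url, "?odate=", day, "&otype=xls"),
    destfile = file.path(out_dir, paste0(day, ".xls")),
    stringsAsFactors = FALSE
  )
}

# Download every target; a day that raises an error is recorded as FALSE
# instead of stopping the whole loop. `fetch_all` is also an illustrative name.
fetch_all <- function(targets) {
  dir.create(dirname(targets$destfile[1]), recursive = TRUE, showWarnings = FALSE)
  targets$ok <- vapply(seq_len(nrow(targets)), function(i) {
    tryCatch(
      # download.file() returns 0 on success
      download.file(targets$url[i], targets$destfile[i],
                    mode = "wb", quiet = TRUE) == 0,
      error = function(e) FALSE
    )
  }, logical(1))
  targets
}

dates <- seq(as.Date("2019-09-01"), as.Date("2019-09-30"), by = 1)
targets <- build_targets(dates, out_dir = file.path(tempdir(), "standings"))
# results <- fetch_all(targets)  # uncomment to actually download the files
```

Because every date gets its own `destfile`, nothing is overwritten, and the `ok` column of the returned data frame shows which days failed to download.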
