我正在运行这个功能:
require(XML)
require(plyr)
getKeyStats_xpath <- function(symbol) {
yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")
#search for <td> nodes anywhere that have class 'yfnc_tablehead1'
nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")
if(length(nodes) > 0 ) {
measures <- sapply(nodes, xmlValue)
#Clean up the column name
measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))
#Remove dups
dups <- which(duplicated(measures))
#print(dups)
for(i in 1:length(dups))
measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")
#use siblings function to get value
values <- sapply(nodes, function(x) xmlValue(getSibling(x)))
df <- data.frame(t(values))
colnames(df) <- measures
return(df)
} else {
break
}
}
只要页面存在,它就可以正常工作。但是,如果我的某个代码没有该URL的任何数据,则会引发错误:
Error in FUN(X[[3L]], ...) : no loop for break/next, jumping to top level
我也添加了一条痕迹,并且在3号代码上发生了故障。
tickers <- c("QLTI",
"RARE",
"RCPT",
"RDUS",
"REGN",
"RGEN",
"RGLS")
tryCatch({
stats <- ldply(tickers, getKeyStats_xpath)
}, finally={})
我想调用这样的函数:
stats <- ldply(tickers, getKeyStats_xpath)
rownames(stats) <- tickers
write.csv(t(stats), "FinancialStats_updated.csv",row.names=TRUE)
基本上,如果股票代码没有数据,我想跳过它。
有人可以帮我搞定吗?