Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host:

Question

My R code (see below) produces these errors in some cases:

[1] "2023-08-12 16:47:37.463"
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com
Request failed [ERROR]. Retrying in 1.3 seconds...
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com
Request failed [ERROR]. Retrying in 1 seconds...
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com

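api.abc.com is not the real API I use. I use a commercial API, whose provider noted that their servers were not down at the specific moment shown above. In the cases where the server is down, it returns HTTP code 503.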

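I have two questions:

1. What could be the cause of these errors?
2. How can I make the script below keep running when these errors occur? At the moment it stops after these error messages. I did not expect that, because I use `RETRY` and `GET` in my code.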

My code below is called every 10 seconds by the scheduler `tclTaskSchedule` (see the end of the code). In this sample code I use a free API (universities.hipolabs.com) as an example.

library(httr) # accessing APIs
library(jsonlite) # JSON parsing
library(dplyr)
library(readr)
library(purrr)
library(tidyr)
library(stringr)
library(tibble)
library(tcltk2)
library(lubridate)

run_api_once <- function() {
  
  mydatalist <- list() #create an empty list
  
  my_next_page_with_number <- "http://universities.hipolabs.com/search?country=United+States"

    mydata1 <- RETRY("GET", my_next_page_with_number)
    
    if(mydata1$status_code != 200){
      print(mydata1$status_code)
      http_responses <<- append(http_responses, paste(mydata1$status_code, Sys.time()))
      has_more_pages <- FALSE
      
    } else {
      
      rawdata <- rawToChar(mydata1$content)
      mydata2 <- fromJSON(rawdata, flatten = FALSE, simplifyVector = FALSE)
      
      mydata <- mydata2
      
      mydatalist <- c(mydatalist, mydata)
    }
  
  
    y <- Sys.time()
    y <- format(y, "%Y-%m-%d %H:%M")
    print(y)
    
  users <- tibble(user = mydatalist)
  myvar <<- users %>% unnest_wider(user) 

return(myvar)
  
}


# call function every 10 seconds:
tclTaskSchedule(10000, run_api_once(), id = "run_api_once", redo = TRUE)

# end session:
tclTaskDelete(NULL)

I think this is irrelevant, but for completeness: I use Plumber to stream the contents of myvar to a local server on my PC. See the code below:

# stream df myvar to local api at port 8405:
library(plumber)
pr("D:/plumber_universities2test.R") %>%
# pr("C:/plumber_universities2test.R") %>%
  pr_run(port=8405)

This calls the following script:

library(plumber)
library(dplyr)

#* @param symbol Ticker symbol (just to input something in the function)
#* @get /return
#* @serializer json list(na="string")

universities_data <- function(symbol) {
  data <- myvar
  data 
}

Many thanks!

Answer 1

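To answer your questions:

1. There are several possible causes: you are not connected to the internet; a firewall is interfering and blocking `httr`; or you are making a request to an invalid URL. Without seeing the actual URL you are requesting I can't be sure, but I would guess the third option is the most likely. You should check whether you made a mistake when pasting the specific URL, for example `"google.comsearch"` instead of `"google.com/search"`.
2. `RETRY` does not behave the way you expect because this is not an HTTP error status returned by the server; your request could not be performed at all. To demonstrate the difference, let's look at the behavior of a simple function that makes a request to a URL that automatically returns an HTTP error, and then to a URL that does not exist at all: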

library(httr)

test_fun <- function(u) {
  RETRY("GET", u, times = 2)
  print("still running")
}

# response contains error
test_fun("https://httpbin.org/status/429")
#> Request failed [429]. Retrying in 1 seconds...
#> [1] "still running"

# no response since there is no server at `test.coms`
test_fun("test.coms")
#> Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: test.coms
#> Request failed [ERROR]. Retrying in 1 seconds...
#> Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: test.coms

Created on 2023-08-13 with reprex v2.0.2

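As you can see, the first example still executes the rest of the function's code, while the second stops with an error. I would recommend looking closely at why the request does not reach the server. If you are sure there is no better way, you can wrap `try` around `RETRY`: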

mydata1 <- try(RETRY("GET", my_next_page_with_number))
# the request could not be performed at all: substitute a placeholder status
if (inherits(mydata1, "try-error")) mydata1 <- list(status_code = 404)
if(mydata1$status_code != 200){
  # your code ...
}
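
As an alternative to `try`, a minimal sketch using `tryCatch` might look like the following. It returns `NULL` when the request could not be performed, so the caller does not have to fake a status code. The helper name `safe_get` and its `fetch` argument are hypothetical and not part of httr; `fetch` exists only so the helper can be exercised without a network connection.

```r
# Hypothetical helper: returns the httr response, or NULL when the request
# could not be performed at all (e.g. "Could not resolve host").
# The default `fetch` is only evaluated when it is actually called.
safe_get <- function(url,
                     fetch = function(u) httr::RETRY("GET", u, times = 3)) {
  tryCatch(
    fetch(url),
    error = function(e) {
      # Log the failure with a timestamp and carry on
      message(format(Sys.time()), " request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Possible use inside run_api_once() (sketch only):
# mydata1 <- safe_get(my_next_page_with_number)
# if (is.null(mydata1)) return(invisible(NULL))  # skip this cycle
# if (httr::status_code(mydata1) != 200) { ... }
```

Skipping the cycle on `NULL` leaves the previous value of `myvar` in place, which may or may not be what you want for the Plumber endpoint.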

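But in my opinion, the behavior of `RETRY` is correct, because it does not simply ignore possible errors in your code or internet configuration (which are not server-side problems).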
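
To narrow down which of the causes above applies, one option is to check name resolution from the same machine at the moment the error occurs. The sketch below uses `curl::has_internet()` and `curl::nslookup()`; the function name `diagnose_host` is hypothetical, and its `online` and `ip` arguments are exposed only so the logic can be tried without a network.

```r
# Hypothetical diagnostic helper (not part of curl or httr).
# The defaults are evaluated lazily, so no lookup happens if values are passed in.
diagnose_host <- function(host,
                          online = curl::has_internet(),
                          ip = curl::nslookup(host, error = FALSE)) {
  if (!online) {
    # has_internet() tests connectivity via a DNS lookup
    "no internet/DNS available on this machine"
  } else if (is.null(ip)) {
    # with error = FALSE, nslookup() returns NULL instead of raising an error
    paste0("online, but '", host, "' does not resolve")
  } else {
    paste0("'", host, "' resolves to ", ip)
  }
}

# Example call, using the host name from the error message in the question:
# message(diagnose_host("api.abc.com"))
```

Because the errors in the question only appear occasionally, calling something like this from the error handler and logging the result with a timestamp may show whether the failures coincide with local connectivity or DNS outages.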
