df <- read.csv('https://raw.githubusercontent.com/ulklc/covid19-timeseries/master/countryReport/raw/rawReport.csv',
               stringsAsFactors = FALSE)
# My attempt, which does not work:
df11 <- aggregate(df$confirmed, by=list(df$countryName) subset(df,df$confirmed < df$recovered) , FUN == "max"))
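For each country, I want to find the date on which the number of recovered cases exceeded the number of confirmed cases.
The desired output looks like this: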
day countryName confirmed recovered
2020/04/10 Spain 1500 1550
2020/01/19 italy 862 900
...
The data above are examples, not actual values.
No country has more recovered cases than confirmed cases (so far). This can be verified with:
library(dplyr)
df %>%
mutate(day=as.Date(day)) %>%
group_by(countryName) %>%
filter(confirmed<recovered) %>%
top_n(1, wt=day)
# A tibble: 0 x 9
Now let's suppose that tomorrow all cases in the United States miraculously recover and no new cases occur ;)
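# 'United States' is quoted so that read.table() treats it as a single field
tomorrow <- read.table(text="
day countryCode countryName region lat lon confirmed recovered death
21379 2020/05/09 US 'United States' Americas 38 -97 1187233 1187234 68566", header=TRUE,
stringsAsFactors = FALSE)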
Let's add this row to the data and run the code again.
df2 <- rbind(df, tomorrow)
df2 %>%
mutate(day=as.Date(day)) %>%
group_by(countryName) %>%
filter(confirmed<recovered) %>%
top_n(1, wt=day)
# A tibble: 1 x 9
# Groups: countryName [1]
# day countryCode countryName region lat lon confirmed recovered death
# <date> <chr> <chr> <chr> <dbl> <dbl> <int> <int> <int>
# 1 2020-05-09 US United States Americas 38 -97 1187233 1187234 68566
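
Note that top_n(1, wt=day) keeps the latest day on which recovered exceeds confirmed. If what is wanted is the first such day per country, a minimal base R sketch could look like the following. The data frame toy and all its values are made up for illustration (they are not from the real report), and the names crossed and first_cross are hypothetical.

```r
# Toy data: made-up values, not taken from the real report
toy <- data.frame(
  day = c("2020/04/09", "2020/04/10", "2020/04/11",
          "2020/01/18", "2020/01/19",
          "2020/03/01", "2020/03/02"),
  countryName = c("Spain", "Spain", "Spain",
                  "Italy", "Italy",
                  "France", "France"),
  confirmed = c(1400, 1500, 1600, 850, 862, 100, 120),
  recovered = c(1300, 1550, 1700, 800, 900, 10, 20),
  stringsAsFactors = FALSE
)
toy$day <- as.Date(toy$day, format = "%Y/%m/%d")

# Keep only the rows where recovered exceeds confirmed
crossed <- subset(toy, recovered > confirmed)

# Sort by country and date, then keep the first remaining row per country
crossed <- crossed[order(crossed$countryName, crossed$day), ]
first_cross <- crossed[!duplicated(crossed$countryName), ]
rownames(first_cross) <- NULL

print(first_cross)
```

Countries that never satisfy the condition (France in the toy data) simply do not appear in the result, which matches the empty tibble seen with the real data.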