Organizing a dataset and classifying the edited data [closed]

Question (votes: 0, answers: 1)
df <- read.csv('https://raw.githubusercontent.com/ulklc/covid19-timeseries/master/countryReport/raw/rawReport.csv',
               stringsAsFactors = FALSE)

# My attempt, which does not work:
df11 <- aggregate(df$confirmed, by=list(df$countryName) subset(df,df$confirmed < df$recovered) , FUN == "max"))

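For each country, I want to find the date on which the number of recovered cases exceeds the number of confirmed cases.

The expected output would look like this: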

day              countryName        confirmed     recovered 
2020/04/10         Spain              1500          1550
2020/01/19         italy              862            900
...

The values above are examples, not actual data.

r
1 Answer

So far, no country has a recovered count greater than its confirmed count. You can verify this with:

library(dplyr)

df %>%
  mutate(day=as.Date(day)) %>%
  group_by(countryName) %>%
  filter(confirmed<recovered) %>%
  top_n(1, wt=day)

# A tibble: 0 x 9
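
The empty result can also be cross-checked without dplyr. Here is a minimal sketch on made-up numbers; the `toy` data frame is hypothetical and only stands in for `df`, which is assumed to have numeric `confirmed` and `recovered` columns:

```r
# Toy data, invented for illustration only; not taken from the real report.
toy <- data.frame(confirmed = c(10, 25, 40),
                  recovered = c(0, 5, 12))

# TRUE if at least one row has more recoveries than confirmed cases.
has_hit <- any(toy$confirmed < toy$recovered, na.rm = TRUE)

# Number of rows where recoveries exceed confirmed cases.
n_hits <- sum(toy$confirmed < toy$recovered, na.rm = TRUE)

print(has_hit)
print(n_hits)
```

Running the same two expressions against `df` instead of `toy` should agree with the zero-row tibble.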

Now let's suppose that tomorrow all cases in the United States miraculously recover and no new cases occur ;)

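# 'United States' is quoted so read.table() treats it as a single field;
# the leading 21379 is then read as the row name.
tomorrow <- read.table(text="
             day countryCode     countryName   region lat lon confirmed recovered death
21379 2020/05/09          US 'United States' Americas  38 -97   1187233   1187234 68566",
  header=TRUE, stringsAsFactors=FALSE)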

Let's add this row to the data and run the code again.

df2 <- rbind(df, tomorrow)
df2 %>%
  mutate(day=as.Date(day)) %>%
  group_by(countryName) %>%
  filter(confirmed<recovered) %>%
  top_n(1, wt=day)

#  A tibble: 1 x 9
#  Groups:   countryName [1]
#   day        countryCode countryName   region     lat   lon confirmed recovered death
#   <date>     <chr>       <chr>         <chr>    <dbl> <dbl>     <int>     <int> <int>
# 1 2020-05-09 US          United States Americas    38   -97   1187233   1187234 68566
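
The question can be read two ways: the latest day on which recoveries exceed confirmed cases (which is what `top_n(1, wt=day)` returns), or the first such day. Here is a base R sketch covering both, run on invented numbers; the `toy` data frame and the names `hits`, `first_hit` and `last_hit` are hypothetical, not part of the original data:

```r
# Toy data, invented for illustration only (not real figures).
toy <- data.frame(
  day         = c("2020/04/08", "2020/04/09", "2020/04/10",
                  "2020/04/08", "2020/04/09"),
  countryName = c("Spain", "Spain", "Spain", "Italy", "Italy"),
  confirmed   = c(1400, 1450, 1500, 860, 862),
  recovered   = c(1300, 1460, 1550, 850, 900),
  stringsAsFactors = FALSE
)
toy$day <- as.Date(toy$day, format = "%Y/%m/%d")

# Keep only the rows where recoveries exceed confirmed cases.
hits <- toy[toy$confirmed < toy$recovered, ]

# Sort by country and day so the first/last row per country is well defined.
hits <- hits[order(hits$countryName, hits$day), ]

# Earliest qualifying day per country.
first_hit <- hits[!duplicated(hits$countryName), ]

# Latest qualifying day per country (the analogue of top_n(1, wt = day)).
last_hit <- hits[!duplicated(hits$countryName, fromLast = TRUE), ]

print(first_hit)
print(last_hit)
```

Because filtering happens before picking a row, countries with no qualifying day simply drop out, which matches the behaviour of the dplyr pipeline.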