Fill in NA when the dataset contains zeros


Suppose you have the following data frame:

df <- data.frame(year=c(rep(2010,12),rep(2011,12),rep(2012,12)),
                country=c(rep("DEU",4),rep("ITA",4),rep("USA",4),
                          rep("DEU",4),rep("ITA",4),rep("USA",4),
                          rep("DEU",4),rep("ITA",4),rep("USA",4)),
                industry=c(rep(1:4,9)),
                stock1=c(rep(0,24),0,0,2,4,1,0,1,2,3,3,3,5),
                stock2=c(rep(0,24),0,3,3,4,5,0,1,1,2,2,2,5))

and you want to obtain the following result:

df2 <- data.frame(year=c(rep(2010,12),rep(2011,12),rep(2012,12)),
                 country=c(rep("DEU",4),rep("ITA",4),rep("USA",4),
                           rep("DEU",4),rep("ITA",4),rep("USA",4),
                           rep("DEU",4),rep("ITA",4),rep("USA",4)),
                 industry=c(rep(1:4,9)),
                 stock1=c(rep(NA,24),0,0,2,4,1,0,1,2,3,3,3,5),
                 stock2=c(rep(NA,24),0,3,3,4,5,0,1,1,2,2,2,5))

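The idea is that if, in a given year, a given country reports zero for stock2 in every industry, then those zeros should be replaced with NA (not available) in both stock1 and stock2. My attempt is below: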

library(dplyr)
df2 = df %>%
  group_by(country, year, industry) %>%
  mutate(
    stock1 = ifelse(all(stock2 == 0), NA, stock1),
    stock2 = ifelse(all(stock2 == 0), NA, stock2)
  )

Thanks!

r database dataframe dplyr
1 Answer

You could try this approach:

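# Requires dplyr >= 1.1.0 for the .by argument.
df %>%
  mutate(
    # TRUE when every industry in the year/country group has stock2 == 0
    ind = all(stock2 == 0),
    across(stock1:stock2, ~ if_else(ind, NA, .x)),
    .by = c(year, country)
  ) %>%
  select(-ind)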
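
As for why the attempt in the question probably does not give the desired result, there seem to be two things going on. First, `industry` is part of the grouping, so every group is a single row and `all(stock2 == 0)` is true for any individual zero, including the 2012 ones that should stay. Second, `ifelse()` returns a value with the shape of its test, and the test here has length 1, so even with the right grouping it would return only the first value of each group. The following is a minimal sketch of the same idea with `group_by()`, in case `.by` is not available in your dplyr version. The names `all_zero` and `result` are mine, not from the question.

```r
library(dplyr)

# Same sample data as in the question, written more compactly.
df <- data.frame(
  year     = rep(c(2010, 2011, 2012), each = 12),
  country  = rep(rep(c("DEU", "ITA", "USA"), each = 4), times = 3),
  industry = rep(1:4, times = 9),
  stock1   = c(rep(0, 24), 0, 0, 2, 4, 1, 0, 1, 2, 3, 3, 3, 5),
  stock2   = c(rep(0, 24), 0, 3, 3, 4, 5, 0, 1, 1, 2, 2, 2, 5)
)

# Group by country and year only, so all() sees every industry of the group.
result <- df %>%
  group_by(country, year) %>%
  mutate(
    # One flag per group, recycled to the group's rows.
    all_zero = all(stock2 == 0),
    # if_else() is vectorised over the recycled flag, so rows in
    # groups that are not all zero keep their own values.
    stock1 = if_else(all_zero, NA_real_, stock1),
    stock2 = if_else(all_zero, NA_real_, stock2)
  ) %>%
  ungroup() %>%
  select(-all_zero)
```

One way to check the outcome is to compare it with the expected data frame from the question, for example with `all.equal(as.data.frame(result), df2)`.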