Creating a new variable in R based on values in two consecutive years

Question

I am trying to use the UCDP Battle-Related Deaths Dataset, loaded as BattleDeaths_v22_1_conf, from https://ucdp.uu.se/downloads/ (see UCDP Battle-Related Deaths Dataset version 23.1).

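I want to create a new variable or dataset that contains only the countries with at least 1000 battle deaths in each of two consecutive years, and only for years after 2008. However, I end up with an object that has no observations.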

I used the dataset's "country" variable (location_id; the code below groups by location_inc) and the battle-deaths variable (bd_best).

This is what I have done in R so far:

library(dplyr)

filtered_data <- subset(BattleDeaths_v22_1_conf, bd_best >= 1000 & year >= 2008)

filtered_data <- filtered_data %>%
     arrange(location_inc, year) %>%
     group_by(location_inc) %>%
     mutate(sum_deaths_two_years = lag(bd_best) + bd_best)

So far, so good.

final_data <- filtered_data %>%
      group_by(location_inc) %>%
      filter(all(sum_deaths_two_years >= 2000))

Now I get an object with 0 observations. However, I can see in the original dataset that some observations do meet my criteria.

Answer 1

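Try this. It keeps a row only if that row has at least 1000 deaths and the row before or after it, within the same country, does too: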

library(dplyr)

# Data ------------------------------
example_df <- tibble::tribble(
  ~location_inc,  ~year, ~bd_best,
  "Iraq",  2009L,    1036L,
  "Iraq",  2010L,     989L,
  "Iraq",  2011L,     864L,
  "Iraq",  2012L,     565L,
  "Iraq",  2013L,    1870L, # Desired
  "Iraq",  2014L,   13761L, # Desired
  "Iraq",  2015L,   10981L, # Desired
  "Iraq",  2016L,    9775L, # Desired
  "Iraq",  2017L,   10025L, # Desired
  "Iraq",  2018L,     866L,
  "Iraq",  2019L,     498L,
  "Iraq",  2020L,     671L,
  "Iraq",  2021L,     707L,
  "Iraq",  2022L,     335L,
  "Sudan", 2009L,     353L,
  "Sudan", 2010L,    1010L, # Desired
  "Sudan", 2011L,    1404L, # Desired
  "Sudan", 2012L,    1173L, # Desired
  "Sudan", 2013L,     594L,
  "Sudan", 2014L,     856L,
  "Sudan", 2015L,    1264L, # Desired
  "Sudan", 2016L,    1309L, # Desired
  "Sudan", 2017L,     160L,
  "Sudan", 2018L,     243L,
  "Sudan", 2020L,      45L,
  "Sudan", 2021L,      31L,
  "Sudan", 2022L,      47L)

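# Code ------------------------------
# Rows must be sorted by year within each country for lag()/lead() to
# compare neighbouring years. The .by argument needs dplyr >= 1.1.0.
example_df <- filter(
  example_df,
  .by = location_inc,
  bd_best >= 1000,
  lag(bd_best, default = -1) >= 1000 | lead(bd_best, default = -1) >= 1000)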

# Outcome ---------------------------
example_df

# A tibble: 10 × 3
   location_inc  year bd_best
   <chr>        <int>   <int>
 1 Iraq          2013    1870
 2 Iraq          2014   13761
 3 Iraq          2015   10981
 4 Iraq          2016    9775
 5 Iraq          2017   10025
 6 Sudan         2010    1010
 7 Sudan         2011    1404
 8 Sudan         2012    1173
 9 Sudan         2015    1264
10 Sudan         2016    1309
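
As for why the original attempt came back empty, a likely explanation is the NA that lag() puts in the first row of every group: all() then returns NA (or FALSE), and filter() drops rows whose condition is NA, so every group is discarded. Pre-filtering on bd_best >= 1000 also means lag() can pair years that are not adjacent (for Iraq above, 2013 with 2009). Here is a minimal base R sketch of the first point, using a few values from the example and c(NA, head(x, -1)) as a stand-in for dplyr's lag():

```r
# Deaths for one country, already filtered to years with >= 1000 deaths
bd_best <- c(1870, 13761, 10981)

# Stand-in for dplyr::lag(): shift by one and pad the first position with NA
previous <- c(NA, head(bd_best, -1))
sum_deaths_two_years <- previous + bd_best   # NA 15631 24742

# all() is NA when every non-missing comparison is TRUE but one value is NA
keep_group <- all(sum_deaths_two_years >= 2000)
keep_group   # NA, which filter() treats like FALSE

# Ignoring the missing value gives the answer that was probably intended
keep_group_na_rm <- all(sum_deaths_two_years >= 2000, na.rm = TRUE)
keep_group_na_rm   # TRUE
```

Adding na.rm = TRUE would stop the groups from vanishing, but it would still keep or drop whole countries rather than individual years, which is why the row-wise lag()/lead() test above may be the better fit.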

Source: https://ucdp.uu.se/downloads/brd/ucdp-brd-dyadic-231-xlsx.zip
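
One caveat: lag() and lead() look at neighbouring rows, not neighbouring years, so a missing year (Sudan has no 2019 row in the example) could in principle make two non-adjacent years look consecutive. Below is a sketch of a stricter variant that also checks the year column. The toy data and the names toy and consecutive are made up for illustration, and it uses group_by() instead of .by so it should also run on older dplyr versions:

```r
library(dplyr)

# Hypothetical data: country "A" has no row for 2012
toy <- tibble::tribble(
  ~location_inc, ~year, ~bd_best,
  "A", 2011L, 1200L,
  "A", 2013L, 1300L,  # next row after 2011, but not the next year
  "A", 2014L,  200L,
  "B", 2010L, 1100L,
  "B", 2011L, 1050L,
  "B", 2012L,  300L
)

consecutive <- toy %>%
  arrange(location_inc, year) %>%
  group_by(location_inc) %>%
  filter(
    bd_best >= 1000,
    # previous row is high AND is exactly one year earlier ...
    (lag(bd_best, default = -1L) >= 1000 & lag(year, default = -1L) == year - 1L) |
      # ... or next row is high AND is exactly one year later
      (lead(bd_best, default = -1L) >= 1000 & lead(year, default = -1L) == year + 1L)
  ) %>%
  ungroup()

consecutive
```

With this toy data only country B's 2010 and 2011 rows remain; the purely row-based test would also have kept A's 2011 and 2013 rows.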
