Create new variable in R based on certain values 2 years in a row

I am trying to use the UCDP Battle-related Deaths Dataset, called BattleDeaths_v22_1_conf from https://ucdp.uu.se/downloads/ (See under UCDP Battle-Related Deaths Dataset version 23.1)

I want to create a new variable or dataset that only includes countries with at least 1000 battle-related deaths in each of two consecutive years, and only after 2008.
However, I end up with a data frame with no observations.

I used the dataset’s ‘country’ variable (location_inc) and the battle-death variable (bd_best).

So far I have done this in R:

library(dplyr)

filtered_data <- subset(BattleDeaths_v22_1_conf, bd_best >= 1000 & year >= 2008)

filtered_data <- filtered_data %>%
     arrange(location_inc, year) %>%
     group_by(location_inc) %>%
     mutate(sum_deaths_two_years = lag(bd_best) + bd_best)

So far so good.

final_data <- filtered_data %>%
      group_by(location_inc) %>%
      filter(all(sum_deaths_two_years >= 2000))

Now I end up with a data frame with 0 observations. However, I can see in the original dataset that there are observations that fit my criteria.
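
A possible explanation, sketched on made-up toy data (the countries "A", "B", "C" and their numbers are invented for illustration, not taken from UCDP): lag() returns NA for the first row of each group, so all(sum_deaths_two_years >= 2000) evaluates to NA (or FALSE) for every group, and filter() drops rows where the condition is not TRUE. In addition, filtering on bd_best >= 1000 before calling lag() means neighbouring rows are no longer guaranteed to be consecutive years. The sketch below checks both years and the year gap explicitly. It assumes one row per location_inc and year; the helper columns prev_year, prev_deaths and two_in_a_row are hypothetical names introduced here.

```r
library(dplyr)

# Toy stand-in for the UCDP table (invented values, one row per country-year)
toy <- data.frame(
  location_inc = c("A", "A", "A", "B", "B", "C", "C"),
  year         = c(2008, 2009, 2010, 2009, 2011, 2012, 2013),
  bd_best      = c(1500, 1200, 300, 2500, 1800, 900, 1100)
)

flagged <- toy %>%
  filter(year >= 2008) %>%              # restrict the period only, not the deaths
  arrange(location_inc, year) %>%
  group_by(location_inc) %>%
  mutate(
    prev_year    = lag(year),           # NA on the first row of each group
    prev_deaths  = lag(bd_best),
    # TRUE only if this year and the directly preceding year both reach 1000
    two_in_a_row = !is.na(prev_deaths) &
      year - prev_year == 1 &
      bd_best >= 1000 & prev_deaths >= 1000
  ) %>%
  ungroup()

# Keep every row of any country that has at least one qualifying pair of years
final_data <- flagged %>%
  group_by(location_inc) %>%
  filter(any(two_in_a_row)) %>%
  ungroup()
```

Using any() instead of all() keeps a country as soon as one qualifying pair of years exists. If the real data holds several conflicts per country and year, the deaths would probably need to be summed per location_inc and year first, depending on what "1000 deaths in a country" should mean.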

  • Where can we find BattleDeaths_v22_1_conf – is it a dataset that is in R or an R package?

  • It is a dataset:) I downloaded it from ucdp.uu.se/downloads

  • Thanks! Could you edit your question to include the data as reproducible code (i.e., the output of dput(BattleDeaths_v22_1_conf))?

  • The data at your link appears to be 23.1, but your variable name suggests 22.1, not sure if that’ll break things; further, is it the “dyadic” or “conflict” data?

  • @jpsmith Yes, of course – done!
