geom_bar: how do I show only the highest-frequency values of x?


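I'm working with a data frame of state-sponsored cyber attacks (so my three main variables are date, sponsor, and victim). I want to create a geom_bar plot in which, for each year, the top five victims of cyber attacks appear.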

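I'm not sure how to make a reproducible example for this. I made a version in which the overall top 5 victims appear, but it doesn't reflect how the targets change over the years.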

cyber%>%
  filter(Sponsor_sep == "China" & 
         Victims_sep %in% c("United States", "China", "Japan", "South Korea", "India"))%>%
  ggplot() + 
  geom_bar(mapping = aes(x = Year, fill = Victims_sep))
r geom-bar geom
1 Answer

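OK, I found this to be a fun brain teaser, so I gave it a try. I first created some data to work with, but because the data are randomly sampled, the resulting plot is not very interesting. Still, even though the code is clunky, it appears to work.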

library(tidyverse)

# Create the data and add some extra countries so the output varies
cyber <- tibble(Year=sample(seq(2005,2022,1),50000,replace = T),
                Victims_sep=sample(c("United States", "China", "Japan", "South Korea", "India",
                                     'England','Spain','Vietnam','Canada','France','Bangladesh','Taiwan','Morocco'),
                                   50000,
                                   replace = T))

# Original plot from OP but with more countries 
cyber %>% 
ggplot() + 
  geom_bar(mapping = aes(x = Year, fill = Victims_sep))

# New plot
cyber %>% 
  group_by(Year,Victims_sep) %>% 
  summarise(n=n()) %>% # get the number of attacks in each year for each country
  ungroup() %>% 
  group_by(Year) %>% 
  # get the attack counts for the most-attacked through fifth most-attacked country in each year
  mutate(max_victim=max(n), 
         len=length(n),
         second=sort(n,partial=len-1)[len-1],
         third=sort(n,partial=len-2)[len-2],
         fourth=sort(n,partial=len-3)[len-3],
         fifth=sort(n,partial=len-4)[len-4]) %>% 
  rowwise() %>% 
  mutate(top5=ifelse(n %in% max_victim:fifth,1,0)) %>% # flag counts from the largest down to the fifth largest
  filter(top5==1) %>%  # keep only the flagged rows
  ggplot() + 
  geom_col(mapping = aes(x = Year,y=n, fill = Victims_sep)) # use geom_col so bar heights come from n
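
For comparison, the same "top five per year" selection can probably be written more compactly with `count()` and `slice_max()` from dplyr (version 1.0.0 or later). This is only a sketch on simulated data shaped like the example above. The seed, the `countries` vector, and the names `top5_by_year` and `p` are assumptions introduced for illustration. Note that `with_ties = FALSE` keeps exactly five rows per year, whereas the code above keeps every country tied with the fifth-largest count.

```r
library(dplyr)
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the simulated data are reproducible

# Simulated stand-in for the real data (hypothetical, same shape as above)
countries <- c("United States", "China", "Japan", "South Korea", "India",
               "England", "Spain", "Vietnam", "Canada", "France",
               "Bangladesh", "Taiwan", "Morocco")
cyber <- tibble(
  Year        = sample(2005:2022, 50000, replace = TRUE),
  Victims_sep = sample(countries, 50000, replace = TRUE)
)

top5_by_year <- cyber %>%
  count(Year, Victims_sep) %>%   # attacks per year and victim, stored in column n
  group_by(Year) %>%
  # keep the five largest counts within each year, dropping ties beyond five
  slice_max(order_by = n, n = 5, with_ties = FALSE) %>%
  ungroup()

# geom_col draws bar heights from the precomputed counts in n
p <- ggplot(top5_by_year) +
  geom_col(aes(x = Year, y = n, fill = Victims_sep))
print(p)
```

With the data from the question, adding `filter(Sponsor_sep == "China")` before `count()` should restrict the plot to attacks attributed to one sponsor while still letting the top five victims differ from year to year.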