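I downloaded highly aggregated data from the UN Comtrade database: https://comtradeplus.un.org/TradeFlow
The ns_eu_category variable is my own division of the world into the following regions:
"East Asia and Pacific", "Global North", "Latin America and Caribbean", "Middle East and North Africa", "Non-EU Former Soviet Bloc Countries", "South Asia", and "Sub-Saharan Africa". I don't think this is the source of the problem, so we can ignore the exact division for now.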
> longterm_trade_data
# A tibble: 7,364 × 6
ns_eu_category year sitc_code import_or_export value sector
<chr> <dbl> <chr> <chr> <dbl> <chr>
1 East Asia and Pacific 1962 0 Export 946694358 Food And Live Animals
2 East Asia and Pacific 1962 0 Import 745286120 Food And Live Animals
3 East Asia and Pacific 1962 1 Export 60846922 Beverages And Tobacco
4 East Asia and Pacific 1962 1 Import 67321814 Beverages And Tobacco
5 East Asia and Pacific 1962 2 Export 1479804622 Crude Materials, Inedible, Except Fuels
6 East Asia and Pacific 1962 2 Import 640428682 Crude Materials, Inedible, Except Fuels
7 East Asia and Pacific 1962 3 Export 482623764 Mineral Fuels, Lubric. And Related Mtrls
8 East Asia and Pacific 1962 3 Import 416870707 Mineral Fuels, Lubric. And Related Mtrls
9 East Asia and Pacific 1962 4 Export 66775599 Animal And Vegetable Oils,Fats And Waxes
10 East Asia and Pacific 1962 4 Import 42687574 Animal And Vegetable Oils,Fats And Waxes
# ℹ 7,354 more rows
# ℹ Use `print(n = ...)` to see more rows
I converted these aggregate figures into percentages so that I could put them in an area chart:
trade_data_sector <- longterm_trade_data %>%
group_by(ns_eu_category, year, import_or_export) %>%
mutate(total_of_sectors = sum(value)) %>%
ungroup() %>%
drop_na() %>%
mutate(percent = value / total_of_sectors)
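As a sanity check on this step, the shares in each region/year/flow group should add up to 1, at least up to floating-point rounding. The sketch below runs that check on a small toy table; the values and the names toy, toy_shares and share_check are hypothetical, and the column layout is only assumed to mirror longterm_trade_data.

```r
library(dplyr)

# Toy data (hypothetical values) with the same columns as longterm_trade_data
toy <- tibble(
  ns_eu_category   = "Region A",
  year             = rep(c(1962, 1963), each = 3),
  sitc_code        = rep(c("0", "1", "2"), times = 2),
  import_or_export = "Export",
  value            = c(10, 20, 70, 1, 3, 6)
)

# Same share calculation as above: each value divided by its group total
toy_shares <- toy %>%
  group_by(ns_eu_category, year, import_or_export) %>%
  mutate(percent = value / sum(value)) %>%
  ungroup()

# Sum the shares per group and compare against 1 with a small tolerance,
# since floating-point sums are not guaranteed to be exactly 1
share_check <- toy_shares %>%
  group_by(ns_eu_category, year, import_or_export) %>%
  summarise(share_sum = sum(percent), .groups = "drop") %>%
  mutate(ok = abs(share_sum - 1) < 1e-9)

print(share_check)
```

Comparing with a tolerance rather than == 1 matters here, because a sum of shares can differ from 1 in the last few bits.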
I then tried to make the area chart:
# "East Asia and Pacific" "Global North"
# "Latin America and Caribbean" "Middle East and North Africa" "Non-EU Former Soviet Bloc Countries"
# "South Asia" "Sub-Saharan Africa"
region <- "Sub-Saharan Africa"
ix <- "Export"
trade_data_sector %>%
mutate(truncated_name = sector %>% substr(0L, 10L),
descriptor = paste0(sitc_code, ": ", truncated_name)) %>%
filter(ns_eu_category == region, import_or_export == ix) %>%
ggplot(aes(x = year, y = percent, fill = descriptor)) +
geom_area() +
theme_minimal() +
labs(title = paste0(ix, "s in ", region, " Over Time"),
caption = "Source: UN COMTRADE Database 1962-2023") +
scale_y_continuous(breaks = seq(from = 0, to = 1, by = 0.1), labels = scales::percent, limits = c(0, 1)) +
scale_x_discrete(limits = 1962:2023, expand = c(0,0)) +
theme(
# panel.grid.major.y = element_line(color = "dark gray", linewidth = 0.1, linetype = "dashed"),
# panel.grid.major.x = element_blank(),
axis.ticks.x=element_line(linewidth=0.2),
axis.text.x = element_text(size = 6, family=my_font, angle=-90, vjust=0.5),
axis.title.x = element_text(size = 8, family=my_font),
axis.text.y=element_text(size = 6, family=my_font),
# axis.ticks.y=element_line(),
axis.title.y = element_text(size = 8, family=my_font),
panel.grid = element_blank(),
legend.position="bottom",
plot.title = element_text(size = 12, family=my_font),
plot.subtitle = element_text(size = 10, family=my_font),
legend.title = element_text( size=8, family=my_font),
legend.text = element_text( size=8, family=my_font),
strip.text = element_text(size=8, family=my_font),
legend.key.size = unit(0.3, "cm"),
plot.caption = element_text(size = 7, color="dark gray", family=my_font)
)
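This is the result:
Note: the data for 2010-2022 is currently missing, so that part of the chart can be ignored.
Not only does it look more jagged than it should; there are whole stretches where SITC code 0 (Food And Live Animals) simply disappears. But as the next plot shows, this amount is never zero: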
sector_code <- "0"
trade_data_sector %>%
filter(ns_eu_category == region, import_or_export == ix, sitc_code == sector_code) %>%
ggplot(aes(x = year, y = value)) +
geom_line() +
theme_minimal() +
labs(title = paste0(ix, "s in ", region, " Over Time (Sector ", sector_code, ")"),
caption = "Source: UN COMTRADE Database 1962-2023") +
scale_x_discrete(limits = 1962:2022) +
# scale_y_continuous(breaks = seq(from = 0, to = 600, by = 100), limits=c(0,700)) +
theme(
panel.grid.major.y = element_line(color = "dark gray", linewidth = 0.1, linetype = "dashed"),
# panel.grid.major.x = element_blank(),
# axis.ticks.x=element_blank(),
axis.text.x = element_text(size = 6, family=my_font, angle=-90, vjust=0.5),
axis.title.x = element_text(size = 8, family=my_font),
axis.text.y=element_text(size = 6, family=my_font),
# axis.ticks.y=element_line(),
axis.title.y = element_text(size = 8, family=my_font),
panel.grid = element_blank(),
legend.position="bottom",
plot.title = element_text(size = 10, family=my_font),
plot.subtitle = element_text(size = 8, family=my_font),
legend.title = element_text( size=8, family=my_font),
legend.text = element_text( size=8, family=my_font),
strip.text = element_text(size=8, family=my_font),
legend.key.size = unit(0.3, "cm"),
plot.caption = element_text(size = 7, color="dark gray", family=my_font)
)
What could be causing this? It does not only happen for Sub-Saharan Africa; the same gaps appear for other regions of the world.
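The problem is the result of setting limits = c(0, 1) inside scale_y_continuous. After removing the limits, the chart looks normal. A likely explanation: by default, a continuous scale replaces values outside its limits with NA, and after stacking, the top edge of the uppermost band can land a hair above 1 through floating-point rounding, so that band is dropped for those years.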
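If the y axis should still run from 0% to 100%, one option is to zoom with coord_cartesian(ylim = ...) instead of setting scale limits, since coord_cartesian changes only the visible range and does not discard data. The sketch below uses a toy data frame with hypothetical values and labels; toy and p_fixed are illustrative names, not part of the original code.

```r
library(ggplot2)

# Toy shares (hypothetical values); the two shares in each year sum to 1
toy <- data.frame(
  year       = rep(c(1962, 1963, 1964), each = 2),
  descriptor = rep(c("0: Food And L", "1: Beverages"), times = 3),
  percent    = c(0.6, 0.4, 0.5, 0.5, 0.7, 0.3)
)

p_fixed <- ggplot(toy, aes(x = year, y = percent, fill = descriptor)) +
  geom_area() +
  # No `limits` argument: scale limits turn out-of-range values into NA
  scale_y_continuous(breaks = seq(from = 0, to = 1, by = 0.1),
                     labels = scales::percent) +
  # Zoom the view to 0-100% without removing any data
  coord_cartesian(ylim = c(0, 1)) +
  theme_minimal() +
  theme(legend.position = "bottom")

p_fixed
```

Another option, if the limits have to stay on the scale, is to pass oob = scales::oob_keep to scale_y_continuous so that out-of-range values are kept rather than censored.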