How do I rank values in a data frame relative to other values from the same year when the data frame spans multiple years?


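Hi everyone, I am having trouble in R ranking countries by average consumption income in the 1950s using Penn World Table data. I have data for all countries across multiple years (1950 to 1959).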

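What I mainly want to do is rank each country's 1950 income relative to other countries in 1950, each country's 1951 income relative to other countries in 1951, and so on. I tried combining the "ifelse" and "percent_rank" functions to rank countries this way, but it does not work properly (it just ranks each country's income against other countries across all years pooled together), so I need help figuring out what I did wrong.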

The R code is below.

Thank you.

install.packages(c("pwt10","dplyr"))  # package names must be passed as one character vector
library(pwt10)
data(pwt10.01)
library(dplyr)

PWT1950 <- subset(pwt10.01, pwt10.01$year %in% c("1950","1951","1952","1953","1954","1955","1956","1957","1958","1959") & i_cig %in% c("benchmark","extrapolated","interpolated") & i_xr=="market")

PWT1950$IncomeRank <- ifelse(PWT1950$year==1950,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1951,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1952,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1953,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1954,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1955,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1956,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1957,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1958,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1959,percent_rank(PWT1950$ccon/PWT1950$pop),
percent_rank(PWT1950$ccon/PWT1950$pop)))))))))))

After entering the commands listed above, I expected this command

subset(PWT1950,year==1959,select=c(IncomeRank,country))
to give me the same result as these commands:

PWT1959 <- subset(pwt10.01,year=="1959" & i_cig %in% c("benchmark","extrapolated","interpolated") & i_xr=="market")
PWT1959$IncomeRank <- percent_rank(PWT1959$ccon/PWT1959$pop)
subset(PWT1959,select=c(IncomeRank,country))

Since the two sets of commands give me different results, I suspect I made a mistake somewhere earlier.

r ranking
1 Answer

Thanks to @GregorThomas, the solution to this problem has been found. The way to solve it is to use

pwt10.01 |>
  filter(year %in% 1950:1959 & i_cig %in% c("benchmark","extrapolated","interpolated") & i_xr=="market") |>
  mutate(IncomeRank = percent_rank(ccon/pop), .by = year)
instead of

PWT1950$IncomeRank <- ifelse(PWT1950$year==1950,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1951,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1952,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1953,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1954,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1955,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1956,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1957,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1958,percent_rank(PWT1950$ccon/PWT1950$pop),
ifelse(PWT1950$year==1959,percent_rank(PWT1950$ccon/PWT1950$pop),
percent_rank(PWT1950$ccon/PWT1950$pop)))))))))))
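
The reason the nested ifelse version pools all years is that every branch calls percent_rank on the full column, so ifelse only picks elements out of the same all-years ranking. A minimal sketch on made-up data (the toy data frame, its three hypothetical countries, and the income column are illustrative assumptions, not Penn World Table values) shows the difference and repeats the single-year check from the question:

```r
library(dplyr)

# Made-up data: three hypothetical countries observed in two years.
toy <- data.frame(
  country = rep(c("A", "B", "C"), times = 2),
  year    = rep(c(1950, 1951), each = 3),
  income  = c(10, 20, 30, 100, 200, 300)
)

# Ungrouped: percent_rank() sees all six rows in both branches,
# so the surrounding ifelse() changes nothing.
toy$pooled <- ifelse(toy$year == 1950,
                     percent_rank(toy$income),
                     percent_rank(toy$income))

# Grouped: .by = year makes percent_rank() run once per year.
ranked <- toy |>
  mutate(within_year = percent_rank(income), .by = year)

print(ranked)

# The check from the question: ranking a single-year subset on its own
# should match the grouped result for that year.
only_1951 <- subset(toy, year == 1951)
all.equal(percent_rank(only_1951$income),
          ranked$within_year[ranked$year == 1951])
```

The .by argument of mutate needs dplyr 1.1.0 or later; on older versions, group_by(year) before mutate followed by ungroup() should give the same per-year ranking.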

To compare nominal incomes, the command is

PWT1950 |> mutate(Nominal_Income_Rank = percent_rank(ccon * pl_con / pl_con[isocode == "USA"] / pop), .by = year)  # .by = year keeps the USA price level a single value within each year
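
When the data spans several years, an expression like pl_con[isocode == "USA"] returns one value per year unless the calculation is grouped, so grouping by year is what keeps the reference price level a single value. A rough sketch on made-up data, where pl and real_cons are hypothetical stand-ins for pl_con and ccon / pop and the isocodes AAA and BBB are invented:

```r
library(dplyr)

# Made-up data: USA plus two invented countries, two years.
toy <- data.frame(
  isocode   = rep(c("USA", "AAA", "BBB"), times = 2),
  year      = rep(c(1950, 1951), each = 3),
  real_cons = c(100, 50, 80, 110, 60, 70),
  pl        = c(0.5, 0.2, 0.4, 1.0, 0.5, 0.9)
)

# Without grouping, the USA lookup yields one price level per year.
ungrouped_ref <- toy$pl[toy$isocode == "USA"]
length(ungrouped_ref)

# With .by = year, the lookup is a single value inside each year.
nominal <- toy |>
  mutate(
    rel_price    = pl / pl[isocode == "USA"],
    nominal_rank = percent_rank(real_cons * rel_price),
    .by = year
  )

print(nominal)
```

This sketch assumes the reference country has exactly one row in every year; if it were missing from a year, the lookup would be empty for that group and the step would need separate handling.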

Thanks to GregorThomas for all the help. I really appreciate it.
