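The STRATUM values in the OECD data are very long. For simplicity I have used shortened names here, and I would like to recode them into shorter, more precise labels, as in the code below.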
# Nested ifelse(): rows matching none of the labels get the fallback 0
pisaMas[, SchoolType :=
  ifelse(STRATUM == "National Secondary School", "Public",
  ifelse(STRATUM == "Religious School", "Religious",
  ifelse(STRATUM == "MOE Technical School", "Technical", 0)))]
pisaMas[, table(SchoolType)]
I would like to know whether there is a simpler way to do this with the data.table package.
For this case, the development version of data.table (at the time of writing) has a new function, fcase, modeled on SQL's CASE WHEN:
pisaMas[ , SchoolType := fcase(
  STRATUM == "National Secondary School", "Public",
  STRATUM == "Religious School", "Religious",
  STRATUM == "MOE Technical School", "Technical",
  default = ''
)]
pisaMas[ , table(SchoolType)]
To get the development version, try
install.packages(
  'data.table', type = 'source',
  repos = 'http://Rdatatable.github.io/data.table'
)
If the simple install does not work, see the Installation wiki for more details:
https://github.com/Rdatatable/data.table/wiki/Installation
You could also solve this with a lookup table; see the related Q&A on that approach for details.
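For what it's worth, here is a minimal sketch of the lookup-table idea using an update join. The toy `pisaMas` table and the `lookup` table are assumptions made up for illustration (they are not the real PISA data); the join syntax itself is standard data.table.

```r
library(data.table)

# Toy stand-in for the real data (values invented for illustration)
pisaMas <- data.table(
  STRATUM = c("National Secondary School", "Religious School",
              "MOE Technical School", "Some Other School")
)

# Hypothetical lookup table: one row per original label
lookup <- data.table(
  STRATUM    = c("National Secondary School", "Religious School",
                 "MOE Technical School"),
  SchoolType = c("Public", "Religious", "Technical")
)

# Update join: copy SchoolType from lookup into pisaMas by reference.
# Rows with no match in lookup are left as NA.
pisaMas[lookup, on = "STRATUM", SchoolType := i.SchoolType]

pisaMas[, table(SchoolType, useNA = "ifany")]
```

The advantage over a chain of conditions is that adding a new category only means adding a row to `lookup`, not editing the recoding code.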
This is what I came up with after some thought.
#' First I create a function (rname.SchType) that maps each old name to its new name using else if:
rname.SchType <- function(x) {
  # Takes a single STRATUM label and returns its short school-type name
  if (is.na(x)) NA_character_
  else if (x == "MYS - stratum 01: MOE National Secondary School\\Other States") "Public"
  else if (x == "MYS - stratum 02: MOE Religious School\\Other States") "Religious"
  else if (x == "MYS - stratum 03: MOE Technical School\\Other States") "Technical"
  else if (x == "MYS - stratum 04: MOE Fully Residential School") "SBP"
  else if (x == "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States") "MARA"
  else if (x == "MYS - stratum 06: non-MOE Other Schools\\Other States") "Private"
  else if (x == "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”") "Perlis Fully Residential"
  else if (x == "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”") "Putrajaya Fully Residential"
  else if (x == "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”") "Labuan Fully Residential"
  else NA_character_  # unmatched labels give NA rather than NULL, so sapply() still returns a vector
}
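Using the function I just created, I apply it with base R's sapply inside data.table. That adds the new column in a single line of code, which avoids cluttered code and looks much simpler: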
pisaMalaysia[, jenisSekolah := sapply(STRATUM, rname.SchType)]
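Since sapply calls the function once per row, a vectorized variant may be worth considering on larger data. The sketch below uses a named character vector as the mapping; `school_map` is a hypothetical name, only three of the strata are listed for brevity, and the toy `pisaMalaysia` table is invented for illustration.

```r
library(data.table)

# Hypothetical mapping: names are the original labels, values the short names
school_map <- c(
  "MYS - stratum 01: MOE National Secondary School\\Other States" = "Public",
  "MYS - stratum 02: MOE Religious School\\Other States"          = "Religious",
  "MYS - stratum 04: MOE Fully Residential School"                = "SBP"
)

# Toy stand-in for the real data, including an NA and an unmapped label
pisaMalaysia <- data.table(
  STRATUM = c(names(school_map)[c(3, 1)], NA, "something else")
)

# Index the mapping by name in one vectorized step;
# NA and unmapped labels both come back as NA
pisaMalaysia[, jenisSekolah := unname(school_map[STRATUM])]
```

This keeps the one-line assignment from the answer above while moving the old-to-new pairs into data rather than control flow.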