I have a dataset containing a list of individuals and, for each of several years, whether they participated in a program.
ID | kl2_2016 | kl2_2017 |
---|---|---|
ajones | 1 | 0 |
bsmith | 0 | 1 |
...and so on
I would like it to look like this:
Obs | Year | Total_Service |
---|---|---|
1 | 2016 | 20 |
2 | 2017 | 71 |
I am using this code:
proc sql;
create table year_service as
select
distinct
sum(kl2_2016) as kl2_2016,
sum(kl2_2017) as kl2_2017,
sum(kl2_2018) as kl2_2018,
sum(kl2_2019) as kl2_2019,
sum(kl2_2020) as kl2_2020,
sum(kl2_2021) as kl2_2021,
sum(kl2_2022) as kl2_2022,
sum(kl2_2023) as kl2_2023,
sum(kl2_2024) as kl2_2024,
sum(kl2_2025) as kl2_2025
from have
;
quit;
proc transpose data=year_service
out=year_service_long;
var kl2_2016-kl2_2025;
run;
But this does not give me what I want. Can someone help me?
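It is much easier to let PROC TRANSPOSE handle the columns for you, and that approach also copes more gracefully with changes in the input data (for example, new year columns). I would approach it like this: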
/* Example input. */
data have;
ID = "ajones"; kl2_2016 = 1; kl2_2017 = 0; output;
ID = "bsmith"; kl2_2016 = 0; kl2_2017 = 1; output;
run;
proc sort data=have; by id; run;
/* Put the different years (values of KL2_:) on separate records. */
proc transpose data=have out=long;
by id;
var kl2_:;
run;
/* Summarize in whatever way you want. */
/* _NAME_ holds the original column name (e.g. kl2_2016); the part after the underscore is the year. */
proc sql;
    create table want as
    select input(scan(_name_, 2, "_"), best.) as year,
           sum(col1) as total_service
    from long
    group by 1;
quit;
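If you would rather keep the sum-first order from the question, the same result should be reachable by totalling the columns and then transposing the single row of totals. This is only a minimal sketch: it assumes every KL2_ column is a numeric 0/1 flag, and the dataset names totals, totals_long, and want_alt are made up for illustration.

```sas
/* Total each KL2_ column into a one-row dataset. */
proc means data=have noprint;
    var kl2_:;
    output out=totals(drop=_type_ _freq_) sum=;
run;

/* Turn the one row of totals into one row per year column. */
proc transpose data=totals out=totals_long(rename=(col1=total_service));
    var kl2_:;
run;

/* Pull the year out of the original column name (e.g. kl2_2016 -> 2016). */
data want_alt;
    set totals_long;
    year = input(scan(_name_, 2, "_"), 4.);
    drop _name_;
run;
```

Because both steps use the kl2_: prefix list, nothing needs to be edited when another year column is added.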