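I need SAS code that evaluates column B within each value of column A. For each value of A, keep the row where B = "red"; if none of the rows for that value of A has B = "red", keep a "blue" row instead. For example: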
A B
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
After the code runs:
A B
ABC12 red
ABC13 blue
data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
;
data want;
set have;
by A;
retain keep;
if first.A then keep = 0;
if B = "red" then do;
keep = 1;
output;
end;
else if B = "blue" then keep = keep;
else keep = 0;
if last.A and keep then output;
drop keep;
run;
My output contains ABC12 with red, but no observation for ABC13. Does anyone have any suggestions?
Try this:
data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
ABC14 blue
ABC14 red
ABC14 blue
ABC15 red
ABC15 red
ABC15 green
;
run;
proc print;
run;
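data want;
  length keep $ 5;
  drop keep;
  /* First pass over the BY group: decide which colour to keep. */
  do until(last.A);
    set have;
    by A;
    if B='blue' and keep NE 'red' then keep=B;
    if B='red' then keep=B; /* red always wins over blue */
  end;
  /* Second pass over the same BY group: output the matching row. */
  do until(last.A);
    set have;
    by A;
    if keep=B then
    do;
      output;
      keep = ""; /* add this to get only 1 row if there are several, e.g.: red red green */
    end;
  end;
run;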
proc print;
run;
Also read up on the DoW-loop in SAS, for example here.
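In case the pattern is unfamiliar, here is a minimal sketch of a DoW-loop on its own, assuming the same `have` dataset, already sorted (or at least grouped) by A. The names `group_counts` and `n_rows` are made up for illustration. The point is that the SET statement sits inside a DO UNTIL(last.A) loop, so a single iteration of the DATA step reads a whole BY group.

```sas
/* Minimal DoW-loop sketch: one DATA step iteration per value of A. */
/* group_counts and n_rows are hypothetical names.                  */
data group_counts;
  n_rows = 0;              /* reset at the start of each BY group   */
  do until(last.A);
    set have;
    by A;
    n_rows + 1;            /* count the rows read in this group     */
  end;
  /* The implicit OUTPUT here runs once per group, after the loop.  */
  keep A n_rows;
run;
```

The two-loop version above uses the same idea twice: the first loop summarises the group, and the second loop re-reads the same rows with the summary already known.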
Another approach is to use POINT=:
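data want2;
  set have curobs=curobs; /* curobs holds the current row number in have */
  by A;
  if first.A then keep = 0;
  /* Remember a blue row as a negative row number, unless a red row was already seen. */
  if B='blue' and keep<=0 then keep=-curobs;
  /* Remember a red row as a positive row number. */
  if B='red' then keep= curobs;
  if last.A and keep then
  do;
    keep=abs(keep);
    set have point=keep; /* re-read the remembered row directly */
    output;
  end;
run;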
proc print;
run;
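If you would rather avoid both the DoW-loop and POINT=, a plainer option is to rank the colours and sort. This is only a sketch under the same assumptions as above (values are exactly 'red' and 'blue', lower case); the names `ranked`, `priority` and `want3` are made up for illustration.

```sas
/* Keep only the candidate rows and give red a better rank than blue. */
data ranked;
  set have;
  where B in ('red', 'blue');
  priority = ifn(B = 'red', 1, 2); /* 1 = red, 2 = blue */
run;

/* Within each A, the best-ranked row sorts first. */
proc sort data=ranked;
  by A priority;
run;

/* Keep the first row of each BY group. */
data want3;
  set ranked;
  by A;
  if first.A;
  drop priority;
run;
```

A group that has neither red nor blue produces no row here, which matches the behaviour of the two-loop version. It needs an extra pass and a sort, so on a large dataset the single-step solutions may be preferable.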