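Using SAS, keep the priority value of column B for each value of column A (deleting duplicate rows), without deleting groups whose column B only has other values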


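I need SAS code that checks column B within each value of column A and keeps the row where column B = "red". If none of the rows sharing the same column A value has "red" in column B, it should keep "blue" instead. For example: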

A       B
ABC12   red
ABC12   blue
ABC12   green
ABC13   green
ABC13   blue
ABC13   blue

After the code runs:

A       B
ABC12   red
ABC13   blue

data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
;


data want;
set have;
by A;
retain keep;
if first.A then keep = 0;
if B = "red" then do;
    keep = 1;
    output;
end;
else if B = "blue" then keep = keep;
else keep = 0;
if last.A and keep then output;
drop keep;
run;

My output contains ABC12 with red, but no observation for ABC13. Does anyone have a suggestion?

if-statement sas retain
1 answer

Try this:

data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
ABC14 blue
ABC14 red
ABC14 blue
ABC15 red
ABC15 red
ABC15 green
;
run;
proc print;
run;

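data want;
  length keep $ 5;   /* holds the winning value of B for the current group */
  drop keep;

  /* pass 1: scan the whole BY group and remember the best value (red beats blue) */
  do until(last.A);
    set have;
    by A;
    
    if B='blue' and keep NE 'red' then keep=B;
    if B='red'                    then keep=B;
  end;

  /* pass 2: re-read the same BY group and output the first row that matches keep */
  do until(last.A);
    set have;
    by A;
    if keep=B then 
      do;
        output; 
        keep = ""; /* blank out keep so only 1 row is output when the value repeats, e.g.: red red green */
      end;
  end;
run;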
proc print;
run;
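
As for why the step in the question returns only ABC12: keep is set to 1 only on a red row, and the blue branch (keep = keep) leaves it unchanged, so a group without red can never reach last.A with a nonzero keep. For ABC12 the red row is written by the explicit output, and the later green row resets keep to 0. Below is a minimal single-pass sketch that stays close to the original idea. It assumes only A and B matter, because it rebuilds B on the last row of the group instead of re-reading the matching row. The names has_red, has_blue and want_flags are made up for this illustration.

```sas
data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
;

data want_flags;
  set have;
  by A;
  retain has_red has_blue;

  /* reset both flags at the start of every group */
  if first.A then
    do;
      has_red  = 0;
      has_blue = 0;
    end;

  /* remember which priority values were seen in this group */
  if B = 'red' then has_red = 1;
  else if B = 'blue' then has_blue = 1;

  /* decide once, on the last row of the group: red beats blue */
  if last.A and (has_red or has_blue) then
    do;
      if has_red then B = 'red';
      else B = 'blue';
      output;
    end;

  drop has_red has_blue;
run;
```

With the sample data from the question this should give ABC12 red and ABC13 blue. If the data set carries other columns that must come from the matching row, the two-pass version is the safer choice.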

Also read up on the DoW-loop in SAS, e.g. here.
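
If the idiom is new to you: a DoW-loop puts the SET and BY statements inside an explicit DO UNTIL(last.<by-variable>) loop, so one iteration of the DATA step consumes a whole BY group instead of a single row. Here is a minimal sketch that just counts the rows per group, assuming the input is already sorted by A. The data set group_counts and the variable n_rows are hypothetical names used only here.

```sas
data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
;

data group_counts;
  n_rows = 0;               /* runs once per BY group, before the loop */

  do until(last.A);         /* the loop reads every row of the group   */
    set have;
    by A;
    n_rows = n_rows + 1;
  end;

  /* the implicit OUTPUT at the end of the step writes one row per group */
  keep A n_rows;
run;
```

Code placed before the loop runs once per group before any of its rows are read, and code placed after the loop runs once per group after all of its rows have been read. That is what lets the two-loop version decide on a value first and then pick the matching row.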


Another approach is to use POINT=:

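data want2;
  set have curobs=curobs;    /* curobs = number of the observation just read */
  by A;

  if first.A then keep = 0;  /* 0 = no candidate row in this group yet */

  /* negative = row number of a blue candidate, positive = row number of a red one */
  if B='blue' and keep<=0 then keep=-curobs;
  if B='red'              then keep= curobs;

  /* at the end of the group, re-read the chosen row directly and output it */
  if last.A and keep then 
    do;
      keep=abs(keep);
      set have point=keep;
      output;
    end;
run;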
proc print;
run;
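
If you would rather avoid both the double read and POINT=, a more conventional route is to turn the priority into a sort key and keep the first row per group. This is only a sketch under the same assumptions as above (red beats blue, everything else is ignored). The names prio, ranked and want3 are invented for this example.

```sas
data have;
input A $ B $;
datalines;
ABC12 red
ABC12 blue
ABC12 green
ABC13 green
ABC13 blue
ABC13 blue
;

/* assign a numeric priority and drop the values that never qualify */
data ranked;
  set have;
  if B = 'red' then prio = 1;
  else if B = 'blue' then prio = 2;
  else delete;
run;

/* the best candidate of each group sorts to the top */
proc sort data=ranked;
  by A prio;
run;

/* keep only the first (highest-priority) row per group */
data want3;
  set ranked;
  by A;
  if first.A;
  drop prio;
run;
```

Groups that contain neither red nor blue produce no row here, which matches the two-pass version. Adding another colour to the ranking only takes one more ELSE IF branch.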