Delete an ID when none of its groups contains certain values in SAS

Question

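Please see the table below. An ID is considered complete if at least one of its groups contains Day 1 through Day 3 (duplicates are allowed). I need to remove every ID that has no group with a complete Day 1 through Day 3.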

ID   Group     Day
1     A        1
1     A        1
1     A        2
1     A        3
1     B        1
1     B        2
2     A        1
2     A        3
2     B        2

Expected result

ID   Group     Day
1     A        1
1     A        1
1     A        2
1     A        3
1     B        1
1     B        2

Following this reference, "Delete the group that none of its observation contain the certain value in SAS", I tried the code below, but it does not remove ID 2.

PROC SQL;
CREATE TABLE TEMP AS SELECT
* FROM HAVE
GROUP BY ID
HAVING MIN(DAY)=1 AND MAX(DAY)=3
;QUIT;

PROC SQL;
CREATE TABLE TEMP1 AS SELECT
* FROM TEMP WHERE ID IN
(SELECT ID FROM TEMP
WHERE DAY=2)
;QUIT;

Answer 1

You can filter the dataset against a removal list. For example:

proc sql noprint;
    create table want as
        select *
        from have
        where cats(group, id) NOT IN(select cats(group, id) from removal_list)
    ;
quit;

Creating the removal list

This approach builds the removal list without a Cartesian product over all IDs, groups, and days.

Assume your data is sorted by ID, group, and day.
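If the data is not already in that order, a sort step along these lines should establish it. This is a minimal sketch that assumes the dataset is named HAVE, as in the question, and that sorting it in place is acceptable:

```sas
/* Sort HAVE in place so that BY-group processing (first.group)
   sees each (ID, Group) as a contiguous block ordered by Day. */
proc sort data=have;
    by id group day;
run;
```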

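For each ID, the first day in a group must be 1
For each ID, every later day in the group must be at most 1 greater than the previous day (duplicates are allowed)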

Code:

data removal_list;
    set have;
    by ID Group Day;
    retain flag_remove_group;

    lag_day = lag(day);

    /* Reset flag_remove_group at the start of each (ID, Group).
       Check if the first day is > 1. If it is, set the removal flag.
    */
    if(first.group) then do;
        call missing(lag_day);

        if(day > 1) then flag_remove_group = 1;
            else flag_remove_group = 0;
    end;

    /* If it's not the first observation of the (ID, Group),
       check whether any days are skipped between observations
    */
    if(NOT first.group AND (day - lag_day) > 1) then flag_remove_group = 1;

    if(flag_remove_group) then output;

    keep id group;
run;

Answer 2

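So you want to find the set of ID values where the ID has at least one GROUP that contains all three DAY values. Find that list of IDs in a subquery and use it to subset the original data.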

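The key in the subquery is that you want 3 distinct values of DAY. If your data can contain other values of DAY (such as missing or 4), use a WHERE clause to keep only the values you want to count.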

proc sql;
create table want as
  select * from have
  where id in 
   (select id from have 
    where day in (1,2,3)
    group by id,group
    having count(distinct day)=3
   )
;
quit;
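To try the query above, the question's table can be loaded with a short data step. This is only a sketch: it assumes Group is a character variable and that ID and Day are numeric, which the question does not state.

```sas
/* Sample data copied from the table in the question.
   Assumption: Group is character; ID and Day are numeric. */
data have;
    input ID Group $ Day;
    datalines;
1 A 1
1 A 1
1 A 2
1 A 3
1 B 1
1 B 2
2 A 1
2 A 3
2 B 2
;
run;
```

With this input, only ID 1 has a group (A) with three distinct DAY values, so the subquery returns ID 1 and WANT keeps the six ID 1 rows, matching the expected result.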
