Transposing multiple columns in SAS

Question

I have a dataset that looks like this:

Account Number  6m      7m      8m      9m      10m     11m
1               Better  X < 10  X < 10  Better  X < 30  X < 30
2               X < 10  X < 20  X < 30  X < 20  X < 20  X < 20
3               Better  Better  Better  Better  X < 10  X < 20
4               X < 10  Better  Same    Same    Same    Same
5               Same    Better  Same    Same    Same    Same
6               Same    Same    Same    Better  Better  Better
7               Same    X < 10  X < 10  X < 10  X < 10  Better
8               Better  Better  Better  Better  Better  Better
9               X < 10  X < 10  X < 10  X < 20  X < 30  Better
10              X < 20  X < 30  X < 30  X < 30  X < 30  X < 30

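Each cell tells me what happened to each account number after 6 to 11 months. I want to turn this into a dataset I can build graphs and so on from, so I would like to transpose it to look like this: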

Result  6m  7m  8m  9m  10m 11m
X < 10  3   3   3   1   2   0
X < 20  1   1   0   2   1   2
X < 30  0   1   2   1   3   2
Same    3   1   3   2   2   2
Better  3   4   2   4   2   4

It would be even better if there were a way to convert the counts in each column to percentages.

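/* Names such as "6m"n are only valid when VALIDVARNAME=ANY is in effect */
options validvarname=any;

data have;
    infile datalines dlm='|';
    input "Account Number"n "6m"n $ "7m"n $ "8m"n $ "9m"n $ "10m"n $ "11m"n $;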
    datalines;
1|Better|X < 10|X < 10|Better|X < 30|X < 30
2|X < 10|X < 20|X < 30|X < 20|X < 20|X < 20
3|Better|Better|Better|Better|X < 10|X < 20
4|X < 10|Better|Same|Same|Same|Same
5|Same|Better|Same|Same|Same|Same
6|Same|Same|Same|Better|Better|Better
7|Same|X < 10|X < 10|X < 10|X < 10|Better
8|Better|Better|Better|Better|Better|Better
9|X < 10|X < 10|X < 10|X < 20|X < 30|Better
10|X < 20|X < 30|X < 30|X < 30|X < 30|X < 30
;
run;
Tags: datatable, sas, dataset, transpose, frequency

1 answer

First, stack the data so that we can do some counting:

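data stack;
    set have;

    /* Only the six month columns are character at this point,
       because result and var are not created until the loop below */
    array charvars[*] _CHARACTER_;

    /* Write one row per account per month column */
    do i = 1 to dim(charvars);
        result = charvars[i];          /* the cell value, e.g. 'Better' */
        var    = vname(charvars[i]);   /* the source column name, e.g. '6m' */
        output;
    end;

    keep result var;
run;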

This gives you:

result  var
Better  6m
X < 10  7m
X < 10  8m
Better  9m
X < 30  10m
X < 30  11m
...     ...
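As an aside, the same long layout can probably be produced without a data step. The sketch below is an alternative rather than part of the approach above; it assumes have is already sorted by "Account Number"n (it is in the sample data) and uses a hypothetical output name, stack_alt.

```sas
/* Alternative sketch: let PROC TRANSPOSE do the stacking.
   For each account, every character column becomes one row,
   with the column name in _NAME_ and the cell value in COL1. */
proc transpose data=have
               out=stack_alt(drop="Account Number"n
                             rename=(_NAME_=var COL1=result));
    by "Account Number"n;
    var _CHARACTER_;
run;
```

The data step version is easier to extend if you later need to keep the account number or derive extra columns while stacking.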

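I'm sure you could do some really cool things with proc report on this data, but that's not an area I know particularly well. Instead, we'll create the dataset through a few other steps.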
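If you only want to eyeball the numbers before building a dataset, a cross-tabulation of the stacked data should be enough. This is a quick check I am adding for illustration, not one of the steps that follow.

```sas
/* Quick look: frequency and column percent for each result within each var.
   NOROW and NOPERCENT suppress the row and overall percentages. */
proc freq data=stack;
    tables result*var / norow nopercent;
run;
```

The columns come out in alphabetical order of var (10m, 11m, 6m, ...), which is the same ordering problem the cleanup steps deal with later.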

We can collapse it and count the number of rows in each result, var combination, then calculate each count as a percentage of the total for its var:

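proc sql;
    create table count as
        /* Outer query: divide each count by the total for its var.
           SAS remerges sum(total) back onto every row within var. */
        select result, var, total, total / sum(total) as pct format=percent8.1
            /* Inner query: count rows for each result, var combination */
            from (select result, var, count(*) as total
                  from stack
                  group by result, var
                 )
            group by var
            order by result, var
    ;
quit;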

This gives us:

result  var total pct
Better  10m 2     20.0%
Better  11m 4     40.0%
Better  6m  3     30.0%
Better  7m  4     40.0%
Better  8m  2     20.0%
Better  9m  4     40.0%
...     ... ... ...
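A small sanity check on count, added here as a sketch: with the ten sample accounts, every var should total 10 rows and its percentages should add up to 100%. The column aliases n_accounts and pct_sum are just illustrative names.

```sas
/* Each var should account for every row in have exactly once */
proc sql;
    select var,
           sum(total) as n_accounts,
           sum(pct)   as pct_sum format=percent8.1
        from count
        group by var;
quit;
```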

Now we have everything we need to get it into the format we want.

The id statement in proc transpose lets us use the values of var as the names of the transposed columns. We do this by result, so each result value becomes one row.

proc transpose data=count out=count_tpose(drop=_NAME_);
    by result;
    id var;
    var pct;
run;

This gets us almost what we want:

result  10m     11m     6m       7m     8m      9m
Better  20.0%   40.0%   30.0%   40.0%   20.0%   40.0%
Same    20.0%   20.0%   30.0%   10.0%   30.0%   20.0%
X < 10  20.0%   .       30.0%   30.0%   30.0%   10.0%
X < 20  10.0%   20.0%   10.0%   10.0%   .       20.0%
X < 30  30.0%   20.0%   .       10.0%   20.0%   10.0%
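The question asked for counts first and percentages as a bonus. A sketch of the count version: transpose total instead of pct. The output name count_tpose_n is hypothetical, and the column names only come out as 6m, 7m and so on if VALIDVARNAME=ANY is in effect.

```sas
/* Same transpose, but carrying the raw counts rather than the percentages */
proc transpose data=count out=count_tpose_n(drop=_NAME_);
    by result;
    id var;
    var total;
run;
```

The cleanup steps should apply to this table in the same way, since it has the same missing cells and the same column order.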

Now we just need to clean it up by:

  1. Filling missing values with 0
  2. Reordering the columns into the desired order
  3. Reordering result into the desired order

/* Replace missing with 0 */
proc stdize data=count_tpose 
            out=want 
            missing=0 
            reponly;
run;

/* Fix sort order */
data want_sorted;
    
    /* Set variable order: variables declared before SET come first */
    length Result $10
           "6m"n "7m"n "8m"n "9m"n "10m"n "11m"n 8
    ;

    set want;
    
    select(result);
        when('X < 10') order = 1;
        when('X < 20') order = 2;
        when('X < 30') order = 3;
        when('Same')   order = 4;
        otherwise      order = 5;
    end;
run;

proc sort data=want_sorted out=want_sorted_final(drop=order);
    by order;
run;

This gets us the final result we wanted:

Result  6m      7m      8m      9m      10m     11m
X < 10  30.0%   30.0%   30.0%   10.0%   20.0%   0.0%
X < 20  10.0%   10.0%   0.0%    20.0%   10.0%   20.0%
X < 30  0.0%    10.0%   20.0%   10.0%   30.0%   20.0%
Same    30.0%   10.0%   30.0%   20.0%   20.0%   20.0%
Better  30.0%   40.0%   20.0%   40.0%   20.0%   40.0%
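Since the goal was to create graphs, note that the long count table is usually the easier shape to plot from than the wide one. A minimal sketch of a stacked bar chart, assuming the count table from the proc sql step; count_plot and month are hypothetical names, and month is derived only so the bars sort as 6, 7, ..., 11 instead of alphabetically.

```sas
/* Derive a numeric month from var ('6m' -> 6) so the x axis sorts correctly */
proc sql;
    create table count_plot as
        select *, input(compress(var, 'm'), 8.) as month
            from count
            order by month, result;
quit;

/* One bar per month, stacked by result, with bar height taken from pct */
proc sgplot data=count_plot;
    vbar month / response=pct group=result groupdisplay=stack;
    xaxis label='Months';
    yaxis label='Share of accounts';
run;
```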