Frequency of values across an entire table in SAS

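Question description   Votes: 0   Answers: 1

Is it possible in SAS to get frequencies for a whole table? For example, I want to count how many Yes and No values there are in an entire table. Thanks.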

sas frequency
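1 answer
2
votes

A hash component object has keys and can track the number of .FIND references to each key in a key summary variable. The summary variable is named with the keysum: tag when the hash is instantiated. When the suminc: variable, also supplied at instantiation, holds the value 1, the keysum variable accumulates a frequency count.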

data have;
  * Words array from Abstract;
  * "How Do I Love Hash Tables? Let Me Count The Ways!";
  * by Judy Loren, Health Dialog Analytic Solutions;
  * SGF 2008 - Beyond the Basics;
  * https://support.sas.com/resources/papers/proceedings/pdfs/sgf2008/029-2008.pdf;

  array words(17) $10 _temporary_ (
    'I' 'love' 'hash' 'tables'
    'You' 'will' 'too' 'after' 'you' 'see'
    'what' 'they' 'can' 'do' '--' 'Judy' 'Loren'
  );


  call streaminit(123);
  do row = 1 to 127;
    attrib  RESPONSE1-RESPONSE20 length = $10;
    array RESPONSE RESPONSE1-RESPONSE20;
    do over RESPONSE;
      RESPONSE = words(rand('integer', 1, dim(words)));
    end;
    output;
  end;
run;

data _null_;
  * HAVE is read once, by the SET statement after the hash setup;

  if _n_ = 1 then do;
    length term $10;
    call missing (term);
    retain one 1;
    retain count 0;

    declare hash bins(suminc:'one', keysum:'count');
    bins.defineKey('term');
    bins.defineData('term');
    bins.defineDone();
  end;

  set have end=lastrow;
  array response response1-response20;

  do over response;
    if bins.find(key:response) ne 0 then do;
      bins.add(key:response, data:response, data:1);
    end;
  end;

  if lastrow;

  bins.output(dataset:'all_freq');
run;
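
The same tally can also be kept in an ordinary data item instead of relying on suminc: and keysum:, which may make the mechanism easier to follow. The following is only a sketch, not part of the original answer: it assumes HAVE is the RESPONSE1-RESPONSE20 dataset created above, and the output name all_freq_explicit is made up for illustration.

```sas
data _null_;
  length term $10 count 8;
  call missing(term, count);

  /* ordered:'a' writes the output dataset sorted by term */
  declare hash bins(ordered:'a');
  bins.defineKey('term');
  bins.defineData('term', 'count');
  bins.defineDone();

  /* read every row of HAVE within a single DATA step iteration */
  do until (lastrow);
    set have end=lastrow;
    array response response1-response20;

    do over response;
      term = response;
      /* a return code of 0 means the term is stored and COUNT was loaded */
      if bins.find() ne 0 then count = 0;
      count = count + 1;
      /* REPLACE adds the key when it is not in the table yet */
      bins.replace();
    end;
  end;

  /* all_freq_explicit is a hypothetical dataset name */
  bins.output(dataset:'all_freq_explicit');
  stop;
run;
```

Each distinct term ends up as one row with its count, at the cost of one FIND and one REPLACE per cell.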

(Image: frequency table output)


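Original answer, presuming only Yes and No values

Yes. You can put the values in an array, compute a 0/1 flag for each No/Yes value, and then use SUM to count the 0s and 1s. SUM yields a frequency only because the flags are restricted to 0 and 1: the sum of the flags is the number of 1s, and the number of flags minus that sum is the number of 0s.

For example, flag each answer in an array and accumulate the two counts with SUM: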

data have;
  call streaminit(123);
  do row = 1 to 100;
    attrib  ANSWER1-ANSWER20 length = $3;
    array ANSWER ANSWER1-ANSWER20;
    do over ANSWER; ANSWER = ifc(rand('uniform') > 0.15,'Yes','No'); end;
    output;
  end;
run;

data want(keep=freq_1 freq_0);
  set have end=lastrow;
  array ANSWER ANSWER1-ANSWER20;
  array X(20) _temporary_;

  do over ANSWER; x(_I_) = ANSWER = 'Yes'; end;

  freq_1          + sum (of X(*));
  freq_0 + dim(X) - sum (of X(*));

  if lastrow;
run;
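
A quick sanity check on the result, offered as a sketch that assumes WANT was produced by the step above: every cell is flagged either 1 or 0, so the two counts should add up to 100 rows times 20 answers. Note that any value other than 'Yes', including a blank, lands in freq_0.

```sas
data _null_;
  set want;
  /* TOTAL is a helper variable introduced only for this check */
  total = freq_1 + freq_0;
  /* HAVE has 100 rows of 20 answers, so TOTAL should be 2000 */
  put freq_1= freq_0= total=;
run;
```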



0
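votes

Transpose your main data, then run PROC FREQ. This is fully dynamic: it scales if the number of questions or the size of the response scale changes. All of the variables need to be the same type, either character or numeric.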

*generate fake data;
data have;
  call streaminit(99);
  array q(30) q1-q30;

  do i = 1 to 100;
    do j = 1 to dim(q);
      q(j) = rand('bernoulli', 0.8);
    end;
    output;
  end;
run;

*flip it to a long format;
proc transpose data=have out=long;
  by i;
  var q1-q30;
run;

*get the summaries needed;
proc freq data=long;
  table col1;
run;

You should get output like the following.

The FREQ Procedure

COL1  Frequency  Percent  Cumulative Frequency  Cumulative Percent
0           581    19.37                   581               19.37
1          2419    80.63                  3000              100.00
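
The same approach should carry over to character responses such as Yes/No, as long as every transposed variable is character. The sketch below is an illustration rather than part of the answer: the dataset names have_yn and long_yn are hypothetical, and the data step simply regenerates Yes/No test data. Character variables are only transposed when they are listed in the VAR statement.

```sas
*generate fake Yes/No data (have_yn is a hypothetical name);
data have_yn;
  call streaminit(123);
  length answer1-answer20 $3;
  array answer answer1-answer20;

  do row = 1 to 100;
    do over answer;
      answer = ifc(rand('uniform') > 0.15, 'Yes', 'No');
    end;
    output;
  end;
run;

*flip it to a long format, one row per answer;
proc transpose data=have_yn out=long_yn;
  by row;
  var answer1-answer20;
run;

*one frequency table covering every cell;
proc freq data=long_yn;
  table col1;
run;
```

PROC FREQ then reports one line per distinct response, so the same code keeps working if more response levels are added later.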