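I am trying to calculate the baseline mean separately for each visit, using only the subjects who attended that visit. For example, if a subject missed a given visit, the baseline mean for that visit should be recalculated without that subject.
Here is the data: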
Subject Visit Value
001 Baseline 10
001 Visit 2 11
001 Visit 3 12
001 Visit 4 13
002 Baseline 11
002 Visit 2 12
002 Visit 4 13
002 Visit 5 14
003 Baseline 12
003 Visit 3 13
003 Visit 4 14
003 Visit 5 15
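For anyone who wants to reproduce this, a DATA step along the following lines should load the sample into a table named have. Storing Subject as character (to keep the leading zeros) and the variable lengths are assumptions, not something stated in the post.

```sas
data have;
   /* Comma-delimited input so that "Visit 2" is read as a single value */
   infile datalines dlm=',';
   length Subject $3 Visit $8;
   input Subject $ Visit $ Value;
   datalines;
001,Baseline,10
001,Visit 2,11
001,Visit 3,12
001,Visit 4,13
002,Baseline,11
002,Visit 2,12
002,Visit 4,13
002,Visit 5,14
003,Baseline,12
003,Visit 3,13
003,Visit 4,14
003,Visit 5,15
;
run;
```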
I would like to get the following:
Visit BaselineMean VisitMean
Baseline 11 11
Visit 2 10.5 11.5
Visit 3 11 12.5
Visit 4 11 13.3
Visit 5 11.5 14.5
Here is my query for the mean value at each visit:
proc sql;
   create table want as
   select
      visit,
      mean(value) as meanValue
   from have
   group by visit;
quit;
Any insight would be appreciated.
Consider joining two aggregate queries, one of which is built on a self-join of the table:
proc sql;
   CREATE TABLE want AS
   SELECT bagg.Visit, bagg.BaselineMean, vagg.VisitMean
   FROM
      /* Baseline mean over the subjects who have a record at each visit */
      (SELECT t2.Visit, MEAN(t1.Value) AS BaselineMean
       FROM have t1
       INNER JOIN have t2
          ON t1.Subject = t2.Subject
         AND t1.Visit = 'Baseline'
       GROUP BY t2.Visit) bagg
   INNER JOIN
      /* Plain mean of the observed values at each visit */
      (SELECT Visit, MEAN(Value) AS VisitMean
       FROM have
       GROUP BY Visit) vagg
      ON bagg.Visit = vagg.Visit;
quit;
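To see why the self-join yields a per-visit baseline, it can help to list the rows it produces before aggregation. Each visit row (t2) is paired with the same subject's Baseline row (t1), so a subject contributes a baseline value to a visit only if they have a record for that visit. A small sketch, assuming the table is named have as in the question; the column name BaselineValue is only illustrative:

```sas
proc sql;
   /* One row per (visit, subject): the subject's baseline value,
      repeated for every visit that subject attended */
   SELECT t2.Visit, t1.Subject, t1.Value AS BaselineValue
   FROM have t1
   INNER JOIN have t2
      ON t1.Subject = t2.Subject
     AND t1.Visit = 'Baseline'
   ORDER BY t2.Visit, t1.Subject;
quit;
```

For Visit 2, this lists only subjects 001 and 002, with baseline values 10 and 11, whose mean is the 10.5 shown in the desired output.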
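First, create a table holding each subject's baseline value:
proc sql;
   create table baseline as
   select distinct subject, value
   from have
   where visit = 'Baseline'
   ;
quit;
Then add the baseline value to every row of the main table. Note that COALESCE substitutes 0 when a subject has no baseline record, and that 0 is then included in the baseline mean:
proc sql;
   create table inter as
   select t1.*, coalesce(b.value, 0) as b_val
   from have t1 left join baseline b
      on t1.subject = b.subject
   ;
quit;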
Finally, calculate the baseline mean and the visit mean for each visit:
proc sql;
   select visit, mean(b_val) as BaselineMean, mean(value) as visitMean
   from inter
   group by visit
   ;
quit;
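If subjects without a baseline record should be left out of the baseline mean rather than counted as 0, one possible variant is to drop the COALESCE. MEAN in PROC SQL skips missing values, so unmatched rows from the left join do not affect BaselineMean. This is only a sketch and not part of the answer above: it assumes the input table is named have, reuses the baseline table created in the first step, and writes to a hypothetical output table want_nomiss.

```sas
proc sql;
   create table want_nomiss as
   select h.visit,
          mean(b.value) as BaselineMean,  /* missing baselines are ignored */
          mean(h.value) as VisitMean
   from have h left join baseline b
      on h.subject = b.subject
   group by h.visit;
quit;
```

With the sample data every subject has a baseline record, so both versions should give the same result; the difference only shows up when a baseline is missing.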