Calculating the sample size for a multicenter RCT with a mixed-effects model using simulated data in R

Question

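I am currently setting up a multicenter (nine centers) randomized controlled trial comparing treatment with placebo (1:1). The primary outcome is prolongation of pregnancy in days. The minimal clinically relevant difference between the treatment and placebo groups is 5 days. The distribution of prolongation is skewed and, based on previous studies, is assumed to be log-normal. Each center's attitude toward prolongation affects the primary outcome, and this attitude may change over the course of the trial. The size of this effect is very hard for us to estimate.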

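Following feedback, I decided to use simulated data (created from the summary statistics of a comparable single-center trial; unfortunately we do not have much data from our own trial) to calculate the required sample size (power 80%, alpha 0.05) in a log-normal mixed-effects model with center as a random effect and treatment as a fixed effect.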

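This is the information I have from the previous trial:
Mean of the log-transformed prolongation data (placebo): 2.25
SD of the log-transformed prolongation data (placebo): 1.23
Mean of the untransformed prolongation data (placebo): 17.6 days
SD of the untransformed prolongation data (placebo): 18 days
Total number of patients in the trial: 180 (90 per arm)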

My code, based on a minimal clinically relevant difference (MCRD) of 5 days:


# Define parameters
## I used the same sample size as the other trial, 180 patients (90 per arm)
n_centers <- 9  # Number of centers
n_per_center_per_group <- 10  # Number of participants per treatment group per center
total_n <- n_centers * n_per_center_per_group * 2  # Total number of participants (with 2 treatment arms)
n <- 90  # Number of participants per arm (n_centers * n_per_center_per_group)

# Generate treatment variable for placebo
treatment_placebo <- rep("Placebo", n_per_center_per_group * n_centers)

# Generate treatment variable for medicine
treatment_medicine <- rep("Medicine", n_per_center_per_group * n_centers)

treatment <- factor(c(treatment_placebo, treatment_medicine))

# Generate center variable for each treatment group
center_placebo <- rep(1:n_centers, each = n_per_center_per_group)
center_medicine <- rep(1:n_centers, each = n_per_center_per_group)

center <- factor(c(center_placebo, center_medicine))

# Combine placebo and medicine data (design only; the outcome is added below)
simulated_data <- data.frame(treatment = factor(c(treatment_placebo, treatment_medicine)),
                             center = factor(c(center_placebo, center_medicine)))


# Generate log-normal prolongation of gestation with different means for treatment and placebo
mean_placebo <- 2.25 # mean of log normal transformed data in previous trial
MCRD <- log (16/11) #MCRD of 5 days on the untransformed scale: 11 days estimated as a mean prolongation in placebo and 16 days in treatment group, transformed to put it on log-scale. 
mean_treatment <- mean_placebo + MCRD 
sd_log <- 1.23  # Standard deviation on the log scale (edited, incorrectly defined as 0.7 in previous version)

log_prolongation_placebo <- rlnorm(n, meanlog = mean_placebo, sdlog = sd_log)
log_prolongation_treatment <- rlnorm(n, meanlog = mean_treatment, sdlog = sd_log)

log_prolongation_placebo <- log(log_prolongation_placebo)
log_prolongation_treatment <- log(log_prolongation_treatment)

log_prolongation <- c(log_prolongation_placebo, log_prolongation_treatment)

# Create data frame
simulated_data <- data.frame(treatment = factor(c(treatment_placebo, treatment_medicine)),
                             log_prolongation = log_prolongation,
                             center = factor(c(center_placebo, center_medicine)))

# Run the power simulation on the model fitted to the simulated data
library(lme4)
library(simr)  # provides powerSim()
lmer_model <- lmer(log_prolongation ~ treatment + (1 | center), data = simulated_data)
summary(lmer_model)

powerSim(lmer_model, nsim = 1000, alpha = 0.05)

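I am inexperienced with R and statistics, and I am worried that I may not be giving the model the right input (especially since I transform the data to and from the log scale).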

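Edit for clarification, my questions are:

Is the calculation of the MCRD in this code correct?
Is the scale coded correctly (log-normal)? Did I accidentally mix the log-normal and untransformed scales?
Do you have any feedback on this approach to calculating the sample size?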

Answer 1

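On the log-normal scale, differences are fold changes, so you need to relate the MCRD to the placebo mean you are using. If you back-transform your current values of `mean_placebo` and `mean_treatment`, you will find they are 4.3 days apart, not the 5 days you want. You could use: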
```
mean_placebo <- 2.25 # mean of log normal transformed data in previous trial
mean_treatment <- log(exp(mean_placebo) + 5) #MCRD of 5 days on the untransformed scale
MCRD <- mean_treatment - mean_placebo
```
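to make the MCRD 5 days when you are back on the "days" scale. As a side note, you said the SD of the log-transformed data is 1.23, but you used 0.7 in your code, which is worth checking!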
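The gap between the two ways of defining the MCRD can be checked with a few lines of arithmetic. This is only a sketch using the values quoted in the question; the names `diff_ratio` and `diff_shift` are introduced here for illustration.

```r
mean_placebo <- 2.25  # mean of the log-transformed placebo data (from the question)

# Question's version: a fixed ratio of 16/11 added on the log scale
diff_ratio <- exp(mean_placebo + log(16 / 11)) - exp(mean_placebo)

# Answer's version: add 5 days on the "days" scale, then go back to the log scale
mean_treatment <- log(exp(mean_placebo) + 5)
diff_shift <- exp(mean_treatment) - exp(mean_placebo)

# diff_ratio is about 4.31 days, diff_shift is 5 days
round(c(ratio_version = diff_ratio, shift_version = diff_shift), 2)
```

Note that `exp(mean_placebo)` is about 9.5 days. For log-normal data this back-transformed value corresponds to the median (geometric mean), not to the arithmetic mean of 17.6 days quoted in the question, so it is worth deciding on which of the two the 5-day difference should apply.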
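Since your mean and SD are both on the log scale, you can simulate with `rnorm` directly, without transforming back and forth. If you run the code block below, you will see that taking the log of data simulated with `rlnorm` gives the same result as simulating with `rnorm`.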
```
set.seed(42)
hist(log(rlnorm(n, meanlog = mean_placebo, sdlog = sd_log)))
set.seed(42)
hist(rnorm(n, mean = mean_placebo, sd = sd_log))
```
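The same comparison can be made numerically rather than by eye. A minimal sketch, assuming the `n`, `mean_placebo` and `sd_log` values from the question; with the same seed the two vectors should agree up to floating-point tolerance, so `all.equal` should return `TRUE`.

```r
n <- 90
mean_placebo <- 2.25
sd_log <- 1.23

set.seed(42)
via_rlnorm <- log(rlnorm(n, meanlog = mean_placebo, sdlog = sd_log))

set.seed(42)
via_rnorm <- rnorm(n, mean = mean_placebo, sd = sd_log)

# Compare the two simulated vectors element by element
all.equal(via_rlnorm, via_rnorm)
```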
So you can use
```
log_prolongation_placebo <- rnorm(n, mean = mean_placebo, sd = sd_log)
log_prolongation_treatment <- rnorm(n, mean = mean_treatment, sd = sd_log)
```
with no further transformation.
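This is a multicenter trial, so you need to include the variability between centers. I believe your current SD comes from a single-center trial, so it makes sense to use it as the within-center variability, but you will need to search the literature for a plausible value for the between-center variability. Once you have one, use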
```
log_prolongation <- log_prolongation + rnorm(n_centers, 0, betweenCentreSD_onLogScale)[factor(c(center_placebo, center_medicine))]
```
before creating the final simulated dataset, so that it includes this extra variability.
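Putting the pieces together, one way to estimate power is to simulate many trials, fit the mixed model to each, and count how often the treatment effect is detected. The following is a rough sketch, not a definitive implementation. The value of `between_center_sd` is a placeholder assumption and does not come from any data; the helper names `simulate_once` and `estimate_power` are hypothetical; and significance is judged with a normal approximation to the Wald t value, because `lmer` itself does not report p-values.

```r
library(lme4)

# Values taken from the question and the answer above
n_centers <- 9
mean_placebo <- 2.25
mean_treatment <- log(exp(mean_placebo) + 5)
sd_log <- 1.23

# ASSUMPTION: placeholder between-center SD on the log scale, not from any data
between_center_sd <- 0.3

# Simulate one trial and report whether the treatment effect is "significant"
simulate_once <- function(n_per_center_per_group) {
  n_arm <- n_centers * n_per_center_per_group
  dat <- data.frame(
    treatment = factor(rep(c("Placebo", "Medicine"), each = n_arm),
                       levels = c("Placebo", "Medicine")),
    center = factor(rep(rep(seq_len(n_centers), each = n_per_center_per_group),
                        times = 2))
  )
  # One random intercept per center, shared by both arms in that center
  center_effect <- rnorm(n_centers, 0, between_center_sd)
  arm_mean <- ifelse(dat$treatment == "Medicine", mean_treatment, mean_placebo)
  dat$log_prolongation <- rnorm(nrow(dat), mean = arm_mean, sd = sd_log) +
    center_effect[as.integer(dat$center)]

  fit <- suppressMessages(
    lmer(log_prolongation ~ treatment + (1 | center), data = dat)
  )
  t_value <- coef(summary(fit))["treatmentMedicine", "t value"]
  abs(t_value) > qnorm(0.975)  # two-sided test at alpha = 0.05 (normal approximation)
}

# Proportion of simulated trials in which the effect is detected
estimate_power <- function(n_per_center_per_group, n_sim = 200) {
  mean(replicate(n_sim, simulate_once(n_per_center_per_group)))
}

set.seed(42)
power_10 <- estimate_power(10)
power_10
```

To find a sample size, `estimate_power` can be called for increasing values of `n_per_center_per_group` until the estimate reaches 0.8. A larger `n_sim` gives a more stable estimate, and repeating the search for several plausible values of `between_center_sd` shows how sensitive the result is to that assumption.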

I hope this all makes sense, and good luck with the trial!
