I have data like this:
data.sample <- read_table2('score_label treatment score data1 data2 data3
A treatment 1 1 t yt
A treatment 2 1 t yt
A treatment 3 5 f yt
B treatment 1 5 f yt
B treatment 2 5 f yt
B treatment 3 5.5 g yt
B treatment 4 6.8 t yt
C treatment 1 9.4 t yt
C treatment 2 10.7 f yt
C treatment 3 12 j yt
C treatment 4 13.3 t yt
C control 1 14.6 t yt
C control 3 18.5 k yt
C control 4 19.8 t yt')
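I want to create a data frame like the one below. Within each score_label/treatment group, score should run from 1 to 4, and the other cells of any row whose score was previously missing should be filled with 0.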
output<- read_table2('score_label treatment score data1 data2 data3
A treatment 1 1 t yt
A treatment 2 1 t yt
A treatment 3 5 f yt
A treatment 4 0 0 0
B treatment 1 5 f yt
B treatment 2 5 f yt
B treatment 3 5.5 g yt
B treatment 4 6.8 t yt
C treatment 1 9.4 t yt
C treatment 2 10.7 f yt
C treatment 3 12 j yt
C treatment 4 13.3 t yt
C control 1 14.6 t yt
C control 2 0 0 0
C control 3 18.5 k yt
C control 4 19.8 t yt')
I thought about creating a new score column like this, but it does not work the way I hoped. Any suggestions are appreciated!
data.sample %>%
  group_by(score_label, treatment) %>%
mutate(new_score=seq(4))
We can use complete together with its fill argument:
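library(dplyr)
library(tidyr)
# For each score_label/treatment group, add a row for every score value
# seen anywhere in the data; each fill value must match its column's type.
data.sample %>%
  group_by(score_label, treatment) %>%
  complete(score = unique(data.sample$score),
           fill = list(data1 = 0, data2 = '0', data3 = '0'))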
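One caveat: unique(data.sample$score) only yields the full 1 to 4 range because every score value happens to appear somewhere in the sample. A minimal sketch, assuming the target range is always 1 to 4, passes that range explicitly instead. Here toy is a hypothetical cut-down stand-in for data.sample in which score 4 never occurs:

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for data.sample: same columns, fewer rows,
# and no row with score 4 in any group.
toy <- tibble::tribble(
  ~score_label, ~treatment,  ~score, ~data1, ~data2, ~data3,
  "A",          "treatment", 1,      1,      "t",    "yt",
  "A",          "treatment", 2,      1,      "t",    "yt",
  "A",          "treatment", 3,      5,      "f",    "yt",
  "C",          "control",   1,      14.6,   "t",    "yt",
  "C",          "control",   3,      18.5,   "k",    "yt"
)

filled <- toy %>%
  group_by(score_label, treatment) %>%
  # Spell out the wanted scores rather than relying on the observed ones.
  complete(score = as.numeric(1:4),
           fill = list(data1 = 0, data2 = "0", data3 = "0")) %>%
  ungroup()

# Every group should now have four rows, one per score.
count(filled, score_label, treatment)
```

With unique(toy$score) the groups would stop at score 3, since 4 is never observed in toy.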
If there are many columns to fill, the fill list can be built programmatically:
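# Columns whose names start with 'data'
nm1 <- names(data.sample)[startsWith(names(data.sample), 'data')]
# One fill value per column, matching its type: 0 for numeric, '0' otherwise
fillcols <- lapply(data.sample[nm1], function(x) if (is.numeric(x)) 0 else '0')
data.sample %>%
  group_by(score_label, treatment) %>%
  complete(score = unique(data.sample$score), fill = fillcols)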