I have a data frame:
Var_1 = c("A","B","C","D","E","F","G","H")
Var_2 = c(0,1,0,2,1,0,0,1)
DF = data.frame(Var_1,Var_2)
print(DF)
Var_1 Var_2
1 A 0
2 B 1
3 C 0
4 D 2
5 E 1
6 F 0
7 G 0
8 H 1
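I need to insert N blank rows filled with NA into the data frame, where N is given by the value in Var_2. The new rows should be inserted immediately after each row where Var_2 >= 1. So I would like my data frame to look like this: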
print(DF)
Var_1 Var_2
1 A 0
2 B 1
3 <NA> NA
4 C 0
5 D 2
6 <NA> NA
7 <NA> NA
8 E 1
9 <NA> NA
10 F 0
11 G 0
12 H 1
13 <NA> NA
I'm really stuck on this, so any help would be greatly appreciated. Thanks.
ind <- which(DF$Var_2 > 0)
ind
# [1] 2 4 5 8
stops <- unique(c(ind, nrow(DF))) # in case the last nonzero value is not on the bottom row
starts <- 1L + c(0L, head(stops, n = -1)) # derived from stops so both vectors always have the same length
starts
# [1] 1 3 5 6
stops
# [1] 2 4 5 8
# for each block of rows a:b, append Var_2[b] all-NA rows after its last row
DFaug_list <- Map(
  function(a, b) rbind(DF[a:b, ], DF[b, ][rep(NA, DF$Var_2[b]), ]),
  starts, stops)
We now have a list of data frames:
str(DFaug_list)
# List of 4
#  $ :'data.frame': 3 obs. of 2 variables:
#   ..$ Var_1: Factor w/ 8 levels "A","B","C","D",..: 1 2 NA
#   ..$ Var_2: int [1:3] 0 1 NA
#  $ :'data.frame': 4 obs. of 2 variables:
#   ..$ Var_1: Factor w/ 8 levels "A","B","C","D",..: 3 4 NA NA
#   ..$ Var_2: int [1:4] 0 2 NA NA
#  $ :'data.frame': 2 obs. of 2 variables:
#   ..$ Var_1: Factor w/ 8 levels "A","B","C","D",..: 5 NA
#   ..$ Var_2: int [1:2] 1 NA
#  $ :'data.frame': 4 obs. of 2 variables:
#   ..$ Var_1: Factor w/ 8 levels "A","B","C","D",..: 6 7 8 NA
#   ..$ Var_2: int [1:4] 0 0 1 NA
All that remains is to bind them together, either with do.call or with the corresponding functions from the data.table or dplyr packages:
DFaug <- do.call(rbind.data.frame, DFaug_list)
DFaug
# Var_1 Var_2
# 1 A 0
# 2 B 1
# NA <NA> NA
# 3 C 0
# 4 D 2
# NA1 <NA> NA
# NA.1 <NA> NA
# 5 E 1
# NA2 <NA> NA
# 6 F 0
# 7 G 0
# 8 H 1
# NA3 <NA> NA
DFaug <- data.table::rbindlist(DFaug_list)
DFaug <- dplyr::bind_rows(DFaug_list)
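Note that the do.call result keeps row names such as NA and NA1. If you would rather have the plain 1, 2, 3, ... numbering from the question, one option is to reset the row names afterwards. This is only a minimal sketch: it rebuilds the example so that it runs on its own, and it computes starts from stops, which gives the same values for this data.

```r
# Rebuild the example data so the snippet is self-contained.
DF <- data.frame(Var_1 = c("A", "B", "C", "D", "E", "F", "G", "H"),
                 Var_2 = c(0, 1, 0, 2, 1, 0, 0, 1))

ind    <- which(DF$Var_2 > 0)
stops  <- unique(c(ind, nrow(DF)))
starts <- 1L + c(0L, head(stops, n = -1))

DFaug_list <- Map(
  function(a, b) rbind(DF[a:b, ], DF[b, ][rep(NA, DF$Var_2[b]), ]),
  starts, stops)

DFaug <- do.call(rbind.data.frame, DFaug_list)

# Setting row names to NULL replaces "NA", "NA1", ... with 1..nrow(DFaug).
rownames(DFaug) <- NULL
DFaug
```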
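# repeat each row index Var_2 + 1 times: 1 2 2 3 4 4 4 5 5 6 7 8 8
s <- rep(seq_len(nrow(DF)), DF$Var_2 + 1)
DFnew <- DF[s, ]
# every repeated copy becomes a blank row; the first copy is the original row
DFnew[duplicated(s), ] <- NA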
DFnew
# Var_1 Var_2
#1 A 0
#2 B 1
#2.1 <NA> NA
#3 C 0
#4 D 2
#4.1 <NA> NA
#4.2 <NA> NA
#5 E 1
#5.1 <NA> NA
#6 F 0
#7 G 0
#8 H 1
#8.1 <NA> NA
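If you need this in more than one place, the same idea can be wrapped in a small helper. The following is just a sketch under my own assumptions: the name insert_na_rows and its arguments are made up for illustration, and it also resets the row names so the result is numbered 1 to 13 as in the question.

```r
# Hypothetical helper (not from any package): after row i of df,
# insert n[i] rows filled with NA.
insert_na_rows <- function(df, n) {
  stopifnot(length(n) == nrow(df), all(n >= 0))
  s   <- rep(seq_len(nrow(df)), n + 1)   # each index repeated n[i] + 1 times
  out <- df[s, , drop = FALSE]
  dup <- duplicated(s)                   # TRUE for every repeated copy
  if (any(dup)) out[dup, ] <- NA         # blank out the copies
  rownames(out) <- NULL                  # renumber rows 1..nrow(out)
  out
}

DF <- data.frame(Var_1 = c("A", "B", "C", "D", "E", "F", "G", "H"),
                 Var_2 = c(0, 1, 0, 2, 1, 0, 0, 1))
DFnew <- insert_na_rows(DF, DF$Var_2)
DFnew
```

Passing the counts as a separate argument keeps the helper independent of the column name, so it should work for any non-negative count vector with one entry per row.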