Fixing a loop that updates data and checks a condition on the updated data

Question

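I have been trying to write code that checks a condition on a[i-1] and a[i] and, when it holds, updates b[i] using the values of b[i-1] and c[i]. If the condition fails, b[i] must be set to 0.

My current code is: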

#R
library(dplyr)
update_b <- function(data) {
  for (i in 2:nrow(data)) {
    if (!is.na(data$a[i]) & !is.na(data$a[i-1]) & data$a[i] < 60 & data$a[i-1] < 60) {
      data$b[i] <- data$c[i] + data$b[i-1]
    } else {
      data$b[i] <- 0
    }
  }
  return(data)
}

result <- data_frame %>%
  group_by(number) %>%
  arrange(date) %>%
  do(update_b(.))

It keeps failing with:

|=============                                                             | 13% ~40 s remaining
Error in `$<-`:
! Assigned data `*vtmp*` must be compatible with existing data.
x Existing data has 1 row.
x Assigned data has 2 rows.
i Row updates require a list value. Do you need `list()` or `as.list()`?
Caused by error in `vectbl_recycle_rhs_rows()`:
! Can't recycle input of size 2 to size 1.

Before that, I tried to solve this with data.table:

#R
library(data.table)
calculate_b <- function(x) {
  for (i in 2:nrow(x)) {
    if (x[i, a] < 60 & x[i - 1, a] < 60) {
      x[i, b:= x[i, c] + x[i - 1, b]]
    } else {
      x[i, b:= 0]
    }
  }
  return(x)
}

a[, b:= 0]
a <- a[, calculate_b(.SD), by = number]

which gave me this error:

.SD is locked. Using := in .SD's j is reserved for possible future use; a tortuously flexible way to modify by group. Use := in j directly to modify by group by reference.

How can I resolve this error?

Edit: here is a sample of the data

ID (number)	a	c	b (set to 0 at the start)
123	30	0	0
123	25	45	45
123	18	8	53
123	80	15	0
123	45	63	0
123	15	75	75
123	70	12	0
456	65	0	0
456	45	75	0
456	30	26	26
456	58	95	121
456	53	41	162
456	50	32	194
789	45	0	0
789	90	14	0
789	89	65	0
789	75	78	0
789	80	59	0
789	50	32	0

Answer 1

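You need a rolling calculation. Since most rolling functions (and all of those in data.table) work on a single vector, it is simpler to roll over the row indices (per group) rather than the actual values, so that we can access the values of several variables without difficulty.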

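DT[, bnew := 0]
# Roll a window of 2 row indices within each ID; i holds c(previous, current).
DT[, bnew := frollapply(
  seq(.N), 2L, align="right", fill=0L,
  # Both a values must be non-NA and below 60; otherwise the result is 0.
  FUN = function(i) fifelse(!anyNA(a[i]) & all(a[i] < 60), c[i][2] + b[i][1], 0L)
  ), by = .(ID) ]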
#        ID     a     c     b  bnew
#     <int> <int> <int> <int> <int>
#  1:   123    30     0     0     0
#  2:   123    25    45    45    45
#  3:   123    18     8    53    53
#  4:   123    80    15     0     0
#  5:   123    45    63     0     0
#  6:   123    15    75    75    75
#  7:   123    70    12     0     0
#  8:   456    65     0     0     0
#  9:   456    45    75     0     0
# 10:   456    30    26    26    26
# 11:   456    58    95   121   121
# 12:   456    53    41   162   162
# 13:   456    50    32   194   194
# 14:   789    45     0     0     0
# 15:   789    90    14     0     0
# 16:   789    89    65     0     0
# 17:   789    75    78     0     0
# 18:   789    80    59     0     0
# 19:   789    50    32     0     0
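
A caveat on the result above: the rolling call reads b[i][1] from the b column as it already exists in DT, and in the sample data that column already holds the expected result. If b really starts out as all zeros, as the question describes, each value depends on the previous value that was just computed, which a window over the stored column cannot see. The following is a minimal sketch of one way to handle that case, assuming b is 0 whenever the condition fails and on the first row of each group. The helper running_b and the table demo are hypothetical names used only for illustration. Because b accumulates c while the condition holds and resets to 0 when it fails, a grouped cumsum may be enough:

```r
library(data.table)

# Hypothetical helper: returns b for one group, given the a and c vectors.
running_b <- function(a_vals, c_vals) {
  below <- !is.na(a_vals) & a_vals < 60
  # ok[i] is TRUE when both a[i] and a[i-1] are below 60;
  # the first row has no previous row, so it is FALSE.
  ok <- below & shift(below, fill = FALSE)
  # Every failed row starts a new run; within a run, b is the running sum of c.
  run <- cumsum(!ok)
  as.numeric(ave(c_vals * ok, run, FUN = cumsum))
}

# First two groups of the sample data from the question.
demo <- data.table(
  ID = rep(c(123L, 456L), times = c(7L, 6L)),
  a  = c(30, 25, 18, 80, 45, 15, 70, 65, 45, 30, 58, 53, 50),
  c  = c(0, 45, 8, 15, 63, 75, 12, 0, 75, 26, 95, 41, 32)
)

# Assign with := directly in j, grouped by ID (no := inside .SD).
demo[, b := running_b(a, c), by = ID]
print(demo)
```

Since the helper only returns a vector and the assignment happens with := in j, this form also avoids the ".SD is locked" error from the question.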

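This looks at two values at a time. The first n - 1 values (for each ID group) are set to the static constant defined by fill=, which defaults to NA. (A "partial" rolling window is possible by using adaptive=TRUE and passing the window sizes as a vector, but that is not needed here, and it would require a small adjustment to our function to better handle windows of length 1.)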
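
To see the effect of fill= in isolation, here is a small sketch on a toy vector (the vector x is made up for illustration). With a window of 2, only the first position has no complete window, so it receives the fill value:

```r
library(data.table)

x <- c(1, 2, 3, 4, 5)

# Default fill: the first n - 1 = 1 result is NA.
with_na <- frollapply(x, 2L, sum)

# fill = 0 replaces that leading NA with a constant.
with_zero <- frollapply(x, 2L, sum, fill = 0)

print(with_na)
print(with_zero)
```

In the answer's call, fill=0L plays the same role: the first row of each ID group has no previous row, so it gets 0.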

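After that, the first iteration of our FUN gives i <- 1:2, so a[i] is c(30, 25). Note that all of a is technically visible inside the function, so we index it as a[i] to get just the two values we want to look at in that step.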
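
The index windows can be reproduced by hand to make this concrete. The sketch below does not call data.table at all; it builds the pairs of positions manually (an assumption about how best to illustrate the idea, not a view into the library's internals) and shows what a[i] returns for each pair:

```r
a <- c(30L, 25L, 18L, 80L)

# Each window is a pair of row positions: c(previous, current).
windows <- lapply(2:length(a), function(k) c(k - 1L, k))

for (i in windows) {
  cat("i =", i, " a[i] =", a[i], "\n")
}
```

The same index vector i can be used on any other column of the group, which is why rolling over positions gives access to a, b and c at once.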

Next, since we only look at two values of a at a time, I simplified the code slightly to use anyNA(.) and all(.), which is both more efficient and a bit of code golf.
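
As a quick sanity check on that simplification, the sketch below compares the element-by-element condition from the question with the anyNA/all form on a few hand-picked pairs. The helpers check_long and check_short and the test pairs are hypothetical, written only for this comparison:

```r
# Condition written out element by element, as in the question.
check_long <- function(x) !is.na(x[1]) & !is.na(x[2]) & x[1] < 60 & x[2] < 60

# Shorter form used in the answer.
check_short <- function(x) !anyNA(x) & all(x < 60)

pairs <- list(c(30, 25), c(18, 80), c(NA, 25), c(80, NA))
long  <- vapply(pairs, check_long,  logical(1))
short <- vapply(pairs, check_short, logical(1))

# TRUE when both forms agree on every pair.
print(identical(long, short))
```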

Data from your question, renamed for simplicity:

DT <- data.table::as.data.table(structure(list(ID = c(123L, 123L, 123L, 123L, 123L, 123L, 123L, 456L, 456L, 456L, 456L, 456L, 456L, 789L, 789L, 789L, 789L, 789L, 789L), a = c(30L, 25L, 18L, 80L, 45L, 15L, 70L, 65L, 45L, 30L, 58L, 53L, 50L, 45L, 90L, 89L, 75L, 80L, 50L), c = c(0L, 45L, 8L, 15L, 63L, 75L, 12L, 0L, 75L, 26L, 95L, 41L, 32L, 0L, 14L, 65L, 78L, 59L, 32L), b = c(0L, 45L, 53L, 0L, 0L, 75L, 0L, 0L, 0L, 26L, 121L, 162L, 194L, 0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, -19L), class = c("data.table", "data.frame")))
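
Regarding the dplyr error in the question: a likely cause is a group with a single row. In that case 2:nrow(data) counts down as c(2, 1) instead of being empty, so the loop tries to write to row 2 of a one-row table, which matches the "Existing data has 1 row. Assigned data has 2 rows." message. A small base R sketch of the pitfall and one common guard, seq_len(n)[-1], follows; the variable n stands in for nrow(data):

```r
n <- 1L  # stands in for nrow(data) of a one-row group

# The colon operator counts down when the end is smaller than the start.
risky <- 2:n

# seq_len(n)[-1] is empty for n = 1, so a for loop over it does nothing.
safe <- seq_len(n)[-1]

print(risky)
print(safe)

# For larger groups both forms give the same indices.
print(identical(2:5, seq_len(5L)[-1]))
```

Replacing for (i in 2:nrow(data)) with for (i in seq_len(nrow(data))[-1]) in update_b should therefore let one-row groups pass through unchanged.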
