How do I solve a vectorisation problem in R?


R newbie (ish). I've written some code that uses a for() loop in R. I'd like to rewrite it in vectorised form, but it isn't working.

A simplified example to illustrate:

library(dplyr)

x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))

## if year is blank and name is same as name from previous row
##    take year from previous row
## else
##    stick with the year you already have

# 1. Run as a loop

x$year_2 <- NA
x$year_2[1] <- x$year[1]                

for(row_idx in 2:10)
{
  if(is.na(x$year[row_idx]) && (x$name[row_idx] == x$name[row_idx - 1]))
  {
    x$year_2[row_idx] = x$year_2[row_idx - 1]
  }
  else
  {
    x$year_2[row_idx] = x$year[row_idx]
  }
}  

# 2. Attempt to vectorise

x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))

x$year_2 <- ifelse(is.na(x$year) & x$name == lead(x$name),
                   lead(x$year_2),
                   x$year)

I think the vectorised version is getting messed up because it is circular (i.e. x$year_2 appears on both sides of the <-). Is there a way around this?

Thanks.

r vectorization
2 Answers
1 vote

I suggest using functions that already exist. R feels hard at first because we are trained to reinvent the wheel. Please don't.
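One possible reading of this advice, offered as a sketch rather than the answerer's own code: the zoo package (assumed to be installed) has na.locf(), which carries the last non-missing observation forward. Wrapping it in base R's ave() applies it separately within each name, so a value from John never leaks into Fred.

```r
# Sketch only: assumes the zoo package is installed.
library(zoo)

x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))

# ave() splits year by name, applies the function to each piece,
# and puts the results back in the original row order.
# na.rm = FALSE keeps a leading NA instead of dropping it,
# so every piece keeps its original length.
x$year_2 <- ave(x$year, x$name,
                FUN = function(v) na.locf(v, na.rm = FALSE))

print(x)
```

Note that this fills within each name across the whole data frame, whereas the loop in the question only looks at the immediately preceding row. The two agree when all rows for a name are contiguous, as they are in the example.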


0 votes

If you are using dplyr / tidyverse
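A minimal tidyverse sketch, assuming the dplyr and tidyr packages are installed; this illustrates the general approach and is not necessarily what the answerer wrote. tidyr's fill() replaces NA with the previous non-missing value, and group_by(name) restricts the fill to rows that share a name.

```r
# Sketch only: assumes dplyr and tidyr are installed.
library(dplyr)
library(tidyr)

x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))

x <- x %>%
  mutate(year_2 = year) %>%              # copy, so the original year column is kept
  group_by(name) %>%                     # fill within each name only
  fill(year_2, .direction = "down") %>%  # carry the last non-NA value downward
  ungroup()

print(x)
```

This also avoids the circularity from the question: fill() handles the "look at the row I just filled" dependency internally, so year_2 never has to appear on both sides of an assignment.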


0 votes

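If you know the data frame will always be sorted this way, the following will work for you by filling each NA with the most recent non-missing value.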
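As a hedged base R sketch of this idea (not necessarily the answerer's code): build, for every row, the index of the most recent non-missing row, then use that index to look up the value. It ignores name entirely, so it relies on the stated assumption that the data is sorted with each name's first row non-missing.

```r
x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))

# Row positions where year is present; 0 where it is missing.
pos <- seq_along(x$year) * !is.na(x$year)

# Running maximum gives the position of the latest non-missing row so far.
idx <- cummax(pos)

# A 0 means no non-missing value has been seen yet; indexing by NA yields NA.
idx[idx == 0] <- NA

x$year_2 <- x$year[idx]

print(x)
```

Because the whole index vector is computed before anything is assigned, there is no dependency of year_2 on itself, which is what made the ifelse() attempt in the question circular.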
