R: Error when looping across multiple columns in a data frame

Question (votes: 0, answers: 1)

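I am trying to write a simple loop that adds new columns to a data frame, each computed from existing columns. Consider the following dataset of the elderly population from 1986 to 1989: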

col.names <- c("Age","Pop86","Pop87","Pop88","Pop89")
col1 <- c(76,77,78,79,80,81,82,83,84,85)
col2 <- c(102,106,91,85,67,77,54,47,36,39)
col3 <- c(111,96,96,76,76,62,60,49,42,35)
col4 <- c(108,101,88,88,69,67,56,49,42,32)
col5 <- c(141,100,96,80,78,60,49,43,38,25)
df1 <- data.frame(col1,col2,col3,col4,col5)
colnames(df1) <- col.names

   Age Pop86 Pop87 Pop88 Pop89
1   76   102   111   108   141
2   77   106    96   101   100
3   78    91    96    88    96
4   79    85    76    88    80
5   80    67    76    69    78
6   81    77    62    67    60
7   82    54    60    56    49
8   83    47    49    49    43
9   84    36    42    42    38
10  85    39    35    32    25

I wrote a simple loop that shows roughly what I want:

library(dplyr)  # lag(x, default = 0) here is dplyr::lag(), not stats::lag()

num_iterations <- 4
for (i in 1:num_iterations) {
  df1[[paste0("Col", i)]] <- lag(df1$Pop86, default = 0) - df1$Pop87
}
   Age Pop86 Pop87 Pop88 Pop89 Col1 Col2 Col3 Col4
1   76   102   111   108   141 -111 -111 -111 -111
2   77   106    96   101   100    6    6    6    6
3   78    91    96    88    96   10   10   10   10
4   79    85    76    88    80   15   15   15   15
5   80    67    76    69    78    9    9    9    9
6   81    77    62    67    60    5    5    5    5
7   82    54    60    56    49   17   17   17   17
8   83    47    49    49    43    5    5    5    5
9   84    36    42    42    38    5    5    5    5
10  85    39    35    32    25    1    1    1    1

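As it stands, every added column is identical, because each one is computed from Pop86 and Pop87. What I actually want is for the loop to move across the columns, using the pair (Pop86, Pop87), then (Pop87, Pop88), and so on, but my attempt failed: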

num_iterations <- 4  # Adjust as needed
for (i in 1:num_iterations) {
  col1 <- paste0("Pop", 87 + i - 1)
  col2 <- paste0("Pop", 87 + i)
  df1[[paste0("Col", i)]] <- lag(df1[[col1]],default = 0) - df1[[col2]]
}

Error in `[[<-.data.frame`(`*tmp*`, paste0("Col", i), value = numeric(0)) : 
  replacement has 0 rows, data has 10

I am looking for the correct way to loop over these columns. Since they are part of a larger dataset, I would like the loop to cover all PopXX columns.

r dataframe loops iteration demographics
1 Answer
0 votes

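Without a loop: reshape to long format, take the difference between consecutive years within each Age, then reshape back to wide. Note that this computes value - lag(value) within the same Age row (e.g. Pop87 - Pop86), so Col_1 is always NA: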

library(tidyverse)

df1 %>% 
  pivot_longer(!Age) %>%  
  mutate(diff = value - lag(value),
         col = str_c("Col_", row_number()),
         .by = Age) %>% 
  select(Age, diff, col) %>% 
  pivot_wider(names_from = col, 
              values_from = diff) %>% 
  left_join(df1, .)

# A tibble: 10 × 9
     Age Pop86 Pop87 Pop88 Pop89 Col_1 Col_2 Col_3 Col_4
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1    76   102   111   108   141    NA     9    -3    33
 2    77   106    96   101   100    NA   -10     5    -1
 3    78    91    96    88    96    NA     5    -8     8
 4    79    85    76    88    80    NA    -9    12    -8
 5    80    67    76    69    78    NA     9    -7     9
 6    81    77    62    67    60    NA   -15     5    -7
 7    82    54    60    56    49    NA     6    -4    -7
 8    83    47    49    49    43    NA     2     0    -6
 9    84    36    42    42    38    NA     6     0    -4
10    85    39    35    32    25    NA    -4    -3    -7
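
If you would rather keep the loop from the question, here is a minimal base R sketch. The error in the question appears to come from the indexing: `87 + i - 1` starts at Pop87 rather than Pop86, and by i = 3 it asks for Pop90, which does not exist. `df1[["Pop90"]]` returns NULL, a vector minus NULL is `numeric(0)`, and assigning that to a column gives "replacement has 0 rows". Four Pop columns also yield only three consecutive pairs, not four. The sketch below assumes the columns are named Pop followed by digits and appear in chronological order; `shift_down` is a hypothetical helper standing in for `dplyr::lag(x, default = 0)`.

```r
# Hypothetical helper: base R stand-in for dplyr::lag(x, default = 0).
shift_down <- function(x, fill = 0) c(fill, head(x, -1))

# Same data as in the question.
df1 <- data.frame(
  Age   = 76:85,
  Pop86 = c(102, 106, 91, 85, 67, 77, 54, 47, 36, 39),
  Pop87 = c(111, 96, 96, 76, 76, 62, 60, 49, 42, 35),
  Pop88 = c(108, 101, 88, 88, 69, 67, 56, 49, 42, 32),
  Pop89 = c(141, 100, 96, 80, 78, 60, 49, 43, 38, 25)
)

# Pick up every PopXX column by name, in the order it appears.
# Assumption: that order is chronological.
pop_cols <- grep("^Pop[0-9]+$", names(df1), value = TRUE)

# n columns give n - 1 consecutive pairs, so stop one short of the end.
n_pairs <- max(length(pop_cols) - 1, 0)
for (i in seq_len(n_pairs)) {
  earlier <- pop_cols[i]      # e.g. "Pop86"
  later   <- pop_cols[i + 1]  # e.g. "Pop87"
  df1[[paste0("Col", i)]] <- shift_down(df1[[earlier]]) - df1[[later]]
}

df1
```

Here Col1 reproduces the column from the first loop in the question (lagged Pop86 minus Pop87), while Col2 and Col3 use (Pop87, Pop88) and (Pop88, Pop89). Looking the columns up by name instead of building "Pop" plus a number means the loop cannot run past the last column that actually exists.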