Incremental counter based on several previously met conditions

Question

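I am trying to create an incremental counter in an ID column based on several conditions being met in two other columns, and then "reset" those conditions to determine the next increment of the ID. This is time-series data, so row order matters (I have not included the timestamp column).

I'll provide a toy dataset. I have three columns: Location, Activity, and ID. Currently my ID column is empty, but I have filled in values here to illustrate my conditions. I want to initialize ID at 1 and then check whether a D occurs. That is my first condition. Then I need to check whether an A occurs after the D, and that A must also be at Location 2. Once this condition and the D condition have both been met, I want to increment ID by 1 on the next row. On that next row I also want to "reset" the conditions that were met, then again check row by row whether a D occurs, and at the first instance of A at Location 2 after that D, increment ID by 1 on the following row. This repeats through the end of the dataset.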

df <- data.frame(
  Location = c(2, 3, 3, 2, 1, 2, 2, 2, 1, 3, 3, 1, 2, 3, 2, 2, 1, 2, 3, 2, 1),
  Activity = c("A", "B", "C", "D", "D", "B", "A", "A", "B", "A", "C", "D", "A", "B", "B", "D", "A", "D", "D", "A", "C"),
  ID = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4)
)

# Print the dataframe to view its structure
print(df)

   Location Activity ID
1         2        A  1
2         3        B  1
3         3        C  1
4         2        D  1
5         1        D  1
6         2        B  1
7         2        A  1
8         2        A  2
9         1        B  2
10        3        A  2
11        3        C  2
12        1        D  2
13        2        A  2
14        3        B  3
15        2        B  3
16        2        D  3
17        1        A  3
18        2        D  3
19        3        D  3
20        2        A  3
21        1        C  4
...

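I have tried several iterations of some conditional logic, but it keeps failing. My best attempt is below, but it does not produce the ID column I expect.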

# Function to increment ID based on conditions
increment_id_based_on_conditions <- function(df) {
  df$ID[1] <- 1  # Initialize the first ID
  
  # Initialize control variables
  waiting_for_a <- FALSE
  last_id <- 1
  
  for (i in 1:nrow(df)) {
    if (waiting_for_a && df$Activity[i] == "A" && df$Location[i] == 2) {
      last_id <- last_id + 1  # Increment ID after conditions are met
      waiting_for_a <- FALSE  # Reset condition
    } else if (df$Activity[i] == "D") {
      waiting_for_a <- TRUE  # Set condition to start waiting for "A" at Location 2
    }
    
    df$ID[i] <- last_id  # Update ID column
  }
  
  df$ID <- c(df$ID[-1], NA)  # Shift ID down by one row and make last ID NA
  return(df)
}

# Apply the function to  dataset
df_with_ids <- increment_id_based_on_conditions(df)

# View the updated dataset
print(df_with_ids)

   Location Activity ID
1         2        A  1
2         3        B  1
3         3        C  1
4         2        D  1
5         1        D  1
6         2        B  2
7         2        A  2
8         2        A  2
9         1        B  2
10        3        A  2
11        3        C  2
12        1        D  3
13        2        A  3
14        3        B  3
15        2        B  3
16        2        D  3
17        1        A  3
18        2        D  3
19        3        D  4
20        2        A  4
21        1        C NA

Answer 1

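This solution creates groups based on "D" and, for each group, finds the position of the first "2A" (Activity A at Location 2). With that information, a unique ID can be built. See: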

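library(dplyr)

# Assumption: start from the question's df, keeping only the two input
# columns under the lowercase names this answer uses.
df <- select(df, location = Location, activity = Activity)

# Temporary row index, used to remember where each trigger row is
df <- mutate(df, id = row_number())

# Row indices of the first "A at location 2" within each D group
aux <- df %>%
  mutate(d_group = cumsum(if_else(activity == "D", 1, 0))) %>%
  distinct(d_group, location, activity, .keep_all = TRUE) %>%
  filter(location == 2, activity == "A", d_group > 0) %>%
  pull(id)

# Start a new ID on the row after each trigger row
df <- mutate(df, id = cumsum(if_else(dplyr::lag(id) %in% aux, 1, 0)) + 1)

rm(aux)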

# ---------
> df
   location activity id
1         2        A  1
2         3        B  1
3         3        C  1
4         2        D  1
5         1        D  1
6         2        B  1
7         2        A  1
8         2        A  2
9         1        B  2
10        3        A  2
11        3        C  2
12        1        D  2
13        2        A  2
14        3        B  3
15        2        B  3
16        2        D  3
17        1        A  3
18        2        D  3
19        3        D  3
20        2        A  3
21        1        C  4
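
For comparison, the rule described in the question can also be written as a single pass over the rows in base R. This is only a minimal sketch, not part of the answer above: the function name `assign_ids` and the flags `seen_d` and `bump_next` are hypothetical names chosen for illustration, and it assumes the increment should land on the row immediately after the first "A at Location 2" that follows a "D".

```r
# Hypothetical helper (illustrative only): walk the rows once, remembering
# whether a "D" has been seen since the last reset.
assign_ids <- function(location, activity) {
  n <- length(activity)
  id <- integer(n)
  current_id <- 1L
  seen_d <- FALSE     # has a "D" occurred since the last reset?
  bump_next <- FALSE  # did the previous row complete the D -> "A at 2" pattern?

  for (i in seq_len(n)) {
    if (bump_next) {
      current_id <- current_id + 1L  # increment lands on the row after the match
      bump_next <- FALSE
    }
    id[i] <- current_id

    if (activity[i] == "D") {
      seen_d <- TRUE
    } else if (seen_d && activity[i] == "A" && location[i] == 2) {
      bump_next <- TRUE
      seen_d <- FALSE  # reset: a new "D" is required before the next increment
    }
  }
  id
}

# Toy data and expected IDs copied from the question
location <- c(2, 3, 3, 2, 1, 2, 2, 2, 1, 3, 3, 1, 2, 3, 2, 2, 1, 2, 3, 2, 1)
activity <- c("A", "B", "C", "D", "D", "B", "A", "A", "B", "A", "C",
              "D", "A", "B", "B", "D", "A", "D", "D", "A", "C")
expected <- c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4)

ids <- assign_ids(location, activity)
print(all(ids == expected))
```

On this toy data the loop in the question appears to differ from the expected result mainly in its final shift: `c(df$ID[-1], NA)` moves every ID up one row, whereas moving them down one row, for example with `c(1, head(df$ID, -1))`, would put each increment on the row after the match.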
