R - Create a column indicating whether another column has the same value

Question

I have a dataset that looks like this:

df
# A tibble: 21 × 2
   animals_id clus_ID
   <chr>        <int>
 1 L085           246
 2 L085           246
 3 L085           246
 4 L084           247
 5 L084           247
 6 L084           247
 7 L085           249
 8 L084           249
 9 L084           249
10 L087           249

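I want to create another column, "type", that tells me whether the animals_id values within each clus_ID differ (that is, whether the cluster contains only one animal or several). It should look like this: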

   animals_id clus_ID   type
 1 L085           246   A
 2 L085           246   A
 3 L085           246   A
 4 L084           247   A
 5 L084           247   A
 6 L084           247   A
 7 L085           249   B
 8 L084           249   B
 9 L084           249   B
10 L087           249   B

Based on this question, I wrote the following code:

 df %>% group_by(clus_ID) %>% mutate(test = ifelse(length(unique(df[,"animals_id"]))==1, "A", "B"))

and

 df %>% group_by(clus_ID) %>% mutate(type = ifelse(n_distinct(animals_id) == 1, "A", "B"))

But neither of these works: the result is either all "A" or all "B". Any ideas?

Dataset for reproduction:

> dput(df)
structure(list(animals_id = c("L085", "L085", "L085", "L084", 
"L084", "L084", "L085", "L084", "L084", "L087", "L084", "L084", 
"L084", "L084", "L084", "L084", "L084", "L084", "L084", "L084", 
"L084"), clus_ID = c(246L, 246L, 246L, 247L, 247L, 247L, 249L, 
249L, 249L, 249L, 249L, 249L, 249L, 249L, 249L, 249L, 249L, 249L, 
249L, 249L, 249L)), class = "data.frame", row.names = c(366428L, 
366429L, 366430L, 349169L, 349170L, 349171L, 366435L, 349185L, 
349186L, 378191L, 349343L, 349345L, 349346L, 349347L, 349477L, 
349478L, 349479L, 349480L, 349706L, 349869L, 350121L))

Answer 1

You can express the "A" clusters as those whose minimum and maximum animals_id are the same value:

library(dplyr)

df %>%
  group_by(clus_ID) %>%
  # min equals max only when every animals_id in the cluster is identical
  mutate(test = ifelse(min(animals_id) == max(animals_id), "A", "B"))
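As a quick check of the min/max idea, here is a minimal sketch that runs it on the first ten rows shown in the question. The name df10 is made up for this illustration, and the subset is only meant to mirror the expected output table from the question.

```r
library(dplyr)

# First ten rows of the question's data (illustrative subset, hypothetical name)
df10 <- data.frame(
  animals_id = c("L085", "L085", "L085", "L084", "L084", "L084",
                 "L085", "L084", "L084", "L087"),
  clus_ID = c(246L, 246L, 246L, 247L, 247L, 247L, 249L, 249L, 249L, 249L)
)

res <- df10 %>%
  group_by(clus_ID) %>%
  # Character vectors are compared in sort order, so min == max
  # holds only when the cluster contains a single distinct animal
  mutate(test = ifelse(min(animals_id) == max(animals_id), "A", "B")) %>%
  ungroup()

res$test
```

Clusters 246 and 247 each contain one animal and should be labelled "A", while cluster 249 mixes several animals and should be labelled "B".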

Answer 2

I think the way to do this is with n_distinct():

df |>
    mutate(
        type = if_else(
            n_distinct(animals_id) == 1,
            "A",
            "B"
        ),
        .by = clus_ID
    )
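The sketch below is one way to see why the question's first attempt can give all "A" or all "B", and how the same n_distinct() logic looks with group_by() on dplyr versions older than 1.1.0, where .by is not available. The name df10 is made up for this illustration and holds only the first ten rows from the question.

```r
library(dplyr)

# First ten rows of the question's data (illustrative subset, hypothetical name)
df10 <- data.frame(
  animals_id = c("L085", "L085", "L085", "L084", "L084", "L084",
                 "L085", "L084", "L084", "L087"),
  clus_ID = c(246L, 246L, 246L, 247L, 247L, 247L, 249L, 249L, 249L, 249L)
)

# Same logic as .by = clus_ID, written with group_by()
res <- df10 %>%
  group_by(clus_ID) %>%
  mutate(type = if_else(n_distinct(animals_id) == 1, "A", "B")) %>%
  ungroup()

# The first attempt indexes the whole data frame instead of the current group:
# on a data.frame this counts distinct animals overall, so every row gets "B"
n_df <- length(unique(df10[, "animals_id"]))

# on a tibble the subset stays a one-column tibble, and length() counts
# columns, so the comparison with 1 is always TRUE and every row gets "A"
n_tbl <- length(unique(tibble::as_tibble(df10)[, "animals_id"]))
```

If the n_distinct() version still returns a single label for every row, one possible cause (an assumption, since the question does not confirm it) is that another package such as plyr was loaded after dplyr and masks mutate(), which would ignore the grouping. Calling dplyr::mutate() explicitly would rule that out.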
