I have the following dataset:
library(tibble)
df <- tribble(
~opp_id, ~month, ~count,
"304956938","Oct 2023","2",
"304956938","Oct 2023","2",
"305075384","Nov 2023","1",
"304910225","Dec 2023","2",
"305101457","Dec 2023","2",
"305005905","Feb 2024","1",
"305089124","Mar 2024","3",
"304955132","Mar 2024","3",
"304955132","Mar 2024","3",
"305005359","Jun 2024","2",
"304904187","Jun 2024","2",
"304973572","Aug 2024","1",
"304984865","Sep 2024","1",
)
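I would like to create, for each month, evenly spaced numbers between 1 and 100.
Basically, different months have different numbers of opportunities. For example, "Oct 2023" has 2 opportunities and "Mar 2024" has 3. For Oct 2023, I want to divide 100 by 3 (the number of opportunities + 1) and place the two opportunities at 33.3 and 66.6. For Mar 2024, the three opportunities would be placed at 25, 50, and 75 (the order does not matter). The ideal output looks like this (for the two months I just mentioned):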
~opp_id, ~month, ~count, ~position
"304956938","Oct 2023","2",33.3,
"304956938","Oct 2023","2",66.6,
"305089124","Mar 2024","3",25,
"304955132","Mar 2024","3",50,
"304955132","Mar 2024","3",75
I can generate random numbers within each group (month), but I cannot figure out a way to get the desired result described above.
With dplyr (1.1.0 or later for the .by argument), you can use row_number() / (n() + 1) within each month:
library(dplyr)
df %>%
mutate(position = row_number() / (n() + 1) * 100, .by = month)
# # A tibble: 13 × 4
# opp_id month count position
# <chr> <chr> <chr> <dbl>
# 1 304956938 Oct 2023 2 33.3
# 2 304956938 Oct 2023 2 66.7
# 3 305075384 Nov 2023 1 50
# 4 304910225 Dec 2023 2 33.3
# 5 305101457 Dec 2023 2 66.7
# 6 305005905 Feb 2024 1 50
# 7 305089124 Mar 2024 3 25
# 8 304955132 Mar 2024 3 50
# 9 304955132 Mar 2024 3 75
# 10 305005359 Jun 2024 2 33.3
# 11 304904187 Jun 2024 2 66.7
# 12 304973572 Aug 2024 1 50
# 13 304984865 Sep 2024 1 50
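If dplyr is not available, the same k / (n + 1) * 100 idea can be sketched in base R with ave(). This is only an illustration, not part of the answer above: df_small is a hypothetical two-month subset of the question's data, and seq_along(i) / (length(i) + 1) stands in for row_number() / (n() + 1).

```r
# Hypothetical subset of the question's data (two months only).
df_small <- data.frame(
  opp_id = c("304956938", "304956938", "305089124", "304955132", "304955132"),
  month  = c("Oct 2023", "Oct 2023", "Mar 2024", "Mar 2024", "Mar 2024")
)

# ave() applies FUN to the values of its first argument within each month
# and writes the results back in the original row order.
# For a group of size n, the k-th row gets k / (n + 1) * 100.
df_small$position <- ave(
  as.numeric(seq_len(nrow(df_small))),
  df_small$month,
  FUN = function(i) seq_along(i) / (length(i) + 1) * 100
)

print(df_small)
```

For a month with two rows this gives 100/3 and 200/3 (printed as 33.3 and 66.7 after rounding), and for a month with three rows it gives 25, 50, and 75.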