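I am trying to pivot a dataset longer, and after that operation I need to add a calculation. The challenge is that the calculation may be specific to each row.
Given this dataset: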
df <- data.frame(
  person = c("Alice", "Bob", "Charlie", "Jen", "Zar"),
  filed_taxes = c(1, 0, 1, 0, 0),
  required_to_file_taxes = c(1, 1, 1, 0, 0),
  bought_items = c(1, 1, 1, 0, 1),
  required_to_buy_items = c(1, 1, 1, 1, 1),
  took_vacation = c(1, 0, 0, 1, 1),
  required_to_take_vacation = c(1, 1, 1, 1, 1)
)
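The columns starting with "required" are used as the denominator when computing the percentage completed for the corresponding category.
So the output would look like this: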
Measure Completed % Completed
filed_taxes 2 66%
bought_items 4 80%
took_vacation 3 60%
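As a sketch of the intended arithmetic for a single measure, assuming the percentage is the sum of the measure column divided by the sum of its matching "required" column (this agrees with the 2 and 66% shown for filed_taxes). The names `completed`, `required`, and `pct_completed` are illustrative only:

```r
# Values copied from the two filed_taxes columns of the example data
filed_taxes <- c(1, 0, 1, 0, 0)
required_to_file_taxes <- c(1, 1, 1, 0, 0)

completed <- sum(filed_taxes)            # people who filed
required <- sum(required_to_file_taxes)  # people who were required to file

# Share of required filers who actually filed
pct_completed <- completed / required
```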
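I tried pivot_longer first, then pivot_wider to compute the percentages, since they are essentially row-wise calculations, and then pivot_longer again to get the final output, but this did not work.
Any ideas or suggestions?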
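library(tidyverse)

df |>
  # Stack every column except person into name/value pairs
  pivot_longer(-person) |>
  # Per column: count of 1s, and the mean over all persons
  summarize(Completed = sum(value),
            Pct_Completed = mean(value), .by = name)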
Result
# A tibble: 6 × 3
name Completed Pct_Completed
<chr> <dbl> <dbl>
1 filed_taxes 2 0.4
2 required_to_file_taxes 3 0.6
3 bought_items 4 0.8
4 required_to_buy_items 5 1
5 took_vacation 3 0.6
6 required_to_take_vacation 5 1
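The result above reports each column's mean over all five people, so filed_taxes comes out as 0.4 rather than the 66% in the question. One possible way to divide by the "required" totals instead is sketched below. It assumes the measure columns and the "required" columns appear in the same order, so they can be paired by position (the names themselves do not line up, e.g. filed_taxes vs required_to_file_taxes). `required_cols`, `measure_cols`, `lookup`, `sums`, and `result` are hypothetical helper names, not part of the original data.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  person = c("Alice", "Bob", "Charlie", "Jen", "Zar"),
  filed_taxes = c(1, 0, 1, 0, 0),
  required_to_file_taxes = c(1, 1, 1, 0, 0),
  bought_items = c(1, 1, 1, 0, 1),
  required_to_buy_items = c(1, 1, 1, 1, 1),
  took_vacation = c(1, 0, 0, 1, 1),
  required_to_take_vacation = c(1, 1, 1, 1, 1)
)

# Assumption: measure and required columns are listed in the same order
required_cols <- names(df)[startsWith(names(df), "required_")]
measure_cols <- setdiff(names(df), c("person", required_cols))

# Hypothetical lookup: required column name -> measure name, paired by position
lookup <- setNames(measure_cols, required_cols)

# Column totals in long form, as in the answer above
sums <- df |>
  pivot_longer(-person) |>
  summarize(total = sum(value), .by = name)

# Totals of the measure columns
completed <- sums |>
  filter(name %in% measure_cols) |>
  rename(Measure = name, Completed = total)

# Totals of the required columns, relabelled with their measure name
required <- sums |>
  filter(name %in% required_cols) |>
  mutate(Measure = unname(lookup[name])) |>
  select(Measure, Required = total)

# Join the two and divide
result <- completed |>
  left_join(required, by = "Measure") |>
  mutate(Pct_Completed = Completed / Required)
```

With the example data, `result` has one row per measure with ratios 2/3, 4/5, and 3/5, in line with the table in the question. If the column order cannot be relied on, an explicit hand-written lookup would be safer than positional pairing.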