I have frequency tables for females and males.
t1 <- table(PainF$task_duration)
t2 <- table(PainM$task_duration)
Females:
30 40 45 60 65 70 75 78 80 90 95 100 101 120 144 150 180 185 240
3 3 2 5 1 2 1 1 1 5 1 1 1 3 1 1 1 1 2
Males:
2 10 15 20 30 38 40 45 50 55 60 70 72 73 75 80 90 95 100 105 110 120 130
2 2 1 2 3 1 4 4 3 2 11 1 1 1 1 2 10 1 1 1 1 11 2
150 180 200 240 300 500
2 5 1 3 3 1
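These are the tables: the top row is the task duration (in minutes) and the bottom row is the number of people (the frequency). How can I put this data in a scatter plot, and is that even possible? I want to compare frequency against duration for females and males.
I tried using ggplot.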
ggplot() +
  geom_point(data = t1, aes(x = x, y = "column2"), color = "blue", size = 3) +
  geom_point(data = t2, aes(x = x, y = "column2"), color = "red", size = 3) +
  labs(x = "X", y = "Y", title = "Female and Male task duration") +
  theme_minimal()
But I keep getting the following error message.
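Error in `fortify()`:
! `data` must be a <data.frame>, or an object coercible by `fortify()`, or a valid <data.frame>-like object coercible by `as.data.frame()`.
Caused by error in `.prevalidate_data_frame_like_object()`:
! `dim(data)` must return an <integer> of length 2.
Run `rlang::last_trace()` to see where the error occurred.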
So my question is: can I make a scatter plot from a frequency table, and if so, how?
> dput(t1)
structure(c(`30` = 3L, `40` = 3L, `45` = 2L, `60` = 5L, `65` = 1L,
`70` = 2L, `75` = 1L, `78` = 1L, `80` = 1L, `90` = 5L, `95` = 1L,
`100` = 1L, `101` = 1L, `120` = 3L, `144` = 1L, `150` = 1L, `180` = 1L,
`185` = 1L, `240` = 2L), dim = 19L, dimnames = list(c("30", "40",
"45", "60", "65", "70", "75", "78", "80", "90", "95", "100",
"101", "120", "144", "150", "180", "185", "240")), class = "table")
> dput(t2)
structure(c(`2` = 2L, `10` = 2L, `15` = 1L, `20` = 2L, `30` = 3L,
`38` = 1L, `40` = 4L, `45` = 4L, `50` = 3L, `55` = 2L, `60` = 11L,
`70` = 1L, `72` = 1L, `73` = 1L, `75` = 1L, `80` = 2L, `90` = 10L,
`95` = 1L, `100` = 1L, `105` = 1L, `110` = 1L, `120` = 11L, `130` = 2L,
`150` = 2L, `180` = 5L, `200` = 1L, `240` = 3L, `300` = 3L, `500` = 1L
), dim = 29L, dimnames = structure(list(c("2", "10", "15", "20",
"30", "38", "40", "45", "50", "55", "60", "70", "72", "73", "75",
"80", "90", "95", "100", "105", "110", "120", "130", "150", "180",
"200", "240", "300", "500")), names = ""), class = "table")
It would probably be easier to create the summary data with dplyr.
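As a rough sketch of that idea: assuming the raw data frames `PainF` and `PainM` each hold one row per person with a `task_duration` column, the counts could be built directly without going through `table()`. The two small data frames below are made-up stand-ins, not the real data.

```r
library(dplyr)

# Hypothetical stand-ins for the real PainF and PainM data frames
PainF <- data.frame(task_duration = c(30, 30, 40, 60, 60, 60))
PainM <- data.frame(task_duration = c(10, 60, 60, 120))

# Stack both data frames, labelling each row with its source,
# then count how many people share each duration within each gender
counts <- bind_rows(F = PainF, M = PainM, .id = "gender") %>%
  count(gender, duration = task_duration, name = "frequency")

counts
```

Because `task_duration` stays numeric throughout, no conversion from table labels back to numbers is needed with this approach.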
However, starting from the tables you already have, I suggest combining them into a single data frame like this:
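library(dplyr)
library(ggplot2)
# Convert each table to a data frame, tag it with gender, and stack them
counts <- rbind(
  data.frame(t1) %>% mutate(gender = "F"),
  data.frame(t2) %>% mutate(gender = "M")
) %>%
  rename(
    duration = Var1,
    frequency = Freq
  ) %>%
  mutate(
    # Var1 is a factor: go through as.character() to get the actual
    # durations rather than the factor's internal level codes
    duration = as.integer(as.character(duration))
  )
The top of the resulting data looks like this:
  duration frequency gender
1       30         3      F
2       40         3      F
3       45         2      F
4       60         5      F
5       65         1      F
6       70         2      F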
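One detail worth checking in this kind of conversion: `data.frame()` applied to a table stores the labels in `Var1` as a factor, and calling `as.integer()` directly on a factor returns its level codes (1, 2, 3, ...) rather than the values the labels represent. A minimal base R illustration, using a small made-up table rather than the real data:

```r
# Hypothetical toy data: four durations, three distinct values
t_demo <- table(c(30, 30, 40, 60))
df <- data.frame(t_demo)

# Direct conversion gives the factor's internal level codes
codes <- as.integer(df$Var1)
codes      # 1 2 3

# Converting via character recovers the actual durations
durations <- as.integer(as.character(df$Var1))
durations  # 30 40 60
```

If the x-axis of the plot shows evenly spaced values starting at 1, this is the likely cause.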
Now plot it. Using a single data frame makes things much simpler here.
ggplot(counts) +
  geom_point(aes(x = duration, y = frequency, colour = gender)) +
  labs(x = "Task duration (minutes)", y = "Number of people", title = "Female and Male task duration") +
  theme_minimal() +
  # Named values tie each colour to a gender explicitly
  scale_colour_manual(values = c(F = "blue", M = "red"))