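Background: The point-biserial correlation measures the relationship between a binary variable x and a continuous variable y.
Method: I use the cor.test() function to compute R and the p-value: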
# the two vectors
x <- mtcars$am
y <- mtcars$mpg
# calculate the point-biserial correlation (Pearson's r with a 0/1 variable)
cor_result <- cor.test(x, y)
cor_result$p.value
cor_result$estimate
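As a sanity check, the point-biserial coefficient can be computed by hand and compared with what cor() returns. The sketch below assumes the textbook formula r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2), where s_n is the standard deviation of y computed with divisor n. The variable names (m1, m0, s_n, r_pb) are chosen here for illustration only. It also checks that the p-value matches a pooled-variance two-sample t-test, which is the same test expressed differently.

```r
# the two vectors, as in the question
x <- mtcars$am
y <- mtcars$mpg

n  <- length(y)
n1 <- sum(x == 1)          # group size for am == 1
n0 <- sum(x == 0)          # group size for am == 0
m1 <- mean(y[x == 1])      # mean mpg for am == 1
m0 <- mean(y[x == 0])      # mean mpg for am == 0

# standard deviation of y with divisor n (not n - 1)
s_n <- sqrt(mean((y - mean(y))^2))

# point-biserial correlation from the textbook formula
r_pb <- (m1 - m0) / s_n * sqrt(n1 * n0 / n^2)

# should agree with Pearson's r up to floating-point error
same_r <- isTRUE(all.equal(r_pb, cor(x, y)))

# the p-value should match a two-sample t-test with equal variances
p_cor   <- cor.test(x, y)$p.value
p_ttest <- t.test(y ~ x, var.equal = TRUE)$p.value
same_p  <- isTRUE(all.equal(p_cor, p_ttest))

print(c(same_r = same_r, same_p = same_p))
```

If both checks come back TRUE, then R and the p-value above describe the difference in mean mpg between the two am groups.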
I plot it with ggplot2 as follows; the number inside each point is the cylinder count (cyl):
library(see) # theme_modern()
library(dplyr)
library(ggplot2)
# plot
mtcars %>%
  mutate(am = factor(am)) %>%
  mutate(id = row_number()) %>%
  ggplot(aes(x = id, y = mpg, color = am, label = cyl)) +
  geom_point(size = 8, alpha = 0.5) +
  geom_text(color = "black", hjust = 0.5, vjust = 0.5) +
  scale_color_manual(values = c("steelblue", "purple"), labels = c("No", "Yes")) +
  scale_x_continuous(breaks = 1:32, labels = 1:32) +
  scale_y_continuous(breaks = scales::pretty_breaks()) +
  geom_text(aes(x = 10, y = 30,
                label = ifelse(am == 0, "R = 0.5998324, p = 0.0002850207", "")),
            color = "black",
            size = 4) +
  facet_wrap(. ~ am,
             nrow = 1, strip.position = "bottom") +
  labs(y = "mpg",
       color = "Automatic vs Manual transmission") +
  theme_modern() +
  theme(
    aspect.ratio = 2,
    strip.background = element_blank(),
    strip.placement = "outside",
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    text = element_text(size = 16)
  )
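The annotation in the plot code is typed in by hand, and because it is mapped inside aes() it is drawn once for every row of the data. A minimal sketch of one alternative, assuming a one-row helper data frame (called label_df here, a name made up for illustration), builds the label from the cor.test() result and draws it once in the am == 0 panel. The rounding in sprintf() is an arbitrary choice, and the styling from the original plot is left out to keep the example short.

```r
library(ggplot2)

cor_result <- cor.test(mtcars$am, mtcars$mpg)

# one-row data frame: the label is drawn once, only in the am == 0 panel
label_df <- data.frame(
  am    = factor(0, levels = c(0, 1)),
  id    = 10,
  mpg   = 30,
  label = sprintf("R = %.2f, p = %.4f",
                  cor_result$estimate, cor_result$p.value)
)

# same derived columns as in the question, built with base R
plot_df <- transform(mtcars, am = factor(am), id = seq_len(nrow(mtcars)))

p <- ggplot(plot_df, aes(x = id, y = mpg)) +
  geom_point(aes(color = am), size = 8, alpha = 0.5) +
  geom_text(aes(label = cyl), color = "black") +
  # the annotation layer uses its own data, so it is not repeated per row
  geom_text(data = label_df, aes(label = label), color = "black", size = 4) +
  facet_wrap(~ am, nrow = 1, strip.position = "bottom")

p
```

The scales, theme_modern(), and theme() settings from the original code can be added to p unchanged.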
My question: Do you think this is an appropriate figure for showing the correlation between am and mpg?
Can you give me a hint for improving this plot?