I want to create a boxplot of numeric data, excluding the cases marked "0" in another column


I have already made a boxplot for a single factor, as shown below.

ggplot(data = dataframe2, aes(x=factor(0), y = RPSdata$Survival.One.Year)) + geom_boxplot(...)

The data frame is:

dataframe2 <- data.frame(RPSdata$Survival.One.Year)

I want to make the same plot, but including only the cases marked "1" in the RPSdata$Survival.Complete.Sense column.

Thank you very much for your help! I am new to R, so any help is appreciated.

Sample data:

> dput(head(RPSdata, 5))
structure(list(ID.Rank = 1:5, ID.Participant = c("8571762481", 
"7351340719", "7396795819", "3790978753", "6450996320"), Population.Risk = structure(c(1L, 
2L, 3L, 2L, 2L), .Label = c("1", "2", "3", "4", "5", "6"), class = "factor"), 
    Personal.Risk = c(50, 60, 30, 40, 10), Comparative.Risk.Age = structure(c(2L, 
    NA, 3L, 4L, 3L), .Label = c("1", "2", "3", "4", "5"), class = "factor"), 
    Comparative.Risk.Current = structure(c(NA, 3L, 3L, NA, NA
    ), .Label = c("1", "2", "3", "4", "5"), class = "factor"), 
    Comparative.Risk.Ex = structure(c(2L, 3L, NA, NA, 3L), .Label = c("1", 
    "2", "3", "4", "5"), class = "factor"), Score.Exposure = structure(c(1L, 
    1L, 1L, 2L, 1L), .Label = c("1", "2", "4", "5"), class = "factor"), 
    RF.Age = structure(c(1L, NA, 1L, 1L, 2L), .Label = c("0", 
    "1", "2"), class = "factor"), RF.Pollution = structure(c(1L, 
    NA, 3L, 2L, 2L), .Label = c("0", "1", "2"), class = "factor"), 
    RF.Asbestos = structure(c(1L, NA, 1L, 1L, 1L), .Label = c("1", 
    "2"), class = "factor"), RF.Asthma = structure(c(2L, NA, 
    3L, 2L, 1L), .Label = c("0", "1", "2"), class = "factor"), 
    RF.BMI = structure(c(2L, NA, 1L, 2L, 3L), .Label = c("0", 
    "1", "2"), class = "factor"), RF.Gene = structure(c(2L, NA, 
    3L, 3L, 3L), .Label = c("0", "1", "2"), class = "factor"), 
    RF.COPD = structure(c(2L, NA, 2L, 2L, 2L), .Label = c("0", 
    "1", "2"), class = "factor"), RF.History = structure(c(2L, 
    NA, 1L, 1L, 2L), .Label = c("0", "1", "2"), class = "factor"), 
    RF.Diet = structure(c(3L, NA, 1L, 2L, 3L), .Label = c("0", 
    "1", "2"), class = "factor"), RF.Radon = structure(c(2L, 
    NA, 1L, 3L, 3L), .Label = c("0", "1", "2"), class = "factor"), 
    RF.Smoking = structure(c(2L, NA, 2L, 2L, 2L), .Label = c("0", 
    "1", "2"), class = "factor"), RF.Second.Smoke = structure(c(3L, 
    NA, 1L, 3L, 2L), .Label = c("0", "1", "2"), class = "factor"), 
    Survival.One.Year = c(80, 20, NA, NA, 90), Survival.Five.Year = c(60, 
    50, NA, 30, 50), Survival.Ten.Year = c(40, 20, NA, NA, 2), 
    Worry.Frequency = structure(c(1L, 3L, 1L, 1L, 1L), .Label = c("1", 
    "2", "3", "4"), class = "factor"), Worry.Intensity = structure(c(1L, 
    2L, 2L, 2L, 1L), .Label = c("1", "2", "3", "4"), class = "factor"), 
    Mental.Health.One = structure(c(1L, 3L, 2L, 1L, 1L), .Label = c("0", 
    "1", "2", "3"), class = "factor"), Mental.Health.Two = structure(c(1L, 
    2L, 2L, 1L, 1L), .Label = c("0", "1", "2", "3"), class = "factor"), 
    Mental.Health.Three = structure(c(1L, 1L, 1L, 1L, 1L), .Label = c("0", 
    "1", "2", "3"), class = "factor"), Mental.Health.Four = structure(c(2L, 
    2L, 1L, 1L, 1L), .Label = c("0", "1", "2", "3"), class = "factor"), 
    PHQ.4 = structure(c(2L, 5L, 3L, 1L, 1L), .Label = c("0", 
    "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", 
    "12"), class = "factor"), PHQ4.Anx = structure(c(1L, 4L, 
    3L, 1L, 1L), .Label = c("0", "1", "2", "3", "4", "5", "6"
    ), class = "factor"), PHQ4.Dep = structure(c(2L, 2L, 1L, 
    1L, 1L), .Label = c("0", "1", "2", "3", "4", "5", "6"), class = "factor"), 
    PHQ4.Bin = structure(c(1L, 2L, 1L, 1L, 1L), .Label = c("0", 
    "1", "2", "3"), class = "factor"), Dep.Bin = structure(c(1L, 
    1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), 
    Anx.Bin = structure(c(1L, 2L, 1L, 1L, 1L), .Label = c("0", 
    "1"), class = "factor"), Survival.Compelete.Sense = structure(c(2L, 
    1L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), 
    Survival.Semi.Sense = c(1L, 0L, 0L, 1L, 1L)), row.names = c(NA, 
5L), class = "data.frame")
> 
r dataframe ggplot2 boxplot
1 Answer

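Given the problem description, there is no need for a second data.frame; RPSdata is all that is needed. The solution is to subset the data on the condition that the column equals 1.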

library(ggplot2)

ggplot(data = subset(RPSdata, Survival.Complete.Sense == 1),
       mapping = aes(x = Survival.Complete.Sense, y = Survival.One.Year)) +
  geom_boxplot()
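
One detail worth checking before plotting: the dput output in the question spells the column Survival.Compelete.Sense, so the filter has to use whatever name names(RPSdata) actually reports. The sketch below rebuilds only the two relevant columns from the five sample rows (the name toy is made up for illustration) and shows which rows survive the subset. Comparing a factor with 1 compares against its labels, so the condition keeps the rows labelled "1".

```r
# Two columns copied from the five sample rows in the question;
# the object name `toy` is hypothetical.
toy <- data.frame(
  Survival.One.Year = c(80, 20, NA, NA, 90),
  Survival.Compelete.Sense = factor(c("1", "0", "0", "0", "1"),
                                    levels = c("0", "1"))
)

# The column name as it is spelled in the sample data
names(toy)

# Keep only the rows whose factor label is "1"
kept <- subset(toy, Survival.Compelete.Sense == 1)

nrow(kept)               # number of rows that pass the filter
kept$Survival.One.Year   # the values the boxplot would summarise
```

If the real data set uses the spelling Survival.Complete.Sense instead, the same code applies with that name.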

Another option, with package dplyr, is to filter the data and pipe the result to ggplot. I also coerce the x-axis column to factor.

library(dplyr)
library(ggplot2)

RPSdata %>%
  filter(Survival.Complete.Sense == 1) %>%
  mutate(Survival.Complete.Sense = factor(Survival.Complete.Sense)) %>%
  ggplot(aes(Survival.Complete.Sense, Survival.One.Year)) +
  geom_boxplot()
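
To get the single unlabelled box from the question rather than an axis labelled "1", one possible variation is to map x to a constant, as the question's own code does with factor(0). This is only a sketch under assumptions: the data frame toy is rebuilt from the five sample rows, the column name follows the spelling in the dput output, and dropping missing survival values before plotting is an optional extra step, not something the original answer does.

```r
library(dplyr)
library(ggplot2)

# Two columns copied from the five sample rows; `toy` is a hypothetical name.
toy <- data.frame(
  Survival.One.Year = c(80, 20, NA, NA, 90),
  Survival.Compelete.Sense = factor(c("1", "0", "0", "0", "1"),
                                    levels = c("0", "1"))
)

# Keep rows labelled "1" and drop rows with a missing y value
plot_data <- toy %>%
  filter(Survival.Compelete.Sense == 1, !is.na(Survival.One.Year))

# A constant x gives one box; hide the meaningless x tick and axis title
p <- ggplot(plot_data, aes(x = factor(0), y = Survival.One.Year)) +
  geom_boxplot() +
  scale_x_discrete(breaks = NULL) +
  xlab(NULL)
```

Calling print(p) draws the plot. With the real RPSdata in place of toy, only the data frame name needs to change.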