Why doesn't geom_boxplot draw a separate boxplot for each factor?


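I'm having trouble drawing a violin plot with overlaid boxplots that show descriptive statistics for my data sample. My data has three variables (channel, Year, durationMins) and 20 observations.

I want to build the violin plot with boxplots in a Shiny app. Here is my code: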

#Simulate Data for Reproducible Code
# Set the number of observations
{n <- 512

# Define channel names
channels <- c("Channel_A", "Channel_B", "Channel_C", "Channel_D")

# Define months and days of the week
months <- c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

# Create a data frame to store the data
df1 <- data.frame()

# Generate data for each channel
for (channel in channels) {
  # Generate data for each year
  for (year in 2017:2024) {
    # Generate data for each month
    for (month in months) {
      # Sample durations for each day of the month
      for (day in 1:30) {  # Assuming 30 days per month
        # Sample duration for the specific channel, year, month, and day
        durations <- round(runif(n = 1, min = 4.59, max = 30.7), 1)
        # Sample a random day of the week
        day_of_week <- sample(days, 1)
        # Append the data to the data frame
        df1 <- rbind(df1, data.frame(channel = channel, Year = year, Month = month, Day = day, publishedDayName = day_of_week, durationMins = durations))
      }
    }
  }
}

# Randomly shuffle the rows of the data frame
df1 <- df1[sample(nrow(df1)), ]}

# View the first few rows of the data frame
head(df1)

# Shiny app
library(shiny)
ui<-
  fluidPage(
    tags$script(HTML('$(document).ready(function() {
        // Get the position of the sidebar
        var sidebarPosition = $(".sidebar").offset().top;

        // Function to fix or unfix the sidebar based on scrolling
        function fixSidebar() {
          var scrollTop = $(window).scrollTop();

          if (scrollTop > sidebarPosition) {
            $(".sidebar").addClass("fixed-sidebar");
          } else {
            $(".sidebar").removeClass("fixed-sidebar");
          }
        }

        // Attach the function to the scroll event
        $(window).scroll(fixSidebar);

        // Call the function once to set the initial state
        fixSidebar();
      });
    ')),
  
    titlePanel("YouTube Data Science Channels Video Duration Analytics"),
    mainPanel(plotOutput('dur_Plot'))
)

server <- function(input, output) {
  
  library(dplyr)
  library(ggplot2)
  library(lubridate)
  
  #Duration plot
  duration_pipeline_f <- reactive({
    df1 %>% 
      group_by(channel, Year) %>%
      summarize(durationMins = mean(durationMins))%>% 
      arrange(Year, .by_group = T) 
  })
  
  output$dur_Plot  <-renderPlot({
    
    ggplot(duration_pipeline_f(), aes(x = Year, y = durationMins, fill = channel)) +
      geom_violin(position = 'dodge', trim = FALSE, linewidth = .5, alpha = .9) +
      geom_boxplot(width = 0.7, position = position_dodge(width = 0.9), fill = 'white', color = 'black') +
      labs(title = 'Video duration distribution', y = 'Duration in mins') +
      scale_fill_brewer(palette = 'Spectral') + 
      theme(text = element_text(size = 12))
    
  })
}
shinyApp(ui, server)

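In the server part of this code I filter the data, which is why I have fewer observations than in the original sample. Maybe this simulation will help you understand my problem.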

This is how the plot described above looks on my Shiny app page.

r ggplot2 shiny boxplot
1 Answer

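Your problem actually has nothing to do with Shiny, and it is easier to solve by focusing on the ggplot2 code.

The issue is simply the grouping. First, because you map channel on fill, your violins are implicitly grouped by channel, whereas for the boxplots you override that grouping by setting fill = "white". Hence, for the boxplots you have to map explicitly on the group aes. Second, because your Year variable is numeric, you get violins (and/or boxplots) that show the distribution over all years. As you map Year on x, you probably want a violin/boxplot per year. In that case you also have to convert Year to a factor or group explicitly by Year, as in the server code below.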

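Note: I use the non-aggregated dataset. IMHO an aggregated dataset makes no sense if you want to show distributions: after summarizing there is only one mean per channel and year, so there is nothing left to draw a distribution from within each group.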
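To make the point about aggregation concrete, here is a minimal sketch on toy data. The toy data frame, its two channels and two years, and the 30 videos per group are assumptions for illustration only, not the asker's real data. It counts the rows per channel and year before and after a summarize() step like the one in the question.

```r
library(dplyr)

# Hypothetical toy data: 2 channels x 2 years x 30 videos each
set.seed(1)
toy <- expand.grid(
  channel = c("Channel_A", "Channel_B"),
  Year = 2023:2024,
  video = 1:30
)
toy$durationMins <- runif(nrow(toy), min = 4.59, max = 30.7)

# Same kind of aggregation as in the question: one mean per channel and year
agg <- toy %>%
  group_by(channel, Year) %>%
  summarize(durationMins = mean(durationMins), .groups = "drop")

# Rows available per channel/year group, before and after aggregating
rows_per_group_raw <- count(toy, channel, Year)$n
rows_per_group_agg <- count(agg, channel, Year)$n
```

Comparing rows_per_group_raw with rows_per_group_agg shows how many values each violin or boxplot would have to work with in either case.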

library(ggplot2)
library(shiny)

server <- function(input, output) {
  duration_pipeline_f <- reactive({
    df1
  })

  output$dur_Plot <- renderPlot({
    ggplot(duration_pipeline_f(), aes(
      x = Year, 
      y = durationMins,
      fill = channel,
      group = interaction(Year, channel)
    )) +
      geom_violin(
        position = position_dodge(width = 0.9), 
        trim = FALSE,
        linewidth = .5, alpha = .9
      ) +
      geom_boxplot(
        aes(group = interaction(Year, channel)),
        width = 0.7,
        position = position_dodge(width = 0.9),
        fill = "white", color = "black"
      ) +
      labs(title = "Video duration distribution", y = "Duration in mins") +
      scale_fill_brewer(palette = "Spectral") +
      theme(text = element_text(size = 12))
  })
}
shinyApp(ui, server)
#> 
#> Listening on http://127.0.0.1:6007
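The other option mentioned above, converting Year to a factor, could look roughly like the following sketch outside of Shiny. The toy data frame and the narrower box width are assumptions for illustration. layer_data() is used only to count how many boxes each plot would draw.

```r
library(ggplot2)

# Hypothetical toy data standing in for df1: 3 channels x 3 years x 50 videos
set.seed(1)
toy <- expand.grid(
  channel = c("Channel_A", "Channel_B", "Channel_C"),
  Year = 2022:2024,
  video = 1:50
)
toy$durationMins <- round(runif(nrow(toy), min = 4.59, max = 30.7), 1)

# Without an explicit group, fill = "white" replaces the fill mapping,
# so the boxplot layer is grouped by the discrete x only
p_no_group <- ggplot(toy, aes(x = factor(Year), y = durationMins, fill = channel)) +
  geom_boxplot(fill = "white", color = "black")

# With Year as a factor on x and an explicit group on the boxplot layer
p_grouped <- ggplot(toy, aes(x = factor(Year), y = durationMins, fill = channel)) +
  geom_violin(position = position_dodge(width = 0.9), trim = FALSE) +
  geom_boxplot(
    aes(group = interaction(Year, channel)),
    width = 0.2,
    position = position_dodge(width = 0.9),
    fill = "white", color = "black"
  ) +
  labs(x = "Year", y = "Duration in mins")

# Number of boxes each plot draws (one row of layer data per box)
n_boxes_no_group <- nrow(layer_data(p_no_group, 1))
n_boxes_grouped <- nrow(layer_data(p_grouped, 2))
```

Printing p_grouped draws the plot. Keeping the same position_dodge(width = 0.9) on both layers is what lines each box up inside its violin.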
