Performing t-tests within dplyr groups using summarise_all [R]

Question

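Say I want to compare the prices of apples and oranges in each country, using two different currencies: USA and BTC.

USA ~ fruit, per country
BTC ~ fruit, per country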

library(tidyverse)

prices <- tibble(
  country = c(rep("USA", 6), rep("Spain", 6), rep("Korea", 6)),
  fruit = rep(c("apples", "apples", "apples", "oranges", "oranges", "oranges"), 3),
  price_USA = rnorm(18),
  price_BTC = rnorm(18)
)

prices %>% 
  group_by(country) %>% 
  summarise(
    pval_USA = t.test(price_USA ~ fruit)$p.value,
    pval_BTC = t.test(price_BTC ~ fruit)$p.value
  )

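Now suppose there are many columns, and I want to use summarise_all rather than naming each column. Is there a way to run a t-test within each group (country) and for each column (price_USA, price_BTC) using dplyr::summarise_all? Everything I have tried so far gives me an error.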

prices %>% 
  group_by(country) %>% 
  summarise_at(
    c("price_USA", "price_BTC"),
    function(x) {t.test(x ~ .$fruit)$p.value}
  )
> Error in model.frame.default(formula = x ~ .$fruit) : 
  variable lengths differ (found for '.$fruit')

Answer 1

You can achieve this by reshaping your data from wide to long format. Here is a solution using dplyr (with pivot_longer from tidyr):

library(tidyverse)

prices <- tibble(
  country = c(rep("USA", 6), rep("Spain", 6), rep("Korea", 6)),
  fruit = rep(c("apples", "apples", "apples", "oranges", "oranges", "oranges"), 3),
  price_USA = rnorm(18),
  price_BTC = rnorm(18)
)

prices %>% 
  pivot_longer(cols = starts_with("price"), names_to = "name",
               values_to = "price", names_prefix = "price_") %>%
  group_by(country, name) %>%
  summarise(pval = t.test(price ~ fruit)$p.value)
#> # A tibble: 6 x 3
#> # Groups:   country [3]
#>   country name   pval
#>   <chr>   <chr> <dbl>
#> 1 Korea   BTC   0.458
#> 2 Korea   USA   0.721
#> 3 Spain   BTC   0.732
#> 4 Spain   USA   0.526
#> 5 USA     BTC   0.916
#> 6 USA     USA   0.679
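
If you would rather keep the data wide, which is closer to what the question asks for, the following is a minimal sketch that assumes dplyr 1.0.0 or later, where across() supersedes summarise_all and summarise_at. The attempt in the question most likely fails because `.` inside the pipe refers to the whole 18-row data frame, while x holds only the 6 values of the current group. Inside across(), the lambda can refer to fruit directly, and it is evaluated per group. The set.seed call and the pval_ prefix in .names are illustrative choices, not part of the original post.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same structure as the example data in the question
prices <- tibble(
  country = rep(c("USA", "Spain", "Korea"), each = 6),
  fruit = rep(rep(c("apples", "oranges"), each = 3), times = 3),
  price_USA = rnorm(18),
  price_BTC = rnorm(18)
)

# One t-test per country and per price column; fruit is looked up
# in the current group, so its length matches .x
pvals <- prices %>%
  group_by(country) %>%
  summarise(
    across(starts_with("price"), ~ t.test(.x ~ fruit)$p.value,
           .names = "pval_{.col}")
  )

pvals
```

This returns one row per country with columns pval_price_USA and pval_price_BTC. The long-format approach above may still be easier to work with when there are many columns.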
