How to sort and select rows by row variance in R

Question  Votes: 0  Answers: 2

I have an R object (it prints as a numeric matrix) that looks like this:

         [,1]      [,2]     [,3]     [,4]     [,5]
[1,] 6.939576 0.9102779 2.513760 3.838500 8.017567
[2,] 4.134372 2.1731401 6.627487 6.202576 9.603031
[3,] 6.303585 6.9664992 1.861797 3.507445 1.822297
[4,] 4.675198 4.2120635 6.429899 8.439339 9.593823
[5,] 6.472145 3.2654931 7.416211 2.056762 1.988843
[6,] 7.329604 3.8279722 5.085237 1.158770 1.278410

I want to select the three rows with the highest variance, which should be:

         [,1]      [,2]     [,3]     [,4]     [,5]
[1,] 6.939576 0.9102779 2.513760 3.838500 8.017567
[2,] 4.134372 2.1731401 6.627487 6.202576 9.603031
[6,] 7.329604 3.8279722 5.085237 1.158770 1.278410

Can anyone help me?

r list rank variance
2 Answers
1
vote

Hi Martin, I'll assume you can create a data.frame from your data:

library(tidyverse)

original_df <- data.table::fread("6.939576 0.9102779 2.513760 3.838500 8.017567
4.134372 2.1731401 6.627487 6.202576 9.603031
6.303585 6.9664992 1.861797 3.507445 1.822297
4.675198 4.2120635 6.429899 8.439339 9.593823
6.472145 3.2654931 7.416211 2.056762 1.988843
7.329604 3.8279722 5.085237 1.158770 1.278410")


original_df %>%
  rowwise() %>%
  mutate(variance = c_across(everything()) %>% var()) %>%
  ungroup() %>%
  slice_max(n = 3, order_by = variance)
#> # A tibble: 3 x 6
#>      V1    V2    V3    V4    V5 variance
#>   <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>
#> 1  6.94 0.910  2.51  3.84  8.02     8.89
#> 2  4.13 2.17   6.63  6.20  9.60     7.81
#> 3  7.33 3.83   5.09  1.16  1.28     6.86

Created on 2020-06-14 by the reprex package (v0.3.0)

If you need very fast vectorized code, the matrixStats package has a vectorized row-variance function, rowVars():

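original_df %>%
  # across(everything()) returns all columns here; in dplyr >= 1.1.0,
  # pick(everything()) is the preferred way to write this
  mutate(variance = across(everything()) %>% as.matrix() %>% matrixStats::rowVars(.)) %>%
  slice_max(n = 3, order_by = variance)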
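
If the data is already a numeric matrix, as in the question, base R alone may be enough. The following is only a minimal sketch, not the method used in this answer: `m`, `row_var` and `top3` are made-up names (the question does not name its object), and the row variance is computed with rowMeans() and rowSums() rather than matrixStats. Sorting the selected indices returns the rows in their original order, which is how the question lists them (rows 1, 2 and 6).

```r
# `m` is a hypothetical name for the matrix shown in the question.
m <- matrix(c(
  6.939576, 0.9102779, 2.513760, 3.838500, 8.017567,
  4.134372, 2.1731401, 6.627487, 6.202576, 9.603031,
  6.303585, 6.9664992, 1.861797, 3.507445, 1.822297,
  4.675198, 4.2120635, 6.429899, 8.439339, 9.593823,
  6.472145, 3.2654931, 7.416211, 2.056762, 1.988843,
  7.329604, 3.8279722, 5.085237, 1.158770, 1.278410
), nrow = 6, byrow = TRUE)

# Vectorized sample variance of each row (same definition as var()).
# rowMeans(m) is recycled down the columns, so row i is centered on its own mean.
row_var <- rowSums((m - rowMeans(m))^2) / (ncol(m) - 1)

# Indices of the three rows with the largest variance.
top3 <- order(row_var, decreasing = TRUE)[1:3]

# sort() restores the original row order; drop = FALSE keeps the result a matrix.
m[sort(top3), , drop = FALSE]
```

Dropping the sort() call gives the rows ordered from highest to lowest variance instead, which matches what slice_max() returns.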

0
votes

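I'm not sure how you define variance; I think it can be operationalized as the difference between each row's maximum and minimum values (the range).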

Data:

set.seed(123)
df <- data.frame(
  v1 = rnorm(10),
  v2 = rnorm(10),
  v3 = rnorm(10),
  v4 = rnorm(10)
)

Solution:

df$variance <- apply(df, 1, function(x) max(x) - min(x))
df[order(df$variance, decreasing = TRUE),]

Result:

            v1         v2         v3          v4  variance
6   1.71506499  1.7869131 -1.6866933  0.68864025 3.4736064
3   1.55870831  0.4007715 -1.0260044  0.89512566 2.5847128
1  -0.56047565  1.2240818 -1.0678237  0.42646422 2.2919055
8  -1.26506123 -1.9666172  0.1533731 -0.06191171 2.1199903
9  -0.68685285  0.7013559 -1.1381369 -0.30596266 1.8394928
10 -0.44566197 -0.4727914  1.2538149 -0.38047100 1.7266063
4   0.07050839  0.1106827 -0.7288912  0.87813349 1.6070247
5   0.12928774 -0.5558411 -0.6250393  0.82158108 1.4466203
2  -0.23017749  0.3598138 -0.2179749 -0.29507148 0.6548853
7   0.46091621  0.4978505  0.8377870  0.55391765 0.3768708

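Alternatively, you can measure the spread of each row with the standard deviation, sd:

# Use only the original columns so an existing variance column is not included
df$variance <- apply(df[c("v1", "v2", "v3", "v4")], 1, sd)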

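Or simply use var:

# Use only the original columns so an existing variance column is not included
df$variance <- apply(df[c("v1", "v2", "v3", "v4")], 1, var)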
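
To compare the three measures side by side, one option is to compute each from the same set of value columns and store them under separate names. This is a rough sketch under assumptions: `value_cols`, `vals` and the `range` and `sd` column names are illustrative and not part of the answer above. Because sd is the square root of var, those two rank the rows identically; the range can rank them differently.

```r
set.seed(123)
df <- data.frame(
  v1 = rnorm(10),
  v2 = rnorm(10),
  v3 = rnorm(10),
  v4 = rnorm(10)
)

# Hypothetical helper objects: score only the original value columns,
# so that one spread measure never feeds into the next.
value_cols <- c("v1", "v2", "v3", "v4")
vals <- as.matrix(df[value_cols])

df$range    <- apply(vals, 1, function(x) max(x) - min(x))
df$sd       <- apply(vals, 1, sd)
df$variance <- apply(vals, 1, var)

# Three rows with the largest variance, highest first.
head(df[order(df$variance, decreasing = TRUE), ], 3)
```

Replacing df$variance with df$range in the order() call selects the top rows by range instead.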