I have an R matrix, shown below.
[,1] [,2] [,3] [,4] [,5]
[1,] 6.939576 0.9102779 2.513760 3.838500 8.017567
[2,] 4.134372 2.1731401 6.627487 6.202576 9.603031
[3,] 6.303585 6.9664992 1.861797 3.507445 1.822297
[4,] 4.675198 4.2120635 6.429899 8.439339 9.593823
[5,] 6.472145 3.2654931 7.416211 2.056762 1.988843
[6,] 7.329604 3.8279722 5.085237 1.158770 1.278410
I want to select the three rows with the highest variance. The result should be:
[,1] [,2] [,3] [,4] [,5]
[1,] 6.939576 0.9102779 2.513760 3.838500 8.017567
[2,] 4.134372 2.1731401 6.627487 6.202576 9.603031
[6,] 7.329604 3.8279722 5.085237 1.158770 1.278410
Can anyone help me?
Hi Martin, I'll assume you can create a data.frame from your data.
library(tidyverse)
original_df <- data.table::fread("6.939576 0.9102779 2.513760 3.838500 8.017567
4.134372 2.1731401 6.627487 6.202576 9.603031
6.303585 6.9664992 1.861797 3.507445 1.822297
4.675198 4.2120635 6.429899 8.439339 9.593823
6.472145 3.2654931 7.416211 2.056762 1.988843
7.329604 3.8279722 5.085237 1.158770 1.278410")
original_df %>%
  rowwise() %>%
  mutate(variance = c_across(everything()) %>% var()) %>%
  ungroup() %>%
  slice_max(n = 3, order_by = variance)
#> # A tibble: 3 x 6
#> V1 V2 V3 V4 V5 variance
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 6.94 0.910 2.51 3.84 8.02 8.89
#> 2 4.13 2.17 6.63 6.20 9.60 7.81
#> 3 7.33 3.83 5.09 1.16 1.28 6.86
Created on 2020-06-14 by the reprex package (v0.3.0)
If you need very fast, vectorized code, the matrixStats package has a vectorized row-variance function.
original_df %>%
  mutate(variance = across(everything()) %>% as.matrix() %>% matrixStats::rowVars(.)) %>%
  slice_max(n = 3, order_by = variance)
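If the data is a plain matrix rather than a data frame, base R alone may be enough. The following is a minimal sketch and not part of the answer above: the names m, row_var, top3 and result are illustrative, and it assumes "highest variance" means the sample variance of each row as computed by var().

```r
# The question's data as a numeric matrix (6 rows, 5 columns)
m <- matrix(c(
  6.939576, 0.9102779, 2.513760, 3.838500, 8.017567,
  4.134372, 2.1731401, 6.627487, 6.202576, 9.603031,
  6.303585, 6.9664992, 1.861797, 3.507445, 1.822297,
  4.675198, 4.2120635, 6.429899, 8.439339, 9.593823,
  6.472145, 3.2654931, 7.416211, 2.056762, 1.988843,
  7.329604, 3.8279722, 5.085237, 1.158770, 1.278410
), nrow = 6, byrow = TRUE)

# Sample variance of each row
row_var <- apply(m, 1, var)

# Indices of the three rows with the largest variance
top3 <- order(row_var, decreasing = TRUE)[1:3]

# sort() keeps the selected rows in their original order
result <- m[sort(top3), ]
result
```

Because the matrix has no row names, the printed result is relabelled from [1,] to [3,]; top3 holds the original row indices if they are needed.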
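I don't know how you define variance; I assume it can be operationalized as the difference between the maximum and minimum values of each row (the range).
Data: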
set.seed(123)
df <- data.frame(
  v1 = rnorm(10),
  v2 = rnorm(10),
  v3 = rnorm(10),
  v4 = rnorm(10)
)
Solution:
df$variance <- apply(df, 1, function(x) max(x) - min(x))
df[order(df$variance, decreasing = TRUE), ]
Result:
v1 v2 v3 v4 variance
6 1.71506499 1.7869131 -1.6866933 0.68864025 3.4736064
3 1.55870831 0.4007715 -1.0260044 0.89512566 2.5847128
1 -0.56047565 1.2240818 -1.0678237 0.42646422 2.2919055
8 -1.26506123 -1.9666172 0.1533731 -0.06191171 2.1199903
9 -0.68685285 0.7013559 -1.1381369 -0.30596266 1.8394928
10 -0.44566197 -0.4727914 1.2538149 -0.38047100 1.7266063
4 0.07050839 0.1106827 -0.7288912 0.87813349 1.6070247
5 0.12928774 -0.5558411 -0.6250393 0.82158108 1.4466203
2 -0.23017749 0.3598138 -0.2179749 -0.29507148 0.6548853
7 0.46091621 0.4978505 0.8377870 0.55391765 0.3768708
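The result above lists every row, sorted. To keep only the top three, as the question asks, the sorted data frame could be passed to head(). This is a small sketch under the same assumptions as the solution above; the column name row_range and the object top3 are illustrative.

```r
set.seed(123)
df <- data.frame(
  v1 = rnorm(10),
  v2 = rnorm(10),
  v3 = rnorm(10),
  v4 = rnorm(10)
)

# Range (max minus min) of each row
df$row_range <- apply(df, 1, function(x) max(x) - min(x))

# Sort by range, largest first, and keep the first three rows
top3 <- head(df[order(df$row_range, decreasing = TRUE), ], 3)
top3
```

With this seed, the selected rows should be the first three rows of the sorted result shown above (rows 6, 3 and 1).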
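Alternatively, you can measure the spread with the standard deviation, sd (it ranks rows the same way as the variance, since it is its square root):
# Use only the original columns so an existing variance column is not included
df$variance <- apply(df[, c("v1", "v2", "v3", "v4")], 1, sd)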
Or simply use var:
df$variance <- apply(df[, c("v1", "v2", "v3", "v4")], 1, var)
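Since the standard deviation is the square root of the variance, ranking rows by either measure should select the same rows. The sketch below checks this on the example data; value_cols, row_sd, row_var and same_ranking are illustrative names, and the measures are computed on the four original columns only.

```r
set.seed(123)
df <- data.frame(
  v1 = rnorm(10),
  v2 = rnorm(10),
  v3 = rnorm(10),
  v4 = rnorm(10)
)

# Restrict the calculation to the data columns
value_cols <- c("v1", "v2", "v3", "v4")

row_sd  <- apply(df[value_cols], 1, sd)
row_var <- apply(df[value_cols], 1, var)

# Compare the two row orderings, largest spread first
same_ranking <- identical(
  order(row_sd, decreasing = TRUE),
  order(row_var, decreasing = TRUE)
)
same_ranking

# Top three rows by variance
df[order(row_var, decreasing = TRUE)[1:3], ]
```

The range-based ranking used earlier is a different measure and may order the rows differently.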