I have a matrix like the one below, but with n rows:
set.seed(123)
mt <- replicate(5, sample(1:3, 4, replace = TRUE))
mt
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    3    3    3    1    3
#> [2,]    3    2    1    2    3
#> [3,]    3    2    2    3    1
#> [4,]    2    2    2    1    1
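To sort the columns row by row (row 1 as the primary key, row 2 as the first tie-breaker, and so on), I use this code with the order function, similar to the approach here: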
od2 <- order(mt[1, ], mt[2, ], mt[3, ], mt[4, ])
mt[, od2]
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    3    3    3    3
#> [2,]    2    1    2    3    3
#> [3,]    3    2    2    1    3
#> [4,]    1    2    2    1    2
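Is there a way to adapt this code to n rows? I could not get the second version of the answer, which was given as a comment, to work. I am not familiar with the do.call function.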
do.call needs a list as the arguments to the function call, so you need to split the rows of the matrix into a list. The function asplit can do this: it splits an array or matrix by its margins.
mt[,do.call(order, asplit(mt, 1))]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    3    3    3    3
#[2,]    2    1    2    3    3
#[3,]    3    2    2    1    3
#[4,]    1    2    2    1    2
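To make the role of do.call more concrete, here is a small sketch (not from the original answer) comparing the hand-written call from the question with the call that do.call builds from a list of rows. The variable names rows, od_explicit and od_docall are only illustrative.

```r
set.seed(123)
mt <- replicate(5, sample(1:3, 4, replace = TRUE))

# Hand-written call: one argument per row (only works for a fixed number of rows)
od_explicit <- order(mt[1, ], mt[2, ], mt[3, ], mt[4, ])

# asplit(mt, 1) returns a list with one element per row;
# do.call passes each list element to order() as a separate argument
rows <- asplit(mt, 1)
od_docall <- do.call(order, rows)

# Both give the same column permutation
all(od_explicit == od_docall)
```

Because the list is built from the matrix itself, the second form keeps working when the matrix has any number of rows.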
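asplit currently (R 4.2.3) does this by selecting slices and filling a list in a for loop. Other options can be found in "converting a matrix to a list" or "Convert a matrix to a list of column-vectors".
Comparing some possible methods on a larger matrix: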
m <- matrix(sample(0:9, 1e6, TRUE), 1e3)
bench::mark(
asplit = m[,do.call(order, asplit(m, 1))],
data.frame = m[, do.call(order, data.frame(t(m)))],
splitRow = m[, do.call(order, split(m, row(m)))],
lapply = m[,do.call(order, lapply(1:nrow(m), function(i) m[i,]))],
splitNrow = m[, do.call(order, split(m, 1:nrow(m)))],
apply = m[, do.call(order, apply(m, 1, identity, simplify = FALSE))],
tapply = m[, do.call(order, tapply(m, row(m), identity))]
)
Result
  expression     min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr> <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 asplit      8.44ms  9.02ms     111.    19.25MB     73.8    30    20      271ms
2 data.frame  6.88ms  7.24ms     138.    15.47MB     80.3    36    21    261.4ms
3 splitRow   19.79ms 19.98ms      48.9   31.01MB     73.3    10    15    204.6ms
4 lapply      6.79ms  6.99ms     141.    11.57MB     33.8    54    13    384.2ms
5 splitNrow   8.36ms  8.55ms     117.     7.76MB     16.3    50     7    428.8ms
6 apply       7.53ms  7.74ms     128.    15.38MB     50.6    43    17    335.9ms
7 tapply     27.64ms 28.47ms      35.4   38.73MB    165.      3    14     84.7ms
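In this run the data.frame and lapply methods are the fastest, and splitNrow allocates the least additional memory. asplit is not far behind, though, and it is more general: it also works for arrays, and the margin can be changed easily.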
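As an illustration of changing the margin, the following sketch (an assumption about how one might reuse the idea, not code from the original answer) reorders the rows of the matrix instead, using column 1 as the primary key, column 2 as the first tie-breaker, and so on. The names od_rows and mt_rows_sorted are hypothetical.

```r
set.seed(123)
mt <- replicate(5, sample(1:3, 4, replace = TRUE))

# MARGIN = 2 splits the matrix into a list of columns,
# so order() now returns a permutation of the rows
od_rows <- do.call(order, asplit(mt, 2))

# Apply the permutation to the row index instead of the column index
mt_rows_sorted <- mt[od_rows, ]
mt_rows_sorted
```

The only changes compared with the column version are the margin passed to asplit and the position of the index inside the brackets.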
Adapting this answer to sort by rows, using a bunch of transposes (t), you can do:
t(t(mt)[do.call(order, as.data.frame(t(mt))),])
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    3    3    3    3
# [2,]    2    1    2    3    3
# [3,]    3    2    2    1    3
# [4,]    1    2    2    1    2
Or, more simply (based on @Ronak Shah's comment):
mt[, do.call(order, data.frame(t(mt)))]
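A quick sanity check, added here as a sketch rather than taken from the answers, that the transpose-based version, the simpler data.frame version and the asplit version agree on a matrix with a different number of rows. The seed, the matrix size and the variable names are arbitrary choices.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
m <- matrix(sample(0:9, 6 * 8, replace = TRUE), nrow = 6)

# Transpose, order the rows of the transposed matrix, transpose back
via_transpose <- t(t(m)[do.call(order, as.data.frame(t(m))), ])

# Index the columns directly with the same ordering
via_dataframe <- m[, do.call(order, data.frame(t(m)))]

# Same idea, with the rows split into a list by asplit
via_asplit <- m[, do.call(order, asplit(m, 1))]

identical(via_transpose, via_dataframe)
identical(via_dataframe, via_asplit)
```

Both comparisons should return TRUE, since all three variants pass the same row vectors to order and differ only in how the list is built.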
An option with split:
mt[, do.call(order, split(mt, row(mt)))]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    3    3    3
[2,]    2    1    2    3    3
[3,]    3    2    2    1    3
[4,]    1    2    2    1    2