The function get_cum_dist is defined as follows:
from numba import njit
import numpy as np

@njit(fastmath=True)
def get_cum_dist(perm: np.ndarray, c: np.ndarray, n: int) -> np.ndarray:
    # cum_dist[i + 1] is the running sum of the costs c[perm[i - 1], perm[i]].
    cum_dist = np.empty(n)
    cum_dist[0] = 0.
    cum_dist[1] = 0.
    for i in range(1, n - 1):
        cum_dist[i + 1] = cum_dist[i] + c[perm[i - 1], perm[i]]
    return cum_dist
For the input,
n = 1000
perm = np.random.permutation(n)
c = np.random.random((n+1,n+1))
cum_dist = get_cum_dist(perm, c, n)
This function is called many times in my algorithm, so any suggestions for potential speedups would be highly appreciated!
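For cross-checking any faster variant against the loop above, here is a minimal sketch of the same recurrence written with NumPy fancy indexing and `np.cumsum`. The name `get_cum_dist_numpy` is hypothetical, the sketch assumes `n >= 2`, and it has not been benchmarked here, so it may or may not beat the Numba version.

```python
import numpy as np

def get_cum_dist_numpy(perm: np.ndarray, c: np.ndarray, n: int) -> np.ndarray:
    """Vectorized equivalent of the loop version (hypothetical helper, assumes n >= 2)."""
    cum_dist = np.zeros(n)
    # Gather the costs c[perm[i - 1], perm[i]] for i = 1 .. n - 2 in one step.
    edges = c[perm[:n - 2], perm[1:n - 1]]
    # cum_dist[0] and cum_dist[1] stay 0; the running sum starts at index 2.
    cum_dist[2:] = np.cumsum(edges)
    return cum_dist
```

Calling `get_cum_dist_numpy(perm, c, n)` with the inputs defined above should return the same array as `get_cum_dist(perm, c, n)` up to floating-point rounding.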
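Here is an attempt in C++, without the random permutation input (I'm a C++ newbie). When I compile this code with the -O3 flag, I find that it is 10x faster than the Numba version.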
#include <iostream>
using namespace std;
#include <bits/stdc++.h>
#include <chrono>
using namespace std::chrono;
int main() {
    int n, sum = 0;
    int rows = 1001;
    int cols = 1001;
    int randArr[rows][cols];
    for (int i=0;i<rows;i++)
        for (int j=0; j<cols; j++)
            randArr[i][j] = 1 + (rand() % 500);
    n = 1000;
    int arr[1002]={0};  // the loop below writes up to arr[n + 1]
    auto start = high_resolution_clock::now();
    for (int i = 1; i <= n; ++i) {
        arr[i+1]=arr[i] + randArr[i-1][i];
    }
    auto stop = high_resolution_clock::now();
    auto duration = duration_cast<nanoseconds>(stop - start);
    cout << duration.count() << endl;
    return 0;
}
@463035818_is_not_an_ai Thank you very much! I made some changes to the C++ code, and it should now produce meaningful timings.
#include <iostream>
using namespace std;
#include <bits/stdc++.h>
#include <chrono>
using namespace std::chrono;
#include <algorithm>
#include <vector>
#include <cstdlib>
int main() {
    int n, sum = 0;
    int rows = 1001;
    int cols = 1001;
    int randArr[rows][cols];
    for (int i=0;i<rows;i++)
        for (int j=0; j<cols; j++)
            randArr[i][j] = 1 + (rand() % 500);
    vector<int> myvector;
    n = 1000;
    // Fill with 0 .. n-1 so that myvector[i] is valid for every i < n.
    for (int i=0;i<n;++i) myvector.push_back(i);
    std::random_shuffle(myvector.begin(), myvector.end());
    int arr[1001]={0};
    auto start = high_resolution_clock::now();
    for (int i = 1; i < n; ++i) {
        arr[i+1]=arr[i] + randArr[myvector[i-1]][myvector[i]];
    }
    auto stop = high_resolution_clock::now();
    auto duration = duration_cast<nanoseconds>(stop - start);
    cout << duration.count() << endl;
    cout << arr[1000] << endl;
    return 0;
}
I print a value from arr so that the result is actually used; hopefully this stops the compiler from optimizing away the timed section.
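How the Numba version was timed is not shown above. Since the first call to an `@njit` function includes JIT compilation, a like-for-like comparison on the Python side would exclude that call. The following is a minimal sketch of one way to do that; `time_call_ns` is a hypothetical helper, and taking the best of 100 repeats is an arbitrary choice.

```python
import time

def time_call_ns(fn, *args, repeats: int = 100) -> int:
    """Best-of-`repeats` wall time of fn(*args) in nanoseconds (hypothetical helper)."""
    # Warm-up call: for an @njit function this is where compilation happens,
    # so it is deliberately left out of the measurement.
    fn(*args)
    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn(*args)
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return best
```

With the definitions from the question, `time_call_ns(get_cum_dist, perm, c, n)` would give a number in the same unit (nanoseconds) as the one printed by the C++ program.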