Scikit-learn KDTree query_radius: return both count and ind?

Question (votes: 1, answers: 2)

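I am trying to return both count (the number of neighbors) and ind (the indices of those neighbors), but I cannot unless I call query_radius twice. Although that is computationally expensive, it is actually faster for me in Python than iterating over ind and counting the size of each row! This seems terribly inefficient, so I am wondering whether there is a way to get both back in a single call.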

I tried to access count and ind as attributes of tree after calling query_radius, but they do not exist. There is no efficient way to do this in numpy, is there?

>>> array = np.array([[1,2,3], [2,3,4], [6,2,3]])
>>> tree = KDTree(array)
>>> neighbors = tree.query_radius(array, 1)
>>> tree.ind
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'sklearn.neighbors.kd_tree.KDTree' object has no attribute 'ind'
>>> tree.count
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'sklearn.neighbors.kd_tree.KDTree' object has no attribute 'count'

python numpy machine-learning scikit-learn kdtree
2 answers
0
votes

Not sure why you think you need to do this twice:

a = np.random.rand(100,3)*10
tree = KDTree(a)
neighbors = tree.query_radius(a, 1)

%timeit counts = tree.query_radius(a, 1, count_only = 1)
1000 loops, best of 3: 231 µs per loop

%timeit counts = np.array([arr.size for arr in neighbors])
The slowest run took 5.66 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 22.5 µs per loop

Finding the size of the array objects in neighbors is much faster than redoing query_radius.
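The point made in this answer can be checked with a small self-contained sketch: query the tree once for the indices, derive the counts from the sizes of the returned index arrays, and compare them with what count_only=True reports. The names `rng` and `points` and the fixed seed are assumptions added for reproducibility; they are not part of the original answer.

```python
import numpy as np
from sklearn.neighbors import KDTree

# Hypothetical test data: 100 random 3-D points in [0, 10)^3.
rng = np.random.default_rng(0)
points = rng.random((100, 3)) * 10

tree = KDTree(points)

# Single radius query: an object array holding one index array per query point.
neighbors = tree.query_radius(points, r=1)

# Counts derived from the result of that single query.
counts = np.array([idx.size for idx in neighbors])

# Second query, used here only to confirm the derived counts.
counts_check = tree.query_radius(points, r=1, count_only=True)

print(np.array_equal(counts, counts_check))
```

Each query point is part of the tree and lies at distance 0 from itself, so every count is at least 1.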


0
votes

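Consider this dataset:

array = np.random.random((10**5, 3))*10
tree = KDTree(array)

You identified 3 options in your question:

1) Call tree.query_radius twice to get the neighbors and their counts.

neighbors = tree.query_radius(array, 1)
counts = tree.query_radius(array, 1, count_only=1)

This takes 8.347 s.

2) Get only the neighbors, then obtain the counts by iterating over them:

neighbors = tree.query_radius(array, 1)
counts = []
for i in range(len(neighbors)):
    counts.append(len(neighbors[i]))

This is much faster than the first method and takes 4.697 s.

3) Finally, the loop that computes the counts can itself be made faster.

This is the fastest of the three, at 4.449 s.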
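Option 3 is described above without its code, so the following is only a sketch of two ways the counting loop might be tightened, not necessarily what the answer's author used. It assumes that wrapping `len` with `np.frompyfunc`, or feeding a generator to `np.fromiter`, is an acceptable replacement for the explicit Python loop. The smaller dataset, the seed, and the names `len_ufunc`, `counts_ufunc`, `counts_iter` and `counts_loop` are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KDTree

# Smaller hypothetical dataset than in the answer, to keep the sketch quick.
rng = np.random.default_rng(0)
array = rng.random((2000, 3)) * 10
tree = KDTree(array)

# One radius query; the result is an object array of index arrays.
neighbors = tree.query_radius(array, 1)

# Variant A (assumption): expose len as a NumPy ufunc and apply it elementwise.
len_ufunc = np.frompyfunc(len, 1, 1)
counts_ufunc = len_ufunc(neighbors).astype(np.intp)

# Variant B (assumption): build the count array directly from a generator.
counts_iter = np.fromiter(
    (idx.size for idx in neighbors), dtype=np.intp, count=len(neighbors)
)

# Reference: the explicit loop from option 2.
counts_loop = np.array([len(neighbors[i]) for i in range(len(neighbors))])

print(np.array_equal(counts_ufunc, counts_loop))
print(np.array_equal(counts_iter, counts_loop))
```

Whether either variant beats the plain loop by a meaningful margin depends on the data size and NumPy version. The timings quoted above suggest the radius query itself dominates the total cost.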
