有效地计算数千个坐标对之间的距离

Question

我有一个我在python中打开的目录，它有各种对象的大约70,000行数据（ra，dec坐标和对象名称）。我还有另外一个大约15,000个感兴趣对象的列表，这些对象也出现在前面提到的目录中。对于这15,000个对象中的每一个，我想看看大型70,000列表中的任何其他对象是否在对象的10弧秒内具有ra，dec坐标。如果发现这是真的，我只想标记该对象并继续下一个。然而，这个过程需要很长时间，因为距离是在当前感兴趣的对象（15,000个）之间计算的70,000个不同时间。这需要几天时间！我怎样才能更有效地完成同样的任务？下面是我当前的代码，其中all_objects是所有15,000个感兴趣对象名称的列表，catalog是前面提到的70,000个对象的表数据。

from astropy.coordinates import SkyCoord
from astropy import units as u

for obj_name in all_objects:
    obj_ind = list(catalog['NAME']).index(obj_name)
    c1 = SkyCoord(ra=catalog['RA'][obj_ind]*u.deg, dec=catalog['DEC'][obj_ind]*u.deg, frame='fk5')
    for i in range(len(catalog['NAME'])):
        if i != obj_ind:
            # Compute distance between object and other source
            c2 = SkyCoord(ra=catalog['RA'][i]*u.deg, dec=catalog['DEC'][i]*u.deg, frame='fk5')
            sep = c1.separation(c2)
            contamination_flag = False
            if sep.arcsecond <= 10:
                contamination_flag = True
                print('CONTAMINATION FOUND')
                break

Answer 1

1 Create your own separation function

一旦你看一下实现并且问自己，这一步非常简单：“我怎样才能让它更快”

def separation(self, other):
    from . import Angle
    from .angle_utilities import angular_separation # I've put that in the code bellow so it is clearer

    if not self.is_equivalent_frame(other):
        try:
            other = other.transform_to(self, merge_attributes=False)
        except TypeError:
            raise TypeError('Can only get separation to another SkyCoord '
                            'or a coordinate frame with data')

    lon1 = self.spherical.lon
    lat1 = self.spherical.lat
    lon2 = other.spherical.lon
    lat2 = other.spherical.lat

    sdlon = np.sin(lon2 - lon1)
    cdlon = np.cos(lon2 - lon1)
    slat1 = np.sin(lat1)
    slat2 = np.sin(lat2)
    clat1 = np.cos(lat1)
    clat2 = np.cos(lat2)

    num1 = clat2 * sdlon
    num2 = clat1 * slat2 - slat1 * clat2 * cdlon
    denominator = slat1 * slat2 + clat1 * clat2 * cdlon

    return Angle(np.arctan2(np.hypot(num1, num2), denominator), unit=u.degree)

它会计算很多余弦和正弦值，然后创建一个Angle实例并转换为度数然后转换为弧秒。

您可能不想使用Angle，也不想在开头使用测试和转换，也不想在函数中进行导入，也不需要在需要性能时进行如此多的变量赋值。

分离功能对我来说感觉有点沉重，它应该只取数字并返回一个数字。

2 Use a quad tree (requires a complete rewrite of your code)

也就是说，让我们看一下算法的复杂性，它会检查每个元素与其他元素的对比，复杂度是O(n**2)（Big O表示法）。我们可以做得更好......

是您可以使用四叉树，最差情况下Quad树的复杂性是O（N）。这基本上意味着如果你不熟悉大O是15 000元素，查找将是15 000时代1元素而不是225 000 000时代（15 000平方）...相当改善权... Scipy有一个伟大的Quad树库（我总是使用自己的）。

有效地计算数千个坐标对之间的距离

问题描述投票：2回答：1

1个回答

1 Create your own separation function

2 Use a quad tree (requires a complete rewrite of your code)

最新问题

有效地计算数千个坐标对之间的距离

问题描述 投票：2回答：1

1个回答

1 Create your own separation function

2 Use a quad tree (requires a complete rewrite of your code)

最新问题

问题描述投票：2回答：1