How to cross-match 2 dataframes by (Cartesian) coordinates with Python?

Question

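I have 2 astronomical catalogs containing galaxies with their respective sky coordinates (ra, dec). I handle the catalogs as dataframes. The catalogs come from different observational surveys, and some galaxies appear in both of them. I want to cross-match these galaxies and put them into a new catalog. How can I do this with Python? I thought there should be some easy way with numpy, pandas, astropy or another package, but I couldn't find a solution. Thanks.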

Answer 1

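After a lot of research, I found that the easiest way is to use a package called astroML, for which a tutorial is available. The notebooks in which I used it are called cross_math_data_and_colour_cuts_.ipynb and PS_data_cleaning_and_processing.ipynb.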

import numpy as np
import pandas as pd
from astroML.crossmatch import crossmatch_angular
# if you are using Google Colab, first run the line "!pip install astroml"

df_1 = pd.read_csv('catalog_1.csv')
df_2 = pd.read_csv('catalog_2.csv')

# crossmatch catalogs
max_radius = 1. / 3600  # 1 arcsec, expressed in degrees
# note that for the below to work, the first 2 columns of the catalogs should be ra, dec (in degrees)
# also, df_1 should be the longer of the 2 catalogs, else there will be index errors
dist, ind = crossmatch_angular(df_1.values, df_2.values, max_radius)
match = ~np.isinf(dist)  # True for each row of df_1 that has a counterpart in df_2
# THE DESIRED SOLUTION IS THEN:
df_crossed = df_1[match]


# ALTERNATIVELY:
# ind has one entry per row of df_1: the index of the cross-matched galaxy in the second catalog;
# when there is no match, the ind value is the length of the second catalog
# so if you need to pull values from the second catalog for each row of the first, do:
df_1['new_var'] = [df_2['old_var'].iloc[i] if i < len(df_2) else -999 for i in ind]
# that way whenever you have a match 'new_var' will contain the correct value from 'old_var'
# and whenever you have a mismatch it will contain -999 as a flag
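To make it concrete what a match within 1 arcsecond means, here is a minimal brute-force sketch in plain Python. It is a swapped-in illustration, not astroML's implementation: the function names `angular_sep_deg` and `brute_force_crossmatch` and the toy coordinates are made up for this example, and the loop is quadratic in catalog size, so it only suits small catalogs. It returns `inf` and `len(cat2)` for unmatched rows, the same no-match convention the answer's code relies on.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def brute_force_crossmatch(cat1, cat2, max_radius_deg):
    """For each (ra, dec) in cat1, find the nearest cat2 entry within max_radius_deg.

    Returns two lists with one entry per row of cat1: the separation in degrees
    and the index into cat2. Rows without a match get (inf, len(cat2)).
    """
    dist, ind = [], []
    for ra1, dec1 in cat1:
        best_d, best_j = math.inf, len(cat2)
        for j, (ra2, dec2) in enumerate(cat2):
            d = angular_sep_deg(ra1, dec1, ra2, dec2)
            if d <= max_radius_deg and d < best_d:
                best_d, best_j = d, j
        dist.append(best_d)
        ind.append(best_j)
    return dist, ind

# Hypothetical toy catalogs: lists of (ra, dec) in degrees
cat1 = [(10.0, 20.0), (150.0, -5.0), (200.0, 45.0)]
cat2 = [(200.0, 45.0001), (10.00001, 20.00001)]

dist, ind = brute_force_crossmatch(cat1, cat2, 1.0 / 3600)  # 1 arcsec radius
matched_rows = [i for i, d in enumerate(dist) if not math.isinf(d)]
print(ind, matched_rows)
```

The second galaxy of `cat1` has no counterpart within the radius, so it is the only row left out of `matched_rows`.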

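If you are in the convenient position of having not only the coordinates but also the IDs of the matching sources in both dataframes, you can simply cross-match with the pandas .merge() function. Say df_1 has the columns 'ID', 'ra', 'dec', 'object_class' and df_2 has the columns 'ID', 'ra', 'dec', 'r_mag'; then we can cross-match with

df_crossed = pd.merge(df_1, df_2, on='ID')

By default this performs an inner cross-match (see the pandas merge documentation for more details). Because 'ra' and 'dec' appear in both dataframes, pandas suffixes them, so the resulting df_crossed has the columns 'ID', 'ra_x', 'dec_x', 'object_class', 'ra_y', 'dec_y', 'r_mag'. To end up with 'ID', 'ra', 'dec', 'object_class', 'r_mag', drop 'ra' and 'dec' from one of the dataframes before merging.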
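As a small sketch of the ID-based merge, the example below builds two toy dataframes (the IDs, coordinates, classes and magnitudes are invented for illustration) and shows both the plain merge, where the shared coordinate columns are suffixed, and a variant that keeps only the coordinates of df_1.

```python
import pandas as pd

# Hypothetical toy catalogs; only IDs 1 and 3 appear in both
df_1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'ra': [10.0, 150.0, 200.0],
    'dec': [20.0, -5.0, 45.0],
    'object_class': ['spiral', 'elliptical', 'spiral'],
})
df_2 = pd.DataFrame({
    'ID': [3, 1, 4],
    'ra': [200.0, 10.0, 77.0],
    'dec': [45.0, 20.0, 12.0],
    'r_mag': [18.2, 19.5, 21.0],
})

# Plain inner merge on 'ID': the overlapping 'ra' and 'dec' columns
# come back with the default suffixes _x (left) and _y (right)
df_suffixed = pd.merge(df_1, df_2, on='ID')
print(list(df_suffixed.columns))

# Drop the duplicate coordinates from df_2 first to keep a single 'ra', 'dec' pair
df_crossed = pd.merge(df_1, df_2.drop(columns=['ra', 'dec']), on='ID', how='inner')
print(df_crossed)
```

Passing how='left' instead of how='inner' would keep every row of df_1 and leave r_mag as NaN where df_2 has no entry with that ID.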
