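I have an adjacency matrix that I am using as my precomputed distance matrix. Instead of finding all the nearest points of all the nearest points, I only want to group points that are all close to each other.
For example: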
import numpy as np
from sklearn.cluster import DBSCAN
# distance matrix (cosine calculated)
adj = np.array([
[1,1,0,1,0,1,1],
[1,1,0,0,1,1,1],
[0,0,1,1,1,0,0],
[1,0,1,1,1,0,0],
[0,1,1,1,1,1,1],
[1,1,0,0,1,1,1],
[1,1,0,0,1,1,1]])
# run through DBSCAN
D_fit = DBSCAN(eps = .99,min_samples=2,metric='precomputed').fit(adj)
print(D_fit.labels_)
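Normally DBSCAN groups everything together:
[0 0 0 0 0 0 0]
However, if we only group points that are all mutually close, the result would be something like: [0 0 1 1 1 2 2]
or [0 0 1 1 2 2 2]
or [0 0 1 1 1 0 0]
...that is the kind of grouping I am looking for. Is there a tool, package, or method for grouping points that are all close to one another, rather than grouping them as a connected network?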
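One way to pin down what "mutually close" means here is a small check on a candidate labeling. This is only a sketch: `is_mutual_grouping` and `adj_rows` are hypothetical names, not part of scikit-learn, and it assumes an entry of 1 means the two points are close, which is how the desired labelings above read.

```python
from itertools import combinations

# Same matrix as in the example, written as plain lists (assumption: 1 = close).
adj_rows = [
    [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
]


def is_mutual_grouping(labels, adj):
    """Return True if every pair of points sharing a label is adjacent."""
    return all(
        adj[i][j] == 1
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    )


candidates = (
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2, 2],
    [0, 0, 1, 1, 1, 0, 0],
)
for labels in candidates:
    print(labels, is_mutual_grouping(labels, adj_rows))
```

Under that assumption the single-cluster labeling fails the check (points 0 and 2 are not adjacent), while the three desired labelings pass.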
This isn't the most efficient approach, but this is how I got it working:
import numpy as np
import networkx as nx
import random

# `adj` is the adjacency matrix defined in the question
# set default / starting values
d = len(adj[0])
df_edge = nx.from_numpy_array(adj)
# every clique in the graph, including single nodes and pairs
combos = list(nx.enumerate_all_cliques(df_edge))
cluster = [-1 for _ in range(d)]
select = random.choice(combos)
c = 0
# set cluster value for starting cluster
for i in select:
    cluster[i] = c
c += 1
# continue as long as there are unassigned points
while -1 in cluster:
    # list everything that is ungrouped
    ind = [i for i, x in enumerate(cluster) if x == -1]
    possible = []
    # collect all cliques from combos that don't contain any grouped points
    for j in combos:
        if all(ele in ind for ele in j):
            possible.append(j)
    # select at random from possible combinations and set as new select value
    select = random.choice(possible)
    # update cluster list
    for i in select:
        cluster[i] = c
    c += 1
# a bare `return` is only valid inside a function, so print the labels instead
print(cluster)
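If the random choice is not important, a deterministic first-fit pass is one possible alternative. It is a different technique from the clique enumeration above, offered only as a sketch: each point joins the first existing group whose members are all adjacent to it, otherwise it starts a new group. `first_fit_groups` and `adj_rows` are hypothetical names, and the code assumes an entry of 1 means "close".

```python
def first_fit_groups(adj):
    """Label each point with the first group whose members are all adjacent to it."""
    groups = []  # groups[k] holds the indices of the points labeled k
    labels = []
    for i in range(len(adj)):
        for k, members in enumerate(groups):
            if all(adj[i][m] == 1 for m in members):
                members.append(i)
                labels.append(k)
                break
        else:
            # no existing group fits, so point i starts a new one
            groups.append([i])
            labels.append(len(groups) - 1)
    return labels


# Same matrix as in the question, written as plain lists (assumption: 1 = close).
adj_rows = [
    [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
]

print(first_fit_groups(adj_rows))  # [0, 0, 1, 1, 1, 0, 0]
```

Every group produced this way is a clique, but the result depends on the order in which points are visited and is not guaranteed to use the fewest groups. It avoids enumerating all cliques, which can grow very quickly on dense graphs.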