Connecting dendrograms in matplotlib

Problem description

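I'm working on a project where I need to visualize hierarchical clustering dendrograms in Python with matplotlib. I have successfully generated dendrograms for two datasets (X1 and X2) and plotted them side by side. However, I'm having trouble connecting the two dendrograms to each other.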

I previously asked how to connect tips between dendrograms in side-by-side subplots, but that question used plotly.

I tried extracting the tip labels from both dendrograms and sorting them, but when I draw the connections they end up misplaced or missing.

Here is the code I'm using:

import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as hierarchy
import numpy as np

# Sample data for X1 , X2
np.random.seed(20240406)
X1 = np.random.rand(10, 12)
X2 = np.random.rand(10, 12)
names = ['Jack', 'Oxana', 'John', 'Chelsea', 'Mark', 'Alice', 'Charlie', 'Rob', 'Lisa', 'Lily']

# Plotting
fig, axes = plt.subplots(1, 3, figsize=(12, 6))

# Generate dendrogram structure for X1
Z1 = hierarchy.linkage(X1, method='complete')
dn1 = hierarchy.dendrogram(Z1, ax=axes[0], orientation='left', labels=names)
axes[0].set_title('Left Dendrogram for X1')
axes[0].set_xlabel('Distance')

# Generate dendrogram structure for X2
Z2 = hierarchy.linkage(X2, method='complete')
dn2 = hierarchy.dendrogram(Z2, ax=axes[2], orientation='right', labels=names)
axes[2].set_title('Right Dendrogram for X2')
axes[2].set_xlabel('Distance')

# Extract leaves and match them with names
leaves_left = dn1['leaves']
leaves_right = dn2['leaves']

# Use leaves and names to create connections
connections = []
for i in range(len(leaves_left)):
    left_name = names[leaves_left[i]]
    try:
        right_index = names.index(left_name)
    except ValueError:
        continue  # Skip to the next iteration if the name is not found
    connections.append((0, 1, i , right_index))


# Draw connections
for left, right, y_left, y_right in connections:
  axes[1].plot([left, right], [y_left, y_right], 'k-', alpha=0.5)

# Customize the third plot for connections
axes[1].set_title('Connections')
axes[1].set_xlabel('Connection')
axes[1].set_xlim(0, 1)  # Set limits for connection plot
axes[1].set_ylim(-0.5, len(names) - 0.5)  # Adjust y-axis limits for connections
axes[1].axis('off')

plt.tight_layout()
plt.show()

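As you can see in the image, the tips are not linked correctly. My goal is to connect the dendrograms with lines that join each label in X1 to the corresponding label in X2. How can I implement this connection between the dendrograms correctly?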

python matplotlib dendrogram
1 Answer

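I haven't worked out how the dendrogram leaves are ordered, but one option is to get the y tick labels from each plot and match them by their text. I also passed clip_on=False in the call to plot, so that the line ends look more rounded (rather than being clipped by the axes).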

import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as hierarchy
import numpy as np

# Sample data for X1 , X2
np.random.seed(20240406)
X1 = np.random.rand(10, 12)
X2 = np.random.rand(10, 12)
names = ['Jack', 'Oxana', 'John', 'Chelsea', 'Mark', 'Alice', 'Charlie', 'Rob', 'Lisa', 'Lily']

# Plotting
fig, axes = plt.subplots(1, 3, figsize=(12, 6))

# Generate dendrogram structure for X1
Z1 = hierarchy.linkage(X1, method='complete')
dn1 = hierarchy.dendrogram(Z1, ax=axes[0], orientation='left', labels=names)
axes[0].set_title('Left Dendrogram for X1')
axes[0].set_xlabel('Distance')

# Generate dendrogram structure for X2
Z2 = hierarchy.linkage(X2, method='complete')
dn2 = hierarchy.dendrogram(Z2, ax=axes[2], orientation='right', labels=names)
axes[2].set_title('Right Dendrogram for X2')
axes[2].set_xlabel('Distance')

# Get hold of the labels for each dendrogram    
left_labels = axes[0].get_yticklabels()
right_labels = axes[2].get_yticklabels()
right_names = [label.get_text() for label in right_labels]

# Use label positions and texts to create connections
connections = []
for i, left_label in enumerate(left_labels):
    left_name = left_label.get_text()
    try:
        right_index = right_names.index(left_name)
    except ValueError:
        continue  # Skip to the next iteration if the name is not found
    y_left = left_label.get_position()[1]
    y_right = right_labels[right_index].get_position()[1]
    connections.append((0, 1, y_left, y_right))

# Draw connections
for left, right, y_left, y_right in connections:
    axes[1].plot([left, right], [y_left, y_right], 'k-', alpha=0.5, clip_on=False)

# Customize the middle plot for connections
axes[1].set_title('Connections')
axes[1].set_xlabel('Connection')
axes[1].set_xlim(0, 1)  # Set limits for connection plot
axes[1].set_ylim(axes[0].get_ylim())  # Match the dendrogram y-range so line ends meet the labels
axes[1].axis('off')

plt.tight_layout()
plt.show()
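
For context, the code in the question seems to go wrong in two ways. It pairs row i on the left with names.index(left_name), which is the label's position in the input list rather than its row in the right dendrogram. It also uses unit spacing, while the dendrogram axes place leaves on a wider scale. The sketch below shows the same matching done without reading tick labels, using only the plotted leaf order (the 'ivl' entry of the dictionary returned by dendrogram). The helper name leaf_connections and the spacing and offset parameters are hypothetical, and the defaults assume leaves sit at 10 * i + 5 along the leaf axis, so check them against your tick positions. Labels are assumed to be unique.

```python
def leaf_connections(left_order, right_order, spacing=10.0, offset=5.0):
    """Pair each label's y position on the left with its y position on the right.

    left_order, right_order: labels in plotted leaf order,
        for example dn1['ivl'] and dn2['ivl'].
    spacing, offset: ASSUMED leaf placement (offset + spacing * i);
        verify against the y tick positions of your own figure.
    """
    # Map every label on the right to the y coordinate of its leaf.
    right_pos = {name: offset + spacing * i for i, name in enumerate(right_order)}

    pairs = []
    for i, name in enumerate(left_order):
        if name in right_pos:  # skip labels that only exist on one side
            pairs.append((name, offset + spacing * i, right_pos[name]))
    return pairs


# Small made-up example with three labels in different orders.
left = ['Mark', 'Jack', 'Oxana']
right = ['Oxana', 'Mark', 'Jack']
for name, y_left, y_right in leaf_connections(left, right):
    print(name, y_left, y_right)
```

With the variables from the code above, the result of leaf_connections(dn1['ivl'], dn2['ivl']) could be drawn with axes[1].plot([0, 1], [y_left, y_right], 'k-', alpha=0.5, clip_on=False) for each pair, provided the middle axes share the y-range of the dendrogram axes.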
