Appending new lists to a dictionary key in a nested for loop

Question (votes: 0, answers: 1)

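I am doing KMeans clustering, and before that I also run a principal component analysis (PCA). I am trying to find a good set of clusters. To that end, I want to automatically see which number of clusters k gives the best silhouette score for each number of principal components p.

In my nested for loop, the outer loop applies PCA with p principal components, and the inner loop then fits KMeans for several values of k. In the end I want a dictionary that shows, for each p, every k together with its silhouette score. This is my current function: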

def optimal_clust(df_scaled, minPCA, maxPCA, minClust, maxClust):
    p = 'Number of PCs'
    k = 'Number of k'
    silhouette = 'Silhouette score'
    clustdict = {p :[k, silhouette]}
    for p in range(minPCA, maxPCA):
        pca = PCA(n_components = p)
        df_pca = pca.fit_transform(df_scaled)
        for k in range(minClust, maxClust+1):
            kmeans_labels = KMeans(n_clusters = k, random_state = 0).fit_predict(df_pca)
            silhouette = silhouette_score(df_pca, kmeans_labels)
            clustdict[p] = []
            clustdict[p].append([k, silhouette])

    return clustdict

print(optimal_clust(df_scaled, minPCA, maxPCA, minClust, maxClust))

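This only gives me a dictionary that holds, for each p, the single entry for the last k in the loop (k = maxClust, which is 5 here). The output looks like this: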

{'Number of PCs': ['Number of k', 'Silhouette score'], 1: [[5, 0.5242417773868049]], 2: [[5, 0.3274181367447551]], 3: [[5, 0.267904945833515]], 4: [[5, 0.22204357317276344]], 5: [[5, 0.1917496386757678]], 6: [[5, 0.16193197736304277]], 7: [[5, 0.14803935348320568]]}

How can I fix this so that I get the complete results? When I print the values instead of storing them in the dictionary, I do get everything. Thank you.

python machine-learning k-means pca
1 answer
0 votes

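I experimented with defaultdict and found a solution. In the original function, `clustdict[p] = []` ran inside the inner loop, so the list for each p was reset on every k and only the last pair survived. With a defaultdict(list) the list is created once, on first access, and every later append adds to it: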

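from collections import defaultdict

from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

def optimal_clust(df_scaled, minPCA, maxPCA, minClust, maxClust):
    # Missing keys start out as an empty list, so no manual reset is needed.
    clustdict = defaultdict(list)
    for p in range(minPCA, maxPCA):  # note: maxPCA itself is excluded
        pca = PCA(n_components=p)
        df_pca = pca.fit_transform(df_scaled)
        for k in range(minClust, maxClust + 1):  # maxClust is included
            kmeans_labels = KMeans(n_clusters=k, random_state=0).fit_predict(df_pca)
            silhouette = silhouette_score(df_pca, kmeans_labels)
            # Append one [k, score] pair per inner iteration.
            clustdict[p].append([k, silhouette])
    return clustdict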

print(optimal_clust(df_scaled, minPCA, maxPCA, minClust, maxClust))

This gave me:

defaultdict(<class 'list'>, {1: [[2, 0.5607920149433261], [3, 0.5399029168499861], [4, 0.524472082127441], [5, 0.5242417773868053]], 2: [[2, 0.38034477108342357], [3, 0.33609893188462264], [4, 0.3569575287929635], [5, 0.3274181367447551]], 3: [[2, 0.3140852723397097], [3, 0.2562260449736865], [4, 0.2617649481080593], [5, 0.26790494583351326]], 4: [[2, 0.27246318004094644], [3, 0.2132773501296108], [4, 0.21770628900170838], [5, 0.2220435731727633]], 5: [[2, 0.24158506265896904], [3, 0.17760388468121172], [4, 0.18279294131764684], [5, 0.1917496386757677]], 6: [[2, 0.21697223677862587], [3, 0.15338479734413427], [4, 0.17366692394358288], [5, 0.16193197736304285]], 7: [[2, 0.2011962666408952], [3, 0.1412926150132645], [4, 0.15261307055636883], [5, 0.14803935348320568]]})

This is exactly what I wanted.
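Since the stated goal was to see which k gives the best silhouette score for each p, the dictionary can be reduced one step further. The following is only a sketch: `best_k_per_p` is a hypothetical helper name, not part of scikit-learn, and the input is an excerpt of the output above (the first three values of p), typed in by hand so the example runs without any data.

```python
# Excerpt of the result printed above: p -> list of [k, silhouette] pairs.
# Only p = 1, 2, 3 are copied here to keep the example short.
clustdict = {
    1: [[2, 0.5607920149433261], [3, 0.5399029168499861],
        [4, 0.524472082127441], [5, 0.5242417773868053]],
    2: [[2, 0.38034477108342357], [3, 0.33609893188462264],
        [4, 0.3569575287929635], [5, 0.3274181367447551]],
    3: [[2, 0.3140852723397097], [3, 0.2562260449736865],
        [4, 0.2617649481080593], [5, 0.26790494583351326]],
}


def best_k_per_p(scores):
    """Hypothetical helper: map each p to the (k, silhouette) pair
    with the highest silhouette score."""
    return {
        p: tuple(max(pairs, key=lambda pair: pair[1]))
        for p, pairs in scores.items()
    }


best = best_k_per_p(clustdict)
for p, (k, score) in best.items():
    print(f"p={p}: best k={k} (silhouette={score:.4f})")
# p=1: best k=2 (silhouette=0.5608)
# p=2: best k=2 (silhouette=0.3803)
# p=3: best k=2 (silhouette=0.3141)
```

The helper works unchanged on the defaultdict returned by optimal_clust, because it only iterates over the items and never looks up missing keys.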
