How does sklearn.preprocessing.scale work?

Question (votes: 0, answers: 1)

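I have a two-dimensional data array describing each person's income and age. Before fitting the model, the data is scaled. After scaling, the original values are completely replaced by new values (close to 0).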

```python
# %matplotlib inline  (Jupyter notebook magic; uncomment when running in a notebook)

from numpy import random, array
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import scale


# Create fake income/age clusters for N people in k clusters
def createClusteredData(N, k):
    random.seed(10)
    pointsPerCluster = float(N) / k  # built-in float; numpy.float no longer exists in recent NumPy
    X = []
    for i in range(k):
        # Pick a random centre for this cluster
        incomeCentroid = random.uniform(20000.0, 200000.0)
        ageCentroid = random.uniform(20.0, 70.0)
        for j in range(int(pointsPerCluster)):
            # Each point is [income, age], drawn around the cluster centre
            X.append([random.normal(incomeCentroid, 10000.0), random.normal(ageCentroid, 2.0)])
    X = array(X)
    return X


data = createClusteredData(100, 5)

model = KMeans(n_clusters=5)

# Fit on the scaled data so that income and age are on comparable scales
model = model.fit(scale(data))
```


What does `scale` actually do in this model? I understand that it brings the features onto a comparable level, but what mathematical operations does it perform on the data? I have read the scikit-learn documentation but couldn't work out what it means. Please explain the operations in simple language.
python scikit-learn k-means
1 Answer
0 votes

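For each feature (column), it subtracts that column's mean from every observation and then divides by that column's standard deviation. After this, each column has mean 0 and standard deviation 1, which is why the scaled values end up close to 0.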
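To make the arithmetic concrete, here is a minimal sketch of the same computation done by hand with NumPy. The three-row income/age array is made up for illustration, and the variable names are not from the question. It assumes the default behaviour of `scale` (column-wise, population standard deviation, i.e. `ddof=0`).

```python
import numpy as np

# Hypothetical toy data: each row is a person, columns are [income, age].
X = np.array([
    [20000.0, 25.0],
    [50000.0, 40.0],
    [80000.0, 55.0],
])

# Mean and standard deviation of each column (axis=0 means "down the rows").
col_mean = X.mean(axis=0)   # [50000., 40.]
col_std = X.std(axis=0)     # population std (ddof=0)

# Standardize: subtract the column mean, then divide by the column std.
X_scaled = (X - col_mean) / col_std

print(X_scaled)
# Both columns now have mean 0 and standard deviation 1,
# even though income and age started on very different scales.
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

This matters for k-means because it uses Euclidean distance: without scaling, differences in income (tens of thousands) would swamp differences in age (tens).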

See the parameters for more details:

with_mean boolean, True by default
   If True, center the data before scaling.

with_std boolean, True by default
   If True, scale the data to unit variance (or equivalently, unit standard deviation).
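The effect of the two flags can be checked directly. The sketch below uses a made-up three-row income/age array (not data from the question) and compares `scale` against the equivalent NumPy expressions, on the assumption that scaling is applied per column, which is the default `axis=0`.

```python
import numpy as np
from sklearn.preprocessing import scale

# Hypothetical toy data: columns are [income, age].
X = np.array([
    [20000.0, 25.0],
    [50000.0, 40.0],
    [80000.0, 55.0],
])

# Defaults (with_mean=True, with_std=True): centre, then divide by the std.
both = scale(X)

# with_std=False: only subtract the column means.
centered_only = scale(X, with_std=False)

# with_mean=False: only divide by the column standard deviations.
scaled_only = scale(X, with_mean=False)

# Each variant matches the corresponding hand-written NumPy expression.
print(np.allclose(both, (X - X.mean(axis=0)) / X.std(axis=0)))
print(np.allclose(centered_only, X - X.mean(axis=0)))
print(np.allclose(scaled_only, X / X.std(axis=0)))
```

All three comparisons should print `True`.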