How does sklearn.preprocessing.scale work?

Question

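I have a two-dimensional data array describing each person's income and age. Before fitting the model, I scale the data. After scaling, the original values are completely replaced by new values (close to 0).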

```python
from numpy import random, array

# Create fake income/age clusters for N people in k clusters
def createClusteredData(N, k):
    random.seed(10)
    pointsPerCluster = float(N) / k
    X = []
    for i in range(k):
        incomeCentroid = random.uniform(20000.0, 200000.0)
        ageCentroid = random.uniform(20.0, 70.0)
        for j in range(int(pointsPerCluster)):
            X.append([random.normal(incomeCentroid, 10000.0), random.normal(ageCentroid, 2.0)])
    X = array(X)
    return X


# IPython/Jupyter magic; only valid inside a notebook
%matplotlib inline

from sklearn.cluster import KMeans

import matplotlib.pyplot as plt
from sklearn.preprocessing import scale

data = createClusteredData(100, 5)

model = KMeans(n_clusters=5)

# Scale the data before fitting so income and age are on comparable scales
model = model.fit(scale(data))
```


What does scale actually do in this model? I understand that it brings the data to the same level so that the features are comparable, but what mathematical operations does it perform on the data? I have looked at the scikit-learn documentation but couldn't work out what it means. Please explain the operations in simple language.

Answer 1

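It subtracts the mean of the data from each observation and then divides by the standard deviation of the data. By default this is done separately for each feature (column), so every column ends up with a mean of about 0 and a standard deviation of about 1.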
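To make the arithmetic concrete, here is a minimal sketch of the same standardization done by hand with NumPy. The three income/age rows are made up for illustration and are not taken from the question's data.

```python
import numpy as np

# Hypothetical income/age rows, made up for illustration only
X = np.array([
    [30000.0, 25.0],
    [60000.0, 40.0],
    [90000.0, 55.0],
])

# Per-column mean and (population) standard deviation
mean = X.mean(axis=0)
std = X.std(axis=0)

# Standardize: subtract the column mean, then divide by the column std
X_scaled = (X - mean) / std

print(X_scaled)
print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column
```

Income and age start out on very different scales, but after this step both columns hold values of similar magnitude, which is why the scaled numbers all look "close to 0".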

See the parameters for more details:

with_mean boolean, True by default
   If True, center the data before scaling.

with_std boolean, True by default
   If True, scale the data to unit variance (or equivalently, unit standard deviation).
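The effect of the two parameters can be checked with a short sketch like the one below. The sample array is hypothetical, and the manual expressions are just one way to reproduce the results, assuming the default per-column behaviour (axis=0).

```python
import numpy as np
from sklearn.preprocessing import scale

# Hypothetical income/age rows, made up for illustration only
X = np.array([
    [30000.0, 25.0],
    [60000.0, 40.0],
    [90000.0, 55.0],
])

full = scale(X)                            # defaults: center, then scale
centered_only = scale(X, with_std=False)   # only subtract the column means
scaled_only = scale(X, with_mean=False)    # only divide by the column stds

# Manual equivalents for comparison
manual_full = (X - X.mean(axis=0)) / X.std(axis=0)
manual_centered = X - X.mean(axis=0)
manual_scaled = X / X.std(axis=0)

print(np.allclose(full, manual_full))
print(np.allclose(centered_only, manual_centered))
print(np.allclose(scaled_only, manual_scaled))
```

With both flags left at True, the result matches the "subtract the mean, divide by the standard deviation" description above.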
