1. Definition

Standardization transforms the raw data so that each feature has a mean of 0 and a standard deviation of 1.

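2. Formula

$$x' = \frac{x - \text{mean}}{\sigma}$$

Here mean is the average of the feature column that x belongs to, and σ is the standard deviation of that column. The transformation is applied to each column independently.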
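To make the formula concrete, here is a minimal sketch that standardizes a single feature column by hand, using only the Python standard library. The helper name `standardize` and the three sample values are hypothetical, chosen for illustration. The sketch uses the population standard deviation (dividing by n), which is also what StandardScaler uses.

```python
import statistics


def standardize(column):
    """Apply x' = (x - mean) / std to one feature column."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population std: divides by n, not n - 1
    return [(x - mean) / std for x in column]


column = [90.0, 60.0, 75.0]  # hypothetical feature values
scaled = standardize(column)

print("scaled:", scaled)
print("mean of scaled:", statistics.fmean(scaled))
print("std of scaled:", statistics.pstdev(scaled))
```

After the transformation the column should have a mean of (numerically) 0 and a standard deviation of 1, which is exactly what the definition asks for.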

3. API

sklearn.preprocessing.StandardScaler()

o StandardScaler.fit_transform(X)

X: data in numpy array format, with shape [n_samples, n_features]

Return value: the transformed array, with the same shape as X
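As a rough illustration of the API, the sketch below fits a StandardScaler on a small array and then reuses the learned statistics on a new sample. The arrays `X_train` and `X_new` are made-up values, not data from this article. `mean_` and `scale_` are the per-column mean and standard deviation that the scaler stores after fitting.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 4 samples, 2 features
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_new = np.array([[2.5, 25.0]])

transfer = StandardScaler()
# fit_transform learns each column's mean and std, then applies the scaling
X_train_scaled = transfer.fit_transform(X_train)
# transform reuses the statistics learned from X_train
X_new_scaled = transfer.transform(X_new)

print("learned means:", transfer.mean_)
print("learned stds:", transfer.scale_)
print("output shape:", X_train_scaled.shape)
print("new sample scaled:\n", X_new_scaled)
```

Calling `transform` rather than `fit_transform` on new data keeps it on the same scale as the data the scaler was fitted on.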

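4. Code Example

    from sklearn.preprocessing import StandardScaler
    import pandas as pd


    def stand_demo():
        """Standardize the first three feature columns of dating.txt."""
        # 1. Load the data and keep only the first three columns
        data = pd.read_csv("dating.txt")
        data = data.iloc[:, :3]
        print("data:\n", data)
        # 2. Instantiate a transformer
        transfer = StandardScaler()
        # 3. Call fit_transform
        data_new = transfer.fit_transform(data)
        print("data_new:\n", data_new)
        return None


    if __name__ == "__main__":
        stand_demo()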

5. Output

data:
     milage     Liters  Consumtime
0    40920   8.326976    0.953952
1    14488   7.153469    1.673904
2    26052   1.441871    0.895124
3    75136  13.147394    0.428964
4    38344   1.669788    0.134296
5    72993  18.141748    1.932955
6    35948   6.838792    1.213192
7    42666  13.276369    0.543888
8    67497   8.631577    0.749278
9    35483  12.273169    1.508953
10   50242   3.723498    0.831917
11   63275   8.385879    1.669485
12    5569   4.875435    0.728658
13   51052   4.688098    0.625224
14   77372  15.299570    0.331351
15   43673   1.889461    0.191283
16   61364   7.516754    1.269164
17   69673  14.239195    0.261333
18   15669   0.000000    1.259185
data_new:
 [[ 0.12304713  0.24281169  0.08961701]
 [-1.09340804  0.00255574  1.51545904]
 [-0.5612089  -1.16679851 -0.02688998]
 [ 1.69773803  1.22971173 -0.95010503]
 [ 0.00449429 -1.12013631 -1.53368563]
 [ 1.59911275  2.25222225  2.02850131]
 [-0.10577457 -0.06186912  0.60303358]
 [ 0.20340165  1.2561172  -0.7225017 ]
 [ 1.34617549  0.30517366 -0.31573334]
 [-0.12717483  1.05072877  1.18877883]
 [ 0.5520648  -0.6996735  -0.15206943]
 [ 1.15187034  0.2548711   1.50670735]
 [-1.50387882 -0.46383365 -0.3565706 ]
 [ 0.58934267 -0.50218777 -0.56141834]
 [ 1.80064336  1.6703338  -1.14342447]
 [ 0.24974587 -1.07516193 -1.42082469]
 [ 1.06392218  0.07693228  0.71388434]
 [ 1.4463195   1.45323974 -1.28209289]
 [-1.03905598 -1.4619975   0.69412125]]

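6. Summary

When enough samples are available, standardization is fairly stable, because a small number of outliers has little effect on the mean and the standard deviation. This makes it well suited to modern, noisy big-data scenarios.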
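The stability claim can be checked with a small experiment. The sketch below appends one outlier to evenly spaced data and measures how far the mean, the standard deviation, and the min-max range move, for a small and a large sample. The helper `stats_shift`, the value range, and the outlier value 1000 are all assumptions made for illustration.

```python
import statistics


def stats_shift(n, outlier=1000.0):
    """Measure how one appended outlier changes the mean, std, and range of n points."""
    base = [100.0 * i / n for i in range(n)]  # n evenly spaced values in [0, 100)
    noisy = base + [outlier]
    mean_shift = abs(statistics.fmean(noisy) - statistics.fmean(base))
    std_ratio = statistics.pstdev(noisy) / statistics.pstdev(base)
    range_ratio = (max(noisy) - min(noisy)) / (max(base) - min(base))
    return mean_shift, std_ratio, range_ratio


for n in (10, 10_000):
    mean_shift, std_ratio, range_ratio = stats_shift(n)
    print(f"n={n}: mean shift={mean_shift:.3f}, "
          f"std ratio={std_ratio:.2f}, range ratio={range_ratio:.2f}")
```

Under these assumptions, the mean and standard deviation move much less for the large sample than for the small one. The min-max range is stretched by roughly the same factor in both cases, since it depends only on the extreme values.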
