一、定义

通过对原始数据进行变换把数据映射到(默认为[0,1])之间

二、公式

image.png

三、API

sklearn. preprocessing .MinMaxScaler (feature_range=(0, 1)…)

​ o MinMaxScalar .fit_ transform(X)

X: numpy array格式的数据[n_ samples, n_ features]

​ 返回值:转换后的形状相同的array

四、代码实例

	from sklearn.preprocessing import MinMaxScaler
	import pandas as pd 
	
	def minmax_demo():
	    #归一化
	    #1.获取数据
	    data = pd.read_csv("dating.txt")
	    data = data.iloc[:,:3]
	    print("data:\n",data)
	    #2.实例化一个转换器类
	    transfer = MinMaxScaler(feature_range=[0,1])
	    #3.调用fit_transform
	    data_new = transfer.fit_transform(data)
	    print("data_new:\n",data_new)
	    return None

五、运行结果

data:
     milage     Liters  Consumtime
0    40920   8.326976    0.953952
1    14488   7.153469    1.673904
2    26052   1.441871    0.895124
3    75136  13.147394    0.428964
4    38344   1.669788    0.134296
5    72993  18.141748    1.932955
6    35948   6.838792    1.213192
7    42666  13.276369    0.543888
8    67497   8.631577    0.749278
9    35483  12.273169    1.508953
10   50242   3.723498    0.831917
11   63275   8.385879    1.669485
12    5569   4.875435    0.728658
13   51052   4.688098    0.625224
14   77372  15.299570    0.331351
15   43673   1.889461    0.191283
16   61364   7.516754    1.269164
17   69673  14.239195    0.261333
18   15669   0.000000    1.259185
data_new:
 [[0.49233319 0.45899524 0.45570394]
 [0.12421487 0.3943098  0.85597548]
 [0.28526663 0.07947806 0.42299736]
 [0.96885924 0.72470382 0.1638265 ]
 [0.45645725 0.09204119 0.        ]
 [0.93901369 1.         1.        ]
 [0.42308817 0.37696434 0.59983354]
 [0.51664972 0.73181311 0.22772076]
 [0.86247093 0.4757853  0.34191139]
 [0.41661212 0.67651524 0.76426771]
 [0.62216063 0.20524472 0.38785618]
 [0.80367116 0.46224206 0.85351865]
 [0.         0.26874119 0.33044729]
 [0.6334415  0.2584149  0.27294112]
 [1.         0.84333494 0.10955662]
 [0.53067421 0.10414989 0.03168305]
 [0.77705667 0.41433461 0.63095228]
 [0.89277607 0.7848855  0.07062873]
 [0.14066265 0.         0.62540426]

六、总结

最大值最小值是变化的,最大值与最小值非常容易受异常点影响

所以这种方法鲁棒性较差,只适合传统精确小数据场景

Logo

CSDN联合极客时间,共同打造面向开发者的精品内容学习社区,助力成长!

更多推荐