概要描述

在线性回归中,可以使用最小二乘法或梯度下降算法来求解线性模型(即确定线性函数中的各系数和截距)。
本文用python实现梯度下降算法,并使用numpy做矩阵计算,最后与sklearn官方代码结果验证结果正确性。

详细说明

python实现梯度下降算法

# !/usr/bin/python
# -*- coding: utf-8 -*-
"""
PROJECT_NAME = Datawhale
Author : sciengineer
Email : 821072960@qq.com
Time = 2021/7/15 13:36
"""
import numpy as np
from sklearn import datasets

# Load the diabetes dataset
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Use a part of features (for example:feature 2~5)
diabetes_X = diabetes_X[:, 2:6]

# The training features
diabetes_X_train = diabetes_X[:-20]
# The training label
diabetes_y_train = diabetes_y[:-20]

# linear regression model : y_hat = theta0 + theta1 * x1 + theta2 * x2 + ... + thetan * xn

# features:x1, x2, ... , xn
x = diabetes_X_train
# label : y
y = diabetes_y_train

# learning rate: lr
lr = 0.001


# gradient descent algorithm
def grad_desc(x_ndarr, y_ndarr):


    """
    :param x_ndarr: ndarray of features (for example, 422 samples with 3 fetures, then the shape of x_ndarr
    is (422,3)
    :param y_ndarr: ndarray of label (for example, 422 samples with 1 label, then the shape of y_ndarr
    is (422,)
    :return: a list contains interception and coeffients of the linear regression model
    """
    x0 = np.ones(x_ndarr.shape[0])
    # in order to use matrix multiplication, add this column x0
    x_ndarr = np.insert(x_ndarr, 0, x0, axis=1)
    theta_arr = np.zeros(x_ndarr.shape[1])
    # to find a reasonable number to initialize the j_loss_last, I
    # calculate the j_loss when i == 0 in the for loop (12435769.0), and
    # set j_loss_last the  same order of magnitude with j_loss.
    j_loss_last = 1e7

    for i in range(10 ** 10):
        # y_hat = theta0 * x0 + theta1 * x1 +theta2 * x2+ ... +thetan * xn
        y_hat = np.dot(x_ndarr, theta_arr)
        j_loss = np.dot((y_hat - y_ndarr).T, (y_hat - y_ndarr))
        delta_j_loss = j_loss_last - j_loss
        rate = abs(delta_j_loss / j_loss)
        # partial derivative of function j_loss with respect to variable theta_arr
        pd_j2theta_arr = np.dot(y_hat - y_ndarr, x_ndarr)
        # theta_arr updates each interation
        theta_arr = theta_arr - lr * pd_j2theta_arr
        j_loss_last = j_loss
        # I choose the rate as the condition of convergence
        if rate < 1e-12:
            break
    return theta_arr


theta_arr = grad_desc(x, y)
# The coeffients: theta1, theta2,..., thetan
print('Coeffients: \n', theta_arr[1:])
# The interception: theta0
print('Interception: \n', theta_arr[0])

运行结果:

Coeffients: 
 [ 784.13395634  387.61068743  248.48665528 -217.23963173]
Interception: 
 152.88344455986112

sklearn官方demo

为了突出矩阵计算的优势,将官方demo中的一个feature,改成了3个feature。

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Code source: Jaques Grobler
# License: BSD 3 clause
# https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html#sphx-glr-auto-examples-linear-model-plot-ols-py

from sklearn import datasets, linear_model

# Load the diabetes dataset
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Use a part of features (for example:feature 2~5)
diabetes_X = diabetes_X[:, 2:6]

# The training features
diabetes_X_train = diabetes_X[:-20]
# The training label
diabetes_y_train = diabetes_y[:-20]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# The coefficients
print('Coefficients: \n', regr.coef_)
# The interception
print('Interception: \n', regr.intercept_)

运行结果:

Coefficients: 
 [ 784.14396199  387.59183622  248.69068916 -217.44205645]
Interception: 
 152.88344501815672

可以看到,使用python实现的梯度下降算法与sklearn库的运行结果一致。当然,这个实现肯定有许多需要优化的地方,欢迎讨论。

Logo

CSDN联合极客时间,共同打造面向开发者的精品内容学习社区,助力成长!

更多推荐