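Overview

In logistic regression, the model can be fitted with gradient descent, that is, by determining the coefficients and the intercept inside the sigmoid function.
This article implements gradient descent in Python, uses NumPy for the matrix computations to fit a logistic regression model, and then checks the result against the output of the official sklearn example code.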
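
For reference, the quantities involved can be written out as below. This is the standard cross-entropy formulation, stated under the conventions this article's implementation appears to use: a constant feature $x_{i0} = 1$ carries the intercept $\theta_0$, the loss is summed (not averaged) over the $m$ samples, and $\alpha$ denotes the step size. The symbols $m$ and $\alpha$ are notation introduced here for illustration.

```latex
\begin{align}
z_i &= \theta_0 x_{i0} + \theta_1 x_{i1} + \dots + \theta_n x_{in}, \qquad
\hat{y}_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}} \\
J(\theta) &= -\sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right] \\
\frac{\partial J}{\partial \theta_j} &= \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right) x_{ij}, \qquad
\theta \leftarrow \theta - \alpha \, \nabla_\theta J(\theta)
\end{align}
```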

Details

Gradient descent implemented in Python

#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
PROJECT_NAME = Datawhale
Author : sciengineer
Email : 821072960@qq.com
Time = 2021/7/22 18:09
"""


import numpy as np

# Generate a toy dataset: it's just a straight line with some Gaussian noise
xmin, xmax = -5, 5
n_samples = 100
np.random.seed(0)
X = np.random.normal(size=n_samples)
y = (X > 0).astype(np.int64)
X[X > 0] *= 4
X += .3 * np.random.normal(size=n_samples)
X = X[:, np.newaxis]


# learning rate: lr
lr = 1


# gradient descent algorithm
def grad_desc(x_ndarr, y_ndarr):
    """
    :param x_ndarr: ndarray of features (for example, 422 samples with 3 features give
    x_ndarr the shape (422, 3))
    :param y_ndarr: ndarray of labels (for example, 422 samples with 1 label each give
    y_ndarr the shape (422,))
    :return: an ndarray holding the intercept followed by the coefficients of the
    logistic regression model
    """
    x0 = np.ones(x_ndarr.shape[0])
    # in order to use matrix multiplication, add this column x0
    x_ndarr = np.insert(x_ndarr, 0, x0, axis=1)
    theta_arr = np.zeros(x_ndarr.shape[1])

    # To pick a reasonable initial value for j_loss_last, I computed
    # j_loss at i == 0 in the for loop below (about 69) and set
    # j_loss_last to the same order of magnitude.
    j_loss_last = 1e2
    delta = 1e-10

    for i in range(10 ** 10):

        # z = theta0 * x0 + theta1 * x1 + theta2 * x2 + ... + thetan * xn
        z = np.dot(x_ndarr, theta_arr)
        y_hat = 1 / (1 + np.exp(-z))
        # Clip y_hat into [delta, 1 - delta] to avoid the warnings "divide by zero encountered in log" and
        # "invalid value encountered in log", which would otherwise ruin the optimization.
        y_hat = np.clip(y_hat, delta, 1 - delta)
        # cross-entropy (log) loss, summed over all samples
        j_loss = -np.dot(y_ndarr, np.log(y_hat)) - np.dot(1 - y_ndarr, np.log(1 - y_hat))
        delta_j_loss = j_loss_last - j_loss
        rate = abs(delta_j_loss / j_loss)
        # partial derivative of function j_loss with respect to variable theta_arr
        pd_j2theta_arr = np.dot(y_hat - y_ndarr, x_ndarr)

        # theta_arr is updated on every iteration; the effective step size is lr * 0.01
        theta_arr = theta_arr - lr * 0.01 * pd_j2theta_arr
        j_loss_last = j_loss

        # Convergence condition: stop once the relative change in the loss is small enough
        if rate < 5e-10:
            break
    return theta_arr



theta_arr = grad_desc(X, y)
# The coefficients: theta1, theta2, ..., thetan
print('Coefficients: \n', theta_arr[1:])
# The intercept: theta0
print('Intercept: \n', theta_arr[0])

Output:

Coefficients: 
 [6.90478134]
Intercept: 
 -1.6481480918181262
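
The listing above clips y_hat so that np.log never receives 0. As an alternative, here is a minimal sketch that computes the same summed cross-entropy directly from the logits z, so no clipping is needed. The function names (sigmoid, stable_log_loss, log_loss_gradient) and the demo values are hypothetical, introduced only for illustration. This is not the method used in this article, and swapping it in may change the reported numbers slightly.

```python
import numpy as np


def sigmoid(z):
    """Sigmoid written via tanh, which does not overflow for large |z|."""
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def stable_log_loss(z, y):
    """Summed cross-entropy computed from the logits z, without forming log(y_hat).

    Uses the identity  -y*log(s) - (1-y)*log(1-s) = log(1 + exp(z)) - y*z,
    where s = sigmoid(z) and log(1 + exp(z)) is evaluated by np.logaddexp(0, z).
    """
    return np.sum(np.logaddexp(0.0, z) - y * z)


def log_loss_gradient(x_with_bias, y, theta):
    """Gradient of the summed loss with respect to theta: X^T (y_hat - y)."""
    z = x_with_bias @ theta
    return x_with_bias.T @ (sigmoid(z) - y)


# Hypothetical extreme logits, chosen only to show that nothing overflows.
z_demo = np.array([-1000.0, 0.0, 1000.0])
y_demo = np.array([1.0, 1.0, 0.0])
print(stable_log_loss(z_demo, y_demo))  # finite, with no clipping step
```

Here x_with_bias is assumed to already contain the leading column of ones, as x_ndarr does after the np.insert call in grad_desc.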

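The official sklearn demo

Only the first part of the official demo is used here, mainly to verify that the hand-written logistic regression implementation is correct.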

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Code source: Gael Varoquaux
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt

from sklearn import linear_model
from scipy.special import expit

# Generate a toy dataset: it's just a straight line with some Gaussian noise
xmin, xmax = -5, 5
n_samples = 100
np.random.seed(0)
X = np.random.normal(size=n_samples)
y = (X > 0).astype(float)
X[X > 0] *= 4
X += .3 * np.random.normal(size=n_samples)

X = X[:, np.newaxis]

# Fit the classifier
clf = linear_model.LogisticRegression(C=1e10)
clf.fit(X, y)

print('Coefficients: \n', clf.coef_)
print('Intercept: \n', clf.intercept_)

Output:

Coefficients: 
 [[6.90879439]]
Intercept: 
 [-1.64913083]

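As the outputs show, the logistic regression implemented in Python agrees with the sklearn library to about three significant digits (6.905 vs 6.909 for the coefficient, -1.648 vs -1.649 for the intercept). Of course, this implementation surely has plenty of room for optimization, and discussion is welcome.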
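
To make "consistent" a little more concrete, the short sketch below computes the relative difference between the two sets of printed results. The numbers are copied from the outputs above. The helper name relative_difference and the 1e-3 tolerance are arbitrary choices made for illustration. The remaining gap plausibly comes from the different stopping criterion and from sklearn's solver and its (very weak, C=1e10) regularization, though that is not verified here.

```python
# Values copied from the two program outputs in this article.
manual = {"coefficient": 6.90478134, "intercept": -1.6481480918181262}
sklearn_fit = {"coefficient": 6.90879439, "intercept": -1.64913083}


def relative_difference(a, b):
    """Absolute difference scaled by the magnitude of the reference value b."""
    return abs(a - b) / abs(b)


rel_diffs = {
    name: relative_difference(manual[name], sklearn_fit[name]) for name in manual
}
for name, diff in rel_diffs.items():
    print(f"{name}: relative difference = {diff:.1e}")

# The tolerance is an illustrative assumption, not a standard threshold.
tolerance = 1e-3
print(all(diff < tolerance for diff in rel_diffs.values()))
```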
