Implementation and Optimization of Gradient Descent Using Python and NumPy

Nov 28, 2025 · Programming

Keywords: Gradient Descent | Python | NumPy | Linear Regression | Machine Learning

Abstract: This article provides an in-depth exploration of implementing gradient descent algorithms with Python and NumPy. By analyzing common errors in linear regression, it details the four key steps of gradient descent: hypothesis calculation, loss evaluation, gradient computation, and parameter update. The article includes complete code implementations covering data generation, feature scaling, and convergence monitoring, helping readers understand how to properly set learning rates and iteration counts for optimal model parameters.

Fundamentals of the Gradient Descent Algorithm

Gradient descent is one of the most fundamental optimization algorithms in machine learning, widely used for parameter training in linear regression, logistic regression, and other models. The algorithm minimizes the loss function through iterative adjustments of model parameters, gradually approaching the optimal solution.

Core Algorithm Steps Analysis

The implementation of gradient descent can be decomposed into four key operational steps:

  1. Hypothesis Calculation: Compute predictions using current parameters with formula h = X * theta
  2. Loss Evaluation: Calculate the difference between predictions and actual values as loss = h - y, optionally computing the squared-error cost sum(loss^2) / (2m)
  3. Gradient Computation: Compute gradient direction based on loss using formula gradient = X' * loss / m
  4. Parameter Update: Adjust parameters in the opposite direction of gradient with update rule theta = theta - alpha * gradient
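The four steps above can be sketched as a single vectorized NumPy iteration. The tiny dataset and learning rate here are illustrative values, not from the article:

```python
import numpy as np

# Minimal sketch of one gradient descent step, assuming a design matrix X
# of shape (m, n), targets y of shape (m,), and parameters theta of shape (n,).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
theta = np.zeros(2)
alpha = 0.1
m = len(y)

h = X @ theta                       # 1. hypothesis: predictions from current theta
loss = h - y                        # 2. loss: per-sample prediction error
cost = np.sum(loss ** 2) / (2 * m)  # optional squared-error cost
gradient = X.T @ loss / m           # 3. gradient: X' * loss / m
theta = theta - alpha * gradient    # 4. update: step against the gradient
```

Repeating this step in a loop is all the full implementation below adds, plus cost logging.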

Common Errors and Solutions

In the original code, the main issues arose from dimension confusion (mismatched NumPy array shapes between predictions and targets) and unnecessary implementation complexity; the vectorized rewrite below avoids both.
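One common form of this dimension confusion comes from NumPy broadcasting: subtracting a column vector of shape (m, 1) from a flat array of shape (m,) silently produces an (m, m) matrix instead of raising an error. A small sketch (variable names are illustrative):

```python
import numpy as np

m = 5
X = np.ones((m, 2))
theta = np.ones(2)
y_flat = np.arange(m, dtype=float)   # shape (m,)
y_col = y_flat.reshape(-1, 1)        # shape (m, 1)

h = X @ theta                        # predictions, shape (m,)
# Correct: both operands are 1-D, so the loss keeps shape (m,)
print((h - y_flat).shape)            # (5,)
# Subtle bug: (m,) minus (m, 1) broadcasts to an (m, m) matrix
print((h - y_col).shape)             # (5, 5)
```

Keeping y as a flat 1-D array throughout, as the implementation below does, sidesteps this class of bug.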

Optimized Implementation Code

Below is the optimized gradient descent implementation using vectorized operations for improved efficiency and readability:

import numpy as np
import random

def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        # Step 1: hypothesis -- predictions from the current parameters
        hypothesis = np.dot(x, theta)
        # Step 2: loss -- per-sample prediction error
        loss = hypothesis - y
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # Step 3: gradient -- X' * loss / m
        gradient = np.dot(xTrans, loss) / m
        # Step 4: update -- step against the gradient
        theta = theta - alpha * gradient
    return theta

def genData(numPoints, bias, variance):
    """Generate a toy dataset: an intercept column plus one linear feature."""
    x = np.zeros(shape=(numPoints, 2))
    y = np.zeros(shape=numPoints)
    for i in range(0, numPoints):
        x[i][0] = 1   # intercept term
        x[i][1] = i   # feature value
        y[i] = (i + bias) + random.uniform(0, 1) * variance
    return x, y

x, y = genData(100, 25, 10)
m, n = np.shape(x)
numIterations = 100000
alpha = 0.0005
theta = np.ones(n)
theta = gradientDescent(x, y, theta, alpha, m, numIterations)
print(theta)

Algorithm Parameter Tuning

The performance of the gradient descent algorithm largely depends on two settings: the learning rate alpha and the iteration count. A learning rate that is too large makes the cost oscillate or diverge, while one that is too small slows convergence unnecessarily; the iteration count must be large enough for the cost to plateau.
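The effect of the learning rate can be demonstrated on data shaped like the genData output. The helper run_gd and the specific alpha values below are illustrative, not from the article:

```python
import numpy as np

def run_gd(X, y, alpha, num_iters):
    # Hypothetical helper: run gradient descent from theta = 0 and
    # return the final squared-error cost.
    m = len(y)
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        loss = X @ theta - y
        theta = theta - alpha * (X.T @ loss) / m
    return np.sum((X @ theta - y) ** 2) / (2 * m)

# Data with the same shape as genData(100, 25, 10)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), np.arange(100.0)])
y = np.arange(100.0) + 25 + rng.uniform(0, 1, 100) * 10

# A larger (but still stable) alpha reaches a lower cost in the same budget
for alpha in (1e-5, 1e-4):
    print(f"alpha={alpha:g} -> cost after 2000 iters: {run_gd(X, y, alpha, 2000):.2f}")

# Too large a step diverges: the cost explodes instead of shrinking
print(f"alpha=1e-3 -> cost after 20 iters: {run_gd(X, y, 1e-3, 20):.3g}")
```

With the unscaled feature ranging from 0 to 99, even alpha = 1e-3 is already unstable, which is why the article's implementation uses 0.0005.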

Convergence Monitoring and Evaluation

Calculating and outputting the cost value in each iteration is crucial: a steadily decreasing cost confirms that the learning rate is appropriate, while a rising or oscillating cost signals that alpha should be reduced. Monitoring the cost also makes it possible to stop early once further improvement becomes negligible.
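A sketch of cost-based early stopping, assuming a tolerance on the per-iteration cost improvement (the function name, tolerance value, and toy data are illustrative):

```python
import numpy as np

def gradient_descent_with_stop(X, y, alpha, max_iters, tol=1e-9):
    # Hypothetical variant: stop once the cost improvement falls below tol.
    m = len(y)
    theta = np.zeros(X.shape[1])
    prev_cost = np.inf
    for i in range(max_iters):
        loss = X @ theta - y
        cost = np.sum(loss ** 2) / (2 * m)
        if prev_cost - cost < tol:
            print(f"Converged after {i} iterations, cost {cost:.6f}")
            break
        prev_cost = cost
        theta = theta - alpha * (X.T @ loss) / m
    return theta

# Noise-free line y = 2 + 3x, so theta should approach [2, 3]
x_feat = np.linspace(0, 1, 50)
X = np.column_stack([np.ones(50), x_feat])
y = 2 + 3 * x_feat
theta = gradient_descent_with_stop(X, y, alpha=0.5, max_iters=100000)
print(theta)
```

On well-behaved data this stops after a few hundred iterations rather than burning through the full budget.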

Practical Application Recommendations

In practical applications, it is recommended to scale features to comparable ranges before training, start with a conservative learning rate and adjust it while watching the cost curve, and stop iterating once the change in cost falls below a small tolerance instead of always running a fixed, large iteration count.
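The feature scaling mentioned in the abstract can be done by standardizing the non-intercept column; the scaling choice below (zero mean, unit standard deviation) is a common practice, not code from the article:

```python
import numpy as np

# Standardize the feature column of a genData-style design matrix.
X = np.column_stack([np.ones(100), np.arange(100.0)])
X_scaled = X.copy()
X_scaled[:, 1] = (X[:, 1] - X[:, 1].mean()) / X[:, 1].std()

print(X_scaled[:, 1].mean())  # ~0
print(X_scaled[:, 1].std())   # 1.0
```

With the feature on this scale, a much larger learning rate than 0.0005 becomes stable, so convergence needs far fewer iterations.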

By correctly implementing the gradient descent algorithm and properly tuning its parameters, accurate model parameters can be obtained, laying a solid foundation for more complex machine learning tasks.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.