Keywords: Gaussian Distribution | Python Plotting | Data Visualization
Abstract: This article provides a comprehensive guide to plotting 1-dimensional Gaussian distribution functions using Python, focusing on techniques to visualize curves with different mean (μ) and standard deviation (σ) parameters. Starting from the mathematical definition of the Gaussian distribution, it systematically constructs complete plotting code, covering core concepts such as custom function implementation, parameter iteration, and graph optimization. The article contrasts manual calculation methods with alternative approaches using the scipy statistics library. Through concrete examples (μ, σ) = (−1, 1), (0, 2), (2, 3), it demonstrates how to generate clear multi-curve comparison plots, offering beginners a step-by-step tutorial from theory to practice.
Mathematical Foundations of the Gaussian Distribution Function
The 1-dimensional Gaussian distribution (also known as the normal distribution) is one of the most fundamental continuous probability distributions in probability theory and statistics. Its probability density function (PDF) is mathematically expressed as:
f(x) = (1 / (σ√(2π))) * exp(-(x - μ)² / (2σ²))
Here, μ represents the mean (determining the center of the curve), and σ denotes the standard deviation (controlling the width and steepness of the curve). This function describes the symmetric distribution of random variables around the mean and is widely applied in data modeling across natural sciences, engineering, and social sciences.
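As a quick sanity check on the formula, the sketch below evaluates the height of the curve at its center, f(μ) = 1 / (σ√(2π)), for the three σ values used later in this article. The helper name peak_height is an illustrative assumption and not part of any library.

```python
import math


def peak_height(sigma):
    """Value of the Gaussian PDF at x = mu, i.e. 1 / (sigma * sqrt(2*pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# A larger sigma gives a lower, wider curve; the area under it stays 1.
for sigma in (1, 2, 3):
    print(f"sigma={sigma}: peak height = {peak_height(sigma):.4f}")
```

The peak heights come out at roughly 0.40, 0.20, and 0.13, which is why the curve with σ = 3 looks so much flatter than the one with σ = 1.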
Python Environment Setup and Core Library Imports
To visualize Gaussian distributions in Python, two key scientific libraries are essential: numpy for numerical operations and array handling, and matplotlib for graphical plotting. Begin by importing the necessary modules with the following code:
import numpy as np
from matplotlib import pyplot as plt
The pyplot interface (commonly aliased as plt) simplifies plotting commands, while numpy provides vectorized mathematical functions that operate on whole arrays at once, which keeps the computation both concise and fast.
Implementation of a Custom Gaussian Distribution Function
Based on the mathematical formula above, we can define a Python function to compute the Gaussian probability density for any given x values. The following code illustrates how to translate the mathematical expression into executable program logic:
def gaussian(x, mu, sigma):
    coefficient = 1.0 / (np.sqrt(2.0 * np.pi) * sigma)
    exponent = -np.power((x - mu) / sigma, 2.0) / 2.0
    return coefficient * np.exp(exponent)
This function accepts three parameters: x (an array of input values), mu (the mean), and sigma (the standard deviation, which must be positive). It first calculates the normalization coefficient that makes the area under the curve equal to 1, then computes the exponent from the squared deviation of x from the mean, scaled by sigma. Because np.power and np.exp operate element-wise on arrays, the whole curve is evaluated without explicit loops.
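The normalization claim can be checked numerically. The sketch below repeats the function so that it runs on its own, then approximates the area under the curve with a simple Riemann sum. The grid limits, the point count, and the choice of μ = 0, σ = 2 are illustrative assumptions.

```python
import numpy as np


def gaussian(x, mu, sigma):
    """Gaussian PDF, same formula as the function defined above."""
    coefficient = 1.0 / (np.sqrt(2.0 * np.pi) * sigma)
    exponent = -np.power((x - mu) / sigma, 2.0) / 2.0
    return coefficient * np.exp(exponent)


# Grid wide enough that the tails are negligible for mu=0, sigma=2.
x = np.linspace(-30.0, 30.0, 6001)
dx = x[1] - x[0]
y = gaussian(x, mu=0.0, sigma=2.0)

area = np.sum(y) * dx      # Riemann-sum approximation of the integral
peak_x = x[np.argmax(y)]   # location of the maximum, expected at mu

print(f"approximate area: {area:.6f}")
print(f"peak located at x = {peak_x:.2f}")
```

The area should come out very close to 1 and the peak should sit at the mean, which confirms that the code matches the formula.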
Data Preparation and Generation for Parametric Plotting
To visualize Gaussian curves under different parameters, it is necessary to generate a sequence of x values covering a reasonable range. The np.linspace function creates evenly spaced numerical points:
x_values = np.linspace(-10, 10, 1000)
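Here, 1000 points over the interval from -10 to 10 give smooth curves and cover nearly all of the probability mass for the three parameter sets; only for (μ, σ) = (2, 3), where μ + 3σ = 11, is a small part of the upper tail cut off at x = 10. For the parameter sets (μ, σ) = (−1, 1), (0, 2), (2, 3), sequential computation and plotting can be achieved through a loop structure: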
parameters = [(-1, 1), (0, 2), (2, 3)]
for mu, sigma in parameters:
    y_values = gaussian(x_values, mu, sigma)
    plt.plot(x_values, y_values, label=f"μ={mu}, σ={sigma}")
In each iteration, the function computes the y-value array for the corresponding parameters and plots the curve using plt.plot. The label argument sets the text shown for each curve in the legend; the legend itself is drawn only when plt.legend() is called.
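Whether the x-range from -10 to 10 is wide enough for each parameter pair can be checked with the error function from the standard library. This is a minimal sketch; the helper name mass_in_interval is a hypothetical name introduced here for illustration.

```python
import math


def mass_in_interval(mu, sigma, lo=-10.0, hi=10.0):
    """Probability that N(mu, sigma^2) falls in [lo, hi], via the error function."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)


for mu, sigma in [(-1, 1), (0, 2), (2, 3)]:
    mass = mass_in_interval(mu, sigma)
    print(f"mu={mu}, sigma={sigma}: mass in [-10, 10] = {mass:.4f}")
```

For (2, 3) roughly 0.4% of the mass lies outside the plotted range, almost all of it in the upper tail; widening the range to, say, -10 to 14 would show that tail as well.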
Graph Optimization and Output
After basic plotting, a series of configuration commands can enhance the readability and aesthetics of the graph:
plt.xlabel("x values")
plt.ylabel("Probability Density f(x)")
plt.title("Comparison of 1-Dimensional Gaussian Distribution Functions")
plt.legend()
plt.grid(True, linestyle="--", alpha=0.5)
plt.show()
These settings add axis labels, a title, a legend, and grid lines, with the alpha parameter controlling grid transparency to avoid visual clutter. Finally, calling plt.show() displays the graph (in a window, when the script runs with an interactive backend). To save the graph as an image file, call plt.savefig("gaussian_plot.png") before plt.show(), since the figure may be empty once the window has been closed.
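A minimal sketch of the file-saving route, written with matplotlib's object-oriented interface (fig, ax) instead of the plt.* calls used above. The non-interactive Agg backend, the temporary output directory, the figure size, and the dpi value are all assumptions made so that the example runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-10, 10, 1000)
y = np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)  # standard normal PDF (mu=0, sigma=1)

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, y, label="μ=0, σ=1")
ax.set_xlabel("x values")
ax.set_ylabel("Probability Density f(x)")
ax.legend()
ax.grid(True, linestyle="--", alpha=0.5)

# Write the file while the figure still exists; path and dpi are illustrative.
output_path = os.path.join(tempfile.gettempdir(), "gaussian_plot.png")
fig.savefig(output_path, dpi=150, bbox_inches="tight")
plt.close(fig)
print(f"saved to {output_path}")
```

Keeping a reference to the figure object makes the save independent of whichever figure pyplot currently treats as active.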
Alternative Approach: Using the scipy.stats Library
Beyond manual implementation, Python's scipy.stats module offers built-in normal distribution objects, which shorten the code and remove the need to write the formula by hand. The following example demonstrates how to use this library to generate the same graph:
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-10, 10, 1000)
distributions = [
    norm(loc=-1, scale=1),
    norm(loc=0, scale=2),
    norm(loc=2, scale=3)
]
for dist in distributions:
    plt.plot(x, dist.pdf(x), label=f"μ={dist.mean():.1f}, σ={dist.std():.1f}")
plt.legend()
plt.grid(True)
plt.show()
Here, the norm object encapsulates the statistical properties of the Gaussian distribution; calling it with loc and scale, which correspond to μ and σ respectively, returns a distribution with those parameters fixed. The pdf method computes the probability density, so the formula does not have to be coded by hand. This method is more suitable for scenarios requiring advanced statistical features (e.g., cumulative distribution, random sampling), but the custom function approach better facilitates understanding of the underlying mathematical principles.
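The advanced features mentioned above can be sketched briefly. The example uses the cdf and rvs methods of a distribution with μ = 2 and σ = 3, and checks that pdf agrees with the hand-written formula. The sample size and the fixed seed are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

dist = norm(loc=2, scale=3)  # distribution with mu=2, sigma=3 fixed

# Cumulative distribution: probability of a value within one sigma of the mean.
p_within_1sigma = dist.cdf(5) - dist.cdf(-1)

# Random sampling; the seed is fixed only to make the sketch reproducible.
samples = dist.rvs(size=10_000, random_state=0)

# The pdf method agrees with the manually coded formula.
x = np.linspace(-10, 10, 1000)
manual = np.exp(-((x - 2) / 3) ** 2 / 2.0) / (3 * np.sqrt(2.0 * np.pi))
matches = np.allclose(dist.pdf(x), manual)

print(f"P(-1 <= X <= 5) = {p_within_1sigma:.4f}")
print(f"sample mean = {samples.mean():.2f}, sample std = {samples.std():.2f}")
print(f"pdf matches manual formula: {matches}")
```

The one-sigma probability is about 0.68 for any normal distribution, and the sample mean and standard deviation should land close to 2 and 3.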
Summary of Core Concepts and Extension Suggestions
This article demonstrates the complete process of Gaussian distribution visualization through concrete examples, with key steps including: understanding the mathematical formula, defining the computation function, generating data points, iterating over multiple parameters, and optimizing graph output. For beginners, it is recommended to start with custom functions to solidify fundamentals before exploring advanced libraries like scipy. In extended applications, one can experiment with adjusting the x-range to observe curve variations or overlay more parameter combinations for comparative analysis. Additionally, fitting Gaussian distributions to real-world datasets (e.g., height, test scores) can further deepen understanding of statistical modeling.
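As one possible starting point for the fitting exercise, the sketch below estimates μ and σ from data with scipy.stats.norm.fit. The data here are synthetic stand-ins for a real dataset; the "true" values 170 and 8, the sample size, and the seed are made up for illustration.

```python
import numpy as np
from scipy.stats import norm

# Synthetic stand-in for real measurements (e.g. heights in cm).
rng = np.random.default_rng(42)
data = rng.normal(loc=170.0, scale=8.0, size=5000)

# Maximum-likelihood estimates of the mean and standard deviation.
mu_hat, sigma_hat = norm.fit(data)

print(f"estimated mu = {mu_hat:.2f}, estimated sigma = {sigma_hat:.2f}")
```

The estimates can then be passed to the gaussian function or to norm(loc=mu_hat, scale=sigma_hat).pdf and drawn over a histogram created with plt.hist(data, density=True) to judge the quality of the fit by eye.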