Keywords: Python | Normal Distribution | Data Visualization | matplotlib | scipy.stats
Abstract: This article provides a detailed tutorial on plotting normal distribution curves using Python's matplotlib and scipy.stats libraries. Starting from the fundamental concepts of normal distribution, it systematically explains how to set mean and variance parameters, generate appropriate x-axis ranges, compute probability density function values, and perform visualization with matplotlib. Through complete code examples and in-depth technical analysis, readers will master the core methods and best practices for plotting normal distribution curves.
Fundamental Concepts of Normal Distribution
The normal distribution, also known as Gaussian distribution, is one of the most important continuous probability distributions in statistics. Its probability density function exhibits a bell-shaped curve, completely determined by two parameters: the mean μ defines the center of the distribution, and the variance σ² determines the dispersion. In Python, we can utilize scientific computing libraries to efficiently plot and analyze normal distribution curves.
Importing and Configuring Core Libraries
Plotting normal distribution curves requires support from several key Python libraries. First, import the necessary modules:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
import math
matplotlib.pyplot is used for data visualization, numpy provides numerical computation support, scipy.stats contains rich statistical functions, and the math module handles mathematical operations. This combination offers a complete toolchain for plotting normal distributions.
Parameter Setting and Data Preparation
Setting the parameters of the normal distribution is the first step in plotting the curve. The following code demonstrates how to define the mean and variance:
mu = 0
variance = 1
sigma = math.sqrt(variance)
Here, μ=0 indicates the distribution center is at the origin, and variance=1 corresponds to standard deviation σ=1. In practical applications, these parameter values can be adjusted according to specific requirements.
Generating Appropriate X-Axis Range
To fully display the bell-shaped characteristics of the normal distribution, it's necessary to generate x-axis data points covering the main probability region:
x = np.linspace(mu - 3*sigma, mu + 3*sigma, 100)
Using numpy.linspace, 100 equally spaced points are generated within the range [μ-3σ, μ+3σ]. This range covers approximately 99.7% of the probability mass of the normal distribution, ensuring the main features of the curve are adequately displayed.
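The 99.7% figure can be checked numerically with the cumulative distribution function. The following is a minimal sketch, assuming the same example parameters (μ=0, σ=1); the helper name coverage is hypothetical and introduced only for illustration.

```python
import scipy.stats as stats

mu, sigma = 0.0, 1.0  # example parameters; adjust as needed


def coverage(k, mu=mu, sigma=sigma):
    """Probability mass inside [mu - k*sigma, mu + k*sigma] (illustrative helper)."""
    dist = stats.norm(loc=mu, scale=sigma)
    # Mass in an interval is the difference of the CDF at its endpoints.
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3, 4):
    print(f"+/-{k} sigma covers {coverage(k):.2%} of the probability mass")
```

Widening the range beyond ±3σ adds very little mass, so a wider range mostly adds flat tails to the plot.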
Computing Probability Density Function Values
Use the scipy.stats.norm.pdf function to compute the probability density at each x point:
y = stats.norm.pdf(x, mu, sigma)
This function evaluates the normal distribution probability density formula: f(x) = (1/(σ√(2π))) * exp(-(x-μ)²/(2σ²)). Note that the third argument is the standard deviation σ, not the variance σ², which is why the variance was converted with math.sqrt earlier. Because the computation is vectorized, density values for all points are obtained in a single call.
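To make the formula concrete, it can be written out by hand and compared with the library result. This is a sketch for illustration only: normal_pdf is a hypothetical helper, and the parameter values are arbitrary assumptions rather than values from the tutorial.

```python
import numpy as np
import scipy.stats as stats


def normal_pdf(x, mu, sigma):
    """Evaluate the normal density directly from its closed-form formula."""
    coeff = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    return coeff * np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


mu, sigma = 2.0, 1.5  # arbitrary illustrative values
x = np.linspace(mu - 3 * sigma, mu + 3 * sigma, 100)

manual = normal_pdf(x, mu, sigma)
library = stats.norm.pdf(x, loc=mu, scale=sigma)

# Both arrays should agree to within floating-point tolerance.
print(np.allclose(manual, library))
```

In practice the library call is preferable; the hand-written version is only useful for confirming which quantity each argument represents.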
Visualization and Graph Display
Finally, use matplotlib to plot and display the curve:
plt.plot(x, y)
plt.xlabel('x values')
plt.ylabel('Probability Density')
plt.title('Normal Distribution Curve')
plt.grid(True)
plt.show()
This code creates a complete graph including axis labels, title, and grid lines, making the distribution characteristics more clearly visible.
Parameter Adjustment and Customization
By modifying the mean and variance parameters, normal distribution curves with different characteristics can be plotted. Increasing the variance makes the curve wider and lowers its peak, because the total area under the curve always equals 1, while changing the mean shifts the entire curve along the x-axis without altering its shape. This flexibility makes the method suitable for a wide range of statistical analysis and data visualization scenarios.
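One way to see these effects is to draw several curves on the same axes. The sketch below assumes three illustrative (mean, variance) pairs; plot_normal_curves is a hypothetical helper, not a matplotlib or SciPy function.

```python
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats


def plot_normal_curves(params, num_points=200):
    """Draw one density curve per (mean, variance) pair on shared axes."""
    # Convert each variance to a standard deviation, as norm.pdf expects.
    curves = [(mu, var, np.sqrt(var)) for mu, var in params]

    # Shared x range wide enough to show mu +/- 3*sigma for every curve.
    x_min = min(mu - 3 * sigma for mu, _, sigma in curves)
    x_max = max(mu + 3 * sigma for mu, _, sigma in curves)
    x = np.linspace(x_min, x_max, num_points)

    fig, ax = plt.subplots()
    for mu, var, sigma in curves:
        ax.plot(x, stats.norm.pdf(x, mu, sigma), label=f"mean={mu}, variance={var}")
    ax.set_xlabel("x values")
    ax.set_ylabel("Probability Density")
    ax.set_title("Normal Distribution Curves")
    ax.grid(True)
    ax.legend()
    return fig, ax


# Illustrative parameter pairs: a baseline, a larger variance, a shifted mean.
fig, ax = plot_normal_curves([(0, 1), (0, 4), (2, 1)])
```

Calling plt.show() afterwards displays the figure. The curve with variance 4 should appear lower and wider than the baseline, and the curve with mean 2 should be the baseline shifted to the right.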
Technical Summary
Plotting a normal distribution curve comes down to three steps: choosing an x-axis range that covers the main probability region, computing the y-values with the probability density function, and labeling the graph clearly. The scipy.stats.norm.pdf function encapsulates the underlying formula, which greatly simplifies the implementation, and matplotlib offers rich customization options for refining the appearance of the graph.