Comparative Analysis and Optimization of Prime Number Generation Algorithms

Keywords: Prime Generation | Algorithm Optimization | Python Performance | Sieve Methods | NumPy Acceleration

Abstract: This paper provides an in-depth exploration of various efficient algorithms for generating prime numbers below N in Python, including the Sieve of Eratosthenes, Sieve of Atkin, wheel sieve, and their optimized variants. Through detailed code analysis and performance comparisons, it demonstrates the trade-offs in time and space complexity among different approaches, offering practical guidance for algorithm selection in real-world applications. Special attention is given to pure Python implementations versus NumPy-accelerated solutions.

Introduction

Prime number generation is a classical problem in computer science and mathematics, with applications in cryptography, number theory, and algorithm design. Based on a popular Stack Overflow discussion, this paper systematically compares multiple efficient prime generation algorithms, focusing on their core concepts, implementation details, and performance characteristics.

Initial Algorithm and Its Limitations

The user's initial algorithm utilizes set operations to implement a sieve:

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

While functionally correct, this approach suffers from two main issues: first, the pop() method of sets does not guarantee returning the smallest element, potentially leading to suboptimal sieving order; second, set operations have high time complexity, particularly difference_update with large datasets.

Detailed Analysis of Optimized Algorithms

Standard Sieve of Eratosthenes

The fundamental optimization replaces sets with boolean arrays:

def sieve_of_eratosthenes(n):
    if n <= 2:
        return []
    sieve = [True] * n
    sieve[0] = sieve[1] = False
    for i in range(2, int(n**0.5)+1):
        if sieve[i]:
            sieve[i*i:n:i] = [False] * len(sieve[i*i:n:i])
    return [i for i in range(2, n) if sieve[i]]

This version directly marks multiples, avoiding set operations, with time complexity O(n log log n) and space complexity O(n).

Odd-Only Optimized Sieve

Further optimization processes only odd numbers:

def rwh_primes1(n):
    """Returns a list of primes < n"""
    if n < 2:
        return []
    if n == 2:
        return [2]
    sieve = [True] * (n//2)
    for i in range(3, int(n**0.5)+1, 2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in range(1, n//2) if sieve[i]]

This variant halves memory usage while maintaining the same time complexity.

Wheel Sieve Optimization

The wheel sieve based on modulo 30 further reduces redundant computations:

def sieve_wheel_30(N):
    """Returns primes <= N using 2×3×5=30 wheel criterion"""
    if N < 2:
        return []
    if N <= 30:
        small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        return [p for p in small_primes if p <= N]
    
    # Initialize wheel offsets
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    primes = [2, 3, 5]
    
    # Create sieve arrays for each residue class
    sieve_arrays = {}
    for offset in offsets:
        sieve_arrays[offset] = [True] * (N//30 + 1)
    
    # Sieving logic (simplified)
    pos = 0
    while True:
        current_pos = 30 * pos
        if current_pos + 7 > N:
            break
        
        # Check each offset class and mark multiples
        for offset in offsets:
            num = current_pos + offset
            if num > N:
                continue
            if sieve_arrays[offset][pos]:
                primes.append(num)
                # Mark all multiples
                multiple = num * num
                while multiple <= N:
                    multiple_offset = multiple % 30
                    if multiple_offset in offsets:
                        sieve_arrays[multiple_offset][multiple//30] = False
                    multiple += num
        pos += 1
    
    return primes

Performance Comparison Analysis

Testing with n=1,000,000 reveals the following performance characteristics:

Pure Python Implementations (without psyco)

rwh_primes2: 68.1 ms
rwh_primes1: 93.7 ms
sieve_wheel_30: 97.4 ms
Standard Sieve of Eratosthenes: 178.0 ms

NumPy-Accelerated Implementations

NumPy leverages vectorized operations for significant performance gains:

import numpy as np

def primesfrom2to(n):
    """Input n>=6, Returns an array of primes, 2 <= p < n"""
    sieve = np.ones(n//3 + (n%6==2), dtype=bool)
    sieve[0] = False
    for i in range(1, int(n**0.5)//3+1):
        if sieve[i]:
            k = 3*i+1 | 1
            sieve[k*k//3::2*k] = False
            sieve[k*(k-2*(i&1)+4)//3::2*k] = False
    return np.r_[2, 3, ((3*np.nonzero(sieve)[0][1:]+1) | 1)]

Benchmarks show primesfrom2to requires only 15.9 ms, over 4 times faster than the best pure Python implementation.

Algorithm Selection Guidelines

Choose algorithms based on specific requirements:

Educational and Simple Applications: Standard Sieve of Eratosthenes for clarity and understandability
Performance-Critical Scenarios: rwh_primes1 or rwh_primes2 for balanced performance and readability
Maximum Performance: NumPy's primesfrom2to, though it requires additional dependencies
Memory-Constrained Environments: Odd-only variants or wheel sieve to reduce memory footprint

Advanced Optimization Techniques

Reference articles mention advanced algorithms like the Deléglise-Dusart-Roblot method achieving O(x^{2/3}/\log^2 x) time complexity by combining modular arithmetic with the Prime Number Theorem and Chinese Remainder Theorem to compute prime sums modulo values. While theoretically superior, these algorithms are complex to implement and more suitable for ultra-large-scale computations.

Practical Implementation Considerations

When implementing prime generation algorithms, consider:

Boundary condition handling, especially for small input values
Memory management to avoid creating excessively large arrays
Numerical overflow prevention, particularly in multiplication operations
Caching mechanisms for optimizing repeated computations

Conclusion

Selecting prime generation algorithms requires balancing performance, memory usage, and code complexity. For most applications, optimized variants of the Sieve of Eratosthenes provide the best compromise. When performance is paramount and the environment permits, NumPy-based implementations are preferred. Understanding the core concepts and optimization techniques of different algorithms enables informed decision-making in practical projects.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.