Secure Evaluation of Mathematical Expressions in Strings: A Python Implementation Based on Pyparsing

Dec 04, 2025 · Programming

Keywords: Python | Mathematical Expression Evaluation | Pyparsing | Secure Parsing | String Processing

Abstract: This article explores methods for securely evaluating mathematical expressions stored as strings in Python. Because int() cannot parse expressions and eval() is unsafe on untrusted input, it focuses on the NumericStringParser implementation based on the Pyparsing library. The article details the parser's grammar definition, operator mapping, and recursive evaluation mechanism, and demonstrates support for arithmetic expressions and built-in functions through examples. It also compares an alternative approach using the ast module and discusses security enhancements such as operation limits and result range controls. Finally, it summarizes core principles and practical recommendations for developing secure mathematical computation tools.

Background and Challenges

In Python programming, handling mathematical expressions in string form is a common requirement. For example, given the string "2^4", the expected result is the numerical value 16. Calling int("2^4") raises ValueError: invalid literal for int() with base 10: '2^4', because int() only converts strings that contain a numeric literal. The eval() function can execute a string as code and return the result, but it is unsafe on untrusted input: a malicious string such as "__import__('os').remove('important file')" executes arbitrary code, and an expression such as "9**9**9**9**9**9**9**9" can exhaust CPU and memory. A secure, controllable method of expression evaluation is therefore needed.
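The failure mode of int() is easy to reproduce. The short sketch below also shows a detail worth keeping in mind: in Python source, ^ is bitwise XOR rather than exponentiation, so the "2^4 means 16" convention used in this article has to be supplied by a dedicated parser. The variable names are illustrative only.

```python
# int() accepts only numeric literals, so an expression string is rejected.
try:
    int("2^4")
except ValueError as exc:
    message = str(exc)
print(message)  # invalid literal for int() with base 10: '2^4'

# In Python source, ^ is bitwise XOR and ** is exponentiation, so the
# caret does not mean "power" to the interpreter itself.
xor_result = 2 ^ 4     # 6
power_result = 2 ** 4  # 16
print(xor_result, power_result)
```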

Solution with Pyparsing Library

Pyparsing is a parsing library suited to building custom grammars and analyzers. Its bundled example fourFn.py can be wrapped in a NumericStringParser class that parses and evaluates mathematical expressions. This approach improves security because only input that matches strictly defined grammar rules is accepted, and only operators and functions present in fixed mappings are executed, so arbitrary code never runs.

Grammar Definition and Parsing Structure

The core of NumericStringParser is a grammar for mathematical expressions built from Pyparsing components. The grammar is context-free and covers numbers, constants, function identifiers, operators, and parenthesized subexpressions.

For instance, numbers are defined via Combine and Word to support scientific notation; function identifiers consist of letters, digits, and specific characters. This design ensures only predefined mathematical elements are parsed, excluding unsafe code.
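To make the token rules concrete, here is a minimal sketch that swaps in the standard re module for Pyparsing; it is not the NumericStringParser grammar itself. The exact patterns are assumptions: numbers allow an optional fraction and exponent, identifiers allow letters, digits, "_" and "$", and signs are left to be handled as operators. The names NUMBER, IDENT, SYMBOL, and tokenize are hypothetical.

```python
import re

# Assumed token shapes, written as regular expressions instead of Pyparsing.
NUMBER = r"\d+(?:\.\d*)?(?:[eE][+-]?\d+)?"   # 42, 3.14, 1.5e3, 2E-4
IDENT = r"[A-Za-z][A-Za-z0-9_$]*"            # sin, exp, PI
SYMBOL = r"[-+*/^()]"                        # operators and parentheses
TOKEN = re.compile(
    rf"\s*(?:(?P<number>{NUMBER})|(?P<ident>{IDENT})|(?P<symbol>{SYMBOL}))"
)

def tokenize(text):
    """Split text into (kind, value) pairs, rejecting anything unrecognized."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize input at position {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens

print(tokenize("1.5e3 + sin(2)"))
```

Because every character must belong to one of the three token classes, a string such as "__import__('os')" fails at the first underscore instead of reaching any evaluation step.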

Operator and Function Mapping

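The parser maintains two mapping dictionaries internally: one maps operator symbols to the functions that implement them, and the other maps permitted function names to their implementations.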

This mapping mechanism restricts executable operations, preventing users from injecting custom functions or dangerous calls.
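A minimal sketch of such a pair of mappings follows. The dictionary names, the particular functions whitelisted, and the apply_function helper are assumptions chosen for illustration; the actual class may expose a different set.

```python
import math
import operator

# Operator symbols mapped to the functions that implement them.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}

# Function names a user may call; anything absent here is rejected.
FUNCTIONS = {
    "sin": math.sin,
    "cos": math.cos,
    "tan": math.tan,
    "exp": math.exp,
    "abs": abs,
    "round": round,
}

def apply_function(name, argument):
    """Look up a whitelisted function by name and call it."""
    try:
        func = FUNCTIONS[name]
    except KeyError:
        raise ValueError(f"unknown function: {name!r}") from None
    return func(argument)

print(OPERATORS["^"](2.0, 4.0))     # 16.0
print(apply_function("abs", -3.0))  # 3.0
```

The lookup is the whole security boundary here: a name such as "system" is simply not a key, so there is nothing to call.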

Recursive Evaluation Algorithm

The evaluation process is implemented via the evaluateStack method, using a stack-based recursive approach:

  1. Pop elements from the expression stack.
  2. If an operator, recursively evaluate operands and apply the mapped function.
  3. If a constant or function, return the corresponding value or call the function.
  4. The final result is returned through the eval method, supporting floating-point output.

For example, the expression "2^4" is parsed into the stack [2, 4, '^'], yielding 16.0 after evaluation. This process isolates expression logic from the Python execution environment, enhancing security.
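The steps above can be sketched as a small recursive function over a postfix stack. This is an illustration under stated assumptions, not the class's actual method: tokens are assumed to be strings, functions are assumed to take one argument, and the names evaluate_stack, OPERATORS, FUNCTIONS, and CONSTANTS are hypothetical.

```python
import math
import operator

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
             "/": operator.truediv, "^": operator.pow}
FUNCTIONS = {"sin": math.sin, "cos": math.cos, "exp": math.exp, "abs": abs}
CONSTANTS = {"PI": math.pi, "E": math.e}

def evaluate_stack(stack):
    """Consume a postfix token stack from the top and return its value."""
    token = stack.pop()
    if token in OPERATORS:
        # The right operand was pushed last, so it is popped first.
        right = evaluate_stack(stack)
        left = evaluate_stack(stack)
        return OPERATORS[token](left, right)
    if token in CONSTANTS:
        return CONSTANTS[token]
    if token in FUNCTIONS:
        return FUNCTIONS[token](evaluate_stack(stack))
    # Anything else must be a numeric literal; float() rejects the rest.
    return float(token)

print(evaluate_stack(["2", "4", "^"]))  # 16.0
print(evaluate_stack(["7", "2", "-"]))  # 5.0
```

The function mutates the list it is given, so callers that need the stack afterwards should pass a copy. A token that is neither whitelisted nor numeric ends in float(), which raises ValueError.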

Usage Examples and Performance Analysis

After instantiating NumericStringParser, string expressions can be evaluated directly:

# Assumes the NumericStringParser class described above is defined or imported
nsp = NumericStringParser()
result = nsp.eval('2^4')
print(result)  # Output: 16.0
result = nsp.eval('exp(2^4)')
print(result)  # Output: 8886110.520507872

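The grammar accepts nested expressions such as "1 + 2*3^(4^5) / (6 + -7)", handling operator precedence, parentheses, and unary minus. Note that 3^(4^5) is 3^1024, which exceeds the range of a Python float; if operands are converted to float, as the 16.0 output above suggests, evaluating this particular expression raises OverflowError instead of returning a value. Parsing with Pyparsing is slower than calling eval directly, but constructing the grammar once and reusing the parser keeps the cost acceptable for most applications. Malicious inputs such as "__import__('os').remove('file')" do not match the grammar: they are treated as invalid identifiers or raise exceptions, so no code is executed.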

Alternative Approaches and Security Enhancements

Beyond Pyparsing, the ast module offers another secure evaluation method. The expression is parsed into an Abstract Syntax Tree (AST), and a custom evaluation function walks the tree, permitting only whitelisted operation types:

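import ast
import operator as op
operators = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
             ast.Div: op.truediv, ast.Pow: op.pow, ast.USub: op.neg}
def eval_node(node):
    # Recursively evaluate AST nodes, allowing only predefined operations
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in operators:
        return operators[type(node.op)](eval_node(node.left), eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in operators:
        return operators[type(node.op)](eval_node(node.operand))
    raise TypeError(f"unsupported expression: {ast.dump(node)}")
def eval_expr(expr):
    node = ast.parse(expr, mode='eval').body
    return eval_node(node)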

This method also avoids the code-execution risk of eval, but it must handle each permitted node type explicitly. Additional measures can further enhance security, such as limits on operations and controls on the result range.

These measures are applicable in both Pyparsing and AST approaches, further improving system robustness.
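As a rough sketch of what such measures might look like, the code below caps input length, refuses large power operands before computing anything, and bounds the magnitude of results. All thresholds and names (MAX_EXPRESSION_LENGTH, MAX_POWER_OPERAND, MAX_RESULT, check_length, guarded_pow, limit_result) are hypothetical and would need tuning for a real application.

```python
import functools
import operator

MAX_EXPRESSION_LENGTH = 200  # hypothetical cap on input size
MAX_POWER_OPERAND = 100      # hypothetical cap on base and exponent
MAX_RESULT = 1e12            # hypothetical bound on any result

def check_length(expr):
    """Reject oversized input before any parsing work is done."""
    if len(expr) > MAX_EXPRESSION_LENGTH:
        raise ValueError("expression too long")
    return expr

def guarded_pow(base, exponent):
    """Exponentiation that refuses large operands before computing."""
    if abs(base) > MAX_POWER_OPERAND or abs(exponent) > MAX_POWER_OPERAND:
        raise ValueError(f"operand too large for power: {base}, {exponent}")
    return operator.pow(base, exponent)

def limit_result(func):
    """Decorator that rejects results whose magnitude exceeds MAX_RESULT."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if abs(result) > MAX_RESULT:
            raise ValueError("result out of allowed range")
        return result
    return wrapper

safe_pow = limit_result(guarded_pow)
print(safe_pow(2, 10))  # 1024
```

A function like safe_pow could stand in for operator.pow in the "^" entry of a Pyparsing operator mapping or in the ast.Pow entry of an AST operator table, so that an input in the style of "9**9**9" is refused at the first oversized operand.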

Conclusion and Best Practices

Securely evaluating mathematical expressions in strings requires balancing functionality and risk. The NumericStringParser based on Pyparsing provides a structured solution, effectively defending against code injection attacks through strict grammar definitions and restricted operation mappings. Key practices include:

  1. Avoid using eval: Prefer parser-based solutions unless in fully controlled environments.
  2. Define clear grammar: Limit expression elements to mathematical constructs, excluding potentially dangerous structures.
  3. Implement runtime checks: Add parameter validation and result monitoring to prevent abuse.
  4. Consider performance and scalability: For high-performance needs, optimize parsing logic or cache results.

By combining Pyparsing's flexibility with custom security strategies, developers can build reliable and secure mathematical expression evaluation tools, suitable for applications in educational software, calculator apps, or data analysis systems.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.