Keywords: Regular Expressions | Negative Number Matching | Data Validation
Abstract: This paper examines how to match negative numbers with regular expressions. After analyzing the limitation of the original regex ^[0-9]\d*(\.\d+)?$, it shows how prefixing the optional token -? (a literal minus sign followed by the ? quantifier) adds support for negative numbers. The article includes code examples and test cases that check the modified regex ^-?[0-9]\d*(\.\d+)?$, and discusses how it rejects common malformed inputs.
Fundamental Principles of Number Matching in Regular Expressions
In text processing and data validation, regular expressions are a standard tool for identifying specific patterns. Number matching is a common requirement, and the pattern must be strict enough to reject malformed input while still accepting every valid form. The original expression ^[0-9]\d*(\.\d+)?$ matches unsigned integers and decimals (including zero and values with leading zeros). Here ^ anchors the match at the start of the string, [0-9] matches the first digit, \d* matches zero or more subsequent digits, the optional group (\.\d+)? matches a decimal point followed by at least one digit, and $ anchors the match at the end of the string. Because [0-9] and \d match the same characters in JavaScript, [0-9]\d* is equivalent to \d+.
Analysis of Negative Number Matching Requirements
In practical applications, numeric fields often need to support negative values, such as expense amounts in financial data or temperature readings. The original expression does not recognize numbers that start with a minus sign, so it wrongly rejects valid inputs like -10 and -125.5. This limitation arises because the expression describes only the digits and the decimal point and makes no provision for a leading sign character.
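The rejection is easy to reproduce. The following is a minimal sketch that uses the original expression exactly as quoted in the text; the variable name original is only illustrative.

```javascript
// Original pattern from the text: no provision for a leading minus sign.
const original = /^[0-9]\d*(\.\d+)?$/;

// Unsigned inputs are accepted.
console.log(original.test('10'));     // true
console.log(original.test('125.5'));  // true

// The same values with a leading minus sign are rejected,
// because ^ must be followed immediately by a digit.
console.log(original.test('-10'));    // false
console.log(original.test('-125.5')); // false
```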
Solution: Adding Optional Negative Sign Matching
Prefixing the pattern with the optional token -? gives the modified expression ^-?[0-9]\d*(\.\d+)?$, which extends the matching scope to negative numbers. Here, ? is the quantifier and applies to the literal - before it, so the minus sign may appear zero or one time; for unsigned input the token simply matches nothing. The modification therefore accepts everything the original expression accepted while adding support for negatives.
Code Implementation and Validation
The following JavaScript code demonstrates the practical application of the modified regular expression:
// A regex literal is sufficient; wrapping it in new RegExp() is redundant.
const regex = /^-?[0-9]\d*(\.\d+)?$/;
// Valid matching tests
console.log(regex.test('10')); // Output: true
console.log(regex.test('10.0')); // Output: true
console.log(regex.test('-10')); // Output: true
console.log(regex.test('-10.0')); // Output: true
// Exclusion of invalid matches
console.log(regex.test('--10')); // Output: false
console.log(regex.test('10-')); // Output: false
console.log(regex.test('1-0')); // Output: false
console.log(regex.test('10.-')); // Output: false
console.log(regex.test('10..0')); // Output: false
console.log(regex.test('10.0.1')); // Output: false
The test results show that the expression accurately distinguishes legal number formats from common error patterns, such as repeated signs, misplaced signs, or multiple decimal points.
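A few boundary inputs are worth checking as well, since whether they should be accepted depends on the application. The sketch below reuses the modified expression from the text under the illustrative name modified; the choice of inputs is an assumption about which cases matter, not an exhaustive list.

```javascript
// Modified pattern from the text, with the optional leading minus sign.
const modified = /^-?[0-9]\d*(\.\d+)?$/;

// Accepted, although some applications may want to reject them.
console.log(modified.test('007')); // true  (leading zeros are allowed)
console.log(modified.test('-0'));  // true  (negative zero is allowed)

// Rejected, although some applications may want to accept them.
console.log(modified.test('10.')); // false (a decimal point needs a digit after it)
console.log(modified.test('.5'));  // false (at least one digit must precede the point)
console.log(modified.test('+10')); // false (an explicit plus sign is not supported)
console.log(modified.test(' 10')); // false (surrounding whitespace is not trimmed)
console.log(modified.test(''));    // false (the empty string is not a number)
```

If inputs may carry surrounding whitespace, calling trim() on the string before testing is usually simpler than widening the pattern.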
Extension to Related Application Scenarios
For related needs, integers alone can be matched with ^-?\d+$, while a more permissive pattern that allows an empty integer part (such as .5) can be written as ^-?\d*\.?\d+$. In data cleaning scenarios, such as extracting amounts from text, the cleaning step must preserve minus signs, commas, and decimal points; filtering too aggressively discards information that cannot be recovered afterwards.
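These variants can be sketched as follows. The two patterns are taken from the text; the names integerOnly, permissive, and cleanAmount are hypothetical, and the cleaning step assumes that commas are thousands separators and that no stray hyphens or periods appear elsewhere in the input.

```javascript
// Patterns from the text.
const integerOnly = /^-?\d+$/;
const permissive = /^-?\d*\.?\d+$/;

console.log(integerOnly.test('-42')); // true
console.log(integerOnly.test('4.2')); // false (no decimal part allowed)

console.log(permissive.test('.5'));   // true  (empty integer part)
console.log(permissive.test('-.5'));  // true
console.log(permissive.test('10'));   // true
console.log(permissive.test('10.'));  // false (a digit must follow the point)

// Hypothetical cleaning helper: drop everything except digits,
// minus signs, commas, and decimal points.
function cleanAmount(raw) {
  return raw.replace(/[^0-9.,-]/g, '');
}

const cleaned = cleanAmount('USD -1,234.50');
console.log(cleaned);                           // -1,234.50
console.log(Number(cleaned.replace(/,/g, ''))); // -1234.5
```

Stripping the minus sign along with the currency label would silently turn an expense into income, which is the kind of information loss the text warns about.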
Summary and Best Practices
Prefixing the pattern with an optional minus sign (-?) is enough to support negative numbers, and it keeps the expression short while covering the required cases. In actual development, boundary tests should be written against the specific business rules, for example whether leading zeros, a bare decimal point, or an explicit plus sign are acceptable, to confirm that the expression behaves as intended. For more complex number formats, the expression can be extended further to handle scientific notation or thousands separators.
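As an illustration of such extensions, the sketch below shows one possible pattern for scientific notation and one for comma-grouped thousands. Neither pattern comes from the original text; both are assumptions about what the formats look like (an e or E exponent with an optional sign, and groups of exactly three digits separated by commas).

```javascript
// Assumed format: optional minus, digits, optional fraction, optional exponent.
const scientific = /^-?\d+(\.\d+)?([eE][+-]?\d+)?$/;

console.log(scientific.test('1.5e10')); // true
console.log(scientific.test('-2E-3'));  // true
console.log(scientific.test('10'));     // true  (the exponent is optional)
console.log(scientific.test('1e'));     // false (an exponent needs digits)
console.log(scientific.test('e5'));     // false (a mantissa is required)

// Assumed format: one to three leading digits, then comma-separated groups of three.
const thousands = /^-?\d{1,3}(,\d{3})*(\.\d+)?$/;

console.log(thousands.test('1,234,567.89')); // true
console.log(thousands.test('-1,234'));       // true
console.log(thousands.test('12,34'));        // false (groups must have three digits)
console.log(thousands.test('1234'));         // false (ungrouped input is not accepted)
```

Note that the thousands pattern as written rejects ungrouped input such as 1234; if both forms must be accepted, the two patterns can be tested in turn rather than merged into a single harder-to-read expression.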