Keywords: Java Regular Expressions | Matcher.find() | Stream API
Abstract: This paper examines several approaches to converting regular expression matches into arrays in Java. It covers the traditional iterative method using Matcher.find(), the Stream-based Matcher.results() method introduced in Java 9, and a custom iterator implementation that evaluates matches lazily. Complete code examples and a qualitative comparison of the approaches offer practical guidance for developers.
Fundamental Principles of Regex Matching
In Java programming, regular expressions serve as powerful tools for string pattern matching. The Pattern and Matcher classes form the core components of Java's regex API, where Pattern compiles regular expressions and Matcher executes matching operations.
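As a minimal sketch of this division of labor, a Pattern can be compiled once and then used to create a Matcher for each input. The class name PatternBasics, the helper firstNumber, and the sample inputs are illustrative assumptions, not part of the original text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternBasics {
    // Compile once and reuse; a compiled Pattern is immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns the first run of digits in the input,
    // or null if the input contains none.
    static String firstNumber(CharSequence input) {
        // A Matcher holds the matching state for one particular input.
        Matcher matcher = DIGITS.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("order 42, item 7")); // prints 42
        System.out.println(firstNumber("no digits here"));   // prints null
    }
}
```

Keeping the Pattern in a constant avoids recompiling the expression on every call, which matters mainly when the same expression is applied to many inputs.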
Traditional Iterative Matching Approach
The iterative method based on Matcher.find() represents the standard solution for Java 8 and earlier versions. This approach iterates through all matches by repeatedly calling the find() method and collecting results into a collection.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatcher {
    // Returns every non-overlapping match of regex in input, in order of appearance.
    public static String[] getAllMatches(String regex, String input) {
        List<String> allMatches = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // Each find() call resumes scanning where the previous match ended.
        while (matcher.find()) {
            allMatches.add(matcher.group());
        }
        return allMatches.toArray(new String[0]);
    }
}

In the above code, each call to matcher.find() locates the next match, and matcher.group() returns the complete matched string. Because the method relies only on java.util.regex and generic collections, it compiles on Java 5 and every later version.
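Where only part of each match is needed, the same loop can read a capturing group instead of the whole match. The sketch below assumes a hypothetical helper getAllGroups and a key=value sample input, neither of which appears in the original; Matcher.group(int) is the standard accessor for numbered groups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupMatcher {
    // Hypothetical helper: collects the text captured by the given group
    // number for every match. Group 0 denotes the whole match.
    static String[] getAllGroups(String regex, String input, int group) {
        List<String> captured = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // group(n) yields null when group n took no part in this match.
            captured.add(matcher.group(group));
        }
        return captured.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] values = getAllGroups("(\\w+)=(\\d+)", "a=1, b=22, c=333", 2);
        System.out.println(String.join(",", values)); // prints 1,22,333
    }
}
```

Passing 0 as the group number reproduces the behavior of matcher.group() with no arguments.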
Java 9 Stream API Optimization
Java 9 introduced the Matcher.results() method, which returns a Stream<MatchResult>, so matches can be mapped, filtered, and collected in a single pipeline without an explicit loop.
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class AdvancedRegexMatcher {
    // Collects all matches into an array (requires Java 9 or later).
    public static String[] getMatchesWithStream(String regex, String input) {
        return Pattern.compile(regex)
                .matcher(input)
                .results()
                .map(MatchResult::group)
                .toArray(String[]::new);
    }

    // Same pipeline, collected into a List instead of an array.
    public static List<String> getMatchesAsList(String regex, String input) {
        return Pattern.compile(regex)
                .matcher(input)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }
}

The Stream API approach yields more concise code and supports chained operations, which simplifies any further processing of the matches.
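To illustrate what chained operations can look like, the following sketch post-processes the matches inside the same pipeline. The class StreamPostProcessing, the method distinctWords, and the \w+ expression are assumptions made for this example; it requires Java 9 or later because of Matcher.results().

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class StreamPostProcessing {
    // Hypothetical helper: extracts every word, lowercases it,
    // removes duplicates, and sorts the result alphabetically.
    static List<String> distinctWords(String input) {
        return Pattern.compile("\\w+")
                .matcher(input)
                .results()                 // Stream<MatchResult>, Java 9+
                .map(MatchResult::group)   // whole match as a String
                .map(String::toLowerCase)
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctWords("The cat and the Cat")); // prints [and, cat, the]
    }
}
```

The equivalent logic with the iterative approach would need an explicit loop plus a separate set or sort step, which is where the stream form tends to read more clearly.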
Advanced Custom Iterator Implementation
For scenarios requiring fine-grained control over the matching process, a custom Iterable can be implemented that evaluates matches lazily, finding each one only when the caller asks for it.
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexIterable {
    public static Iterable<MatchResult> allMatches(final Pattern pattern, final CharSequence input) {
        return new Iterable<MatchResult>() {
            @Override
            public Iterator<MatchResult> iterator() {
                return new Iterator<MatchResult>() {
                    // Each iterator owns its Matcher, so iterations are independent.
                    private final Matcher matcher = pattern.matcher(input);
                    // Match found by hasNext() but not yet handed out by next().
                    private MatchResult pendingMatch;

                    @Override
                    public boolean hasNext() {
                        // Search lazily: only look for a match when none is pending.
                        if (pendingMatch == null && matcher.find()) {
                            // Snapshot the state so later find() calls cannot alter it.
                            pendingMatch = matcher.toMatchResult();
                        }
                        return pendingMatch != null;
                    }

                    @Override
                    public MatchResult next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        MatchResult current = pendingMatch;
                        pendingMatch = null;
                        return current;
                    }

                    @Override
                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}

The following usage example iterates through the match results and reads the details of each match:
for (MatchResult match : RegexIterable.allMatches(Pattern.compile("[abc]"), "abracadabra")) {
    System.out.println(match.group() + " at position " + match.start());
}

Performance Analysis and Best Practices
The traditional iterative approach allocates little beyond the list that holds the results, so its memory use grows with the number of matches rather than with the machinery around them, which makes it suitable for processing large texts. The Stream API solution excels in conciseness and readability but requires Java 9 or later. Custom iterators provide the most flexibility: because they find matches lazily, they are particularly suited to scenarios that terminate the search early.
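Early termination does not strictly require a custom iterator. As a minimal sketch, assuming a hypothetical helper firstMatches with a maxMatches parameter (neither appears in the original), a plain find() loop can stop as soon as enough matches have been collected:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LimitedMatcher {
    // Hypothetical helper: returns at most maxMatches matches and stops
    // scanning the input once that limit has been reached.
    static List<String> firstMatches(Pattern pattern, CharSequence input, int maxMatches) {
        List<String> found = new ArrayList<String>();
        Matcher matcher = pattern.matcher(input);
        // The size check comes first so find() is never called past the limit.
        while (found.size() < maxMatches && matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("[abc]");
        System.out.println(firstMatches(pattern, "abracadabra", 3)); // prints [a, b, a]
    }
}
```

The custom iterator remains the better fit when the stopping condition is decided by the caller during iteration rather than fixed in advance.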
In practice, the choice depends on the requirements: use the Stream API for simple matching on Java 9 or later, traditional iteration for performance-sensitive code or older Java versions, and a custom iterator when the caller needs control over how far matching proceeds.