Keywords: JVM Garbage Collection | UseParallelGC | UseParNewGC | Parallel Collection Algorithms | Young Generation Collection
Abstract: This paper compares two parallel young generation garbage collection algorithms in the Java Virtual Machine: -XX:+UseParallelGC and -XX:+UseParNewGC. By examining the implementation mechanisms of the original copying collector, the parallel copying collector, and the parallel scavenge collector, the analysis focuses on their performance in multi-CPU environments, their compatibility with old generation collectors, and their adaptive tuning capabilities. The paper explains how UseParNewGC cooperates with the Concurrent Mark-Sweep collector, while UseParallelGC is tuned for throughput on large heaps and supports JVM ergonomics.
Introduction
In the Java Virtual Machine (JVM), the choice of young generation collection algorithm significantly affects application performance. Drawing on published technical documentation and practical tuning experience, this paper analyzes two parallel young generation collectors: the parallel copying collector enabled by -XX:+UseParNewGC and the parallel scavenge collector enabled by -XX:+UseParallelGC. Both perform multi-threaded, stop-the-world collection, but they differ substantially in design goals, implementation, and applicable scenarios.
Fundamentals of Garbage Collection Algorithms
JVM garbage collectors typically employ a generational strategy, dividing heap memory into a young generation and an old generation. The young generation holds newly created objects, most of which die quickly, so it is collected frequently. Traditional young generation collection uses a copying algorithm: live objects in Eden and in the occupied (From) Survivor space are copied to the empty (To) Survivor space, or promoted to the old generation once they have survived enough collections, after which Eden and the From space are empty and the two Survivor spaces swap roles.
The original copying collector (the serial collector) is single-threaded. When a collection starts, all application threads pause (Stop-the-World), and the collection proceeds on a single thread. On multi-CPU systems this design leaves the remaining processors idle during the pause, so collection takes longer than the hardware would allow.
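The copying scheme described above can be sketched with a toy model. This is only an illustration of the idea, not HotSpot's implementation: the class ToyScavenge, the record Obj, and the tenuring threshold of 2 are all hypothetical names and values chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a copying young generation collection; not HotSpot code. */
public class ToyScavenge {
    /** Hypothetical object record: a liveness flag and the number of collections survived. */
    static final class Obj {
        final boolean live;
        int age;
        Obj(boolean live) { this.live = live; }
    }

    static final int TENURING_THRESHOLD = 2; // assumed value, for illustration only

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** Copies survivors out of Eden and From, empties both, then swaps the Survivor roles. */
    void minorCollect() {
        copyLive(eden);
        copyLive(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) {
                continue; // dead objects are never copied; their space is reclaimed in bulk
            }
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                old.add(o); // promote to the old generation
            } else {
                to.add(o);  // copy into the empty Survivor space
            }
        }
    }

    public static void main(String[] args) {
        ToyScavenge heap = new ToyScavenge();
        for (int i = 0; i < 10; i++) {
            heap.eden.add(new Obj(i % 5 == 0)); // 2 of the 10 objects are live
        }
        heap.minorCollect();
        System.out.println("after GC 1: from=" + heap.from.size() + " old=" + heap.old.size());
        heap.minorCollect();
        System.out.println("after GC 2: from=" + heap.from.size() + " old=" + heap.old.size());
    }
}
```

In this model the cost of a collection depends on the number of live objects copied, not on the amount of garbage, which is why copying suits a generation where most objects die young.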
Parallel Copying Collector (UseParNewGC)
The parallel copying collector enabled by -XX:+UseParNewGC represents an improved version of the original copying collector. This collector also employs Stop-the-World mode but significantly enhances young generation collection efficiency through multi-threaded parallelization of the copying process.
Conceptually, the collector divides the copying work among multiple GC threads. When collection begins, all application threads pause, and the collection threads work simultaneously to copy live objects from Eden and the From Survivor space into the To Survivor space. In theory, on a multi-CPU system, collection can run up to N times faster than single-threaded collection (where N is the number of available CPUs); in practice the gain is smaller, because part of the work is serial and the threads must synchronize.
A critical characteristic is that -XX:+UseParNewGC is activated automatically when the Concurrent Mark-Sweep collector (CMS) is enabled for old generation collection. CMS needs a young generation collector that is designed to cooperate with it, and the parallel copying collector fills that role. Code example:
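The factor-of-N figure is best read as an upper bound. A rough way to see why is Amdahl's law, which is not specific to garbage collection. The sketch below assumes a hypothetical parallelizable fraction p of the collection work; the class name and the sample fractions are illustrative and are not measurements of any collector.

```java
public class GcSpeedupEstimate {
    /**
     * Amdahl's law: estimated speedup with the given number of threads
     * when only parallelFraction of the work can run in parallel.
     */
    static double speedup(double parallelFraction, int threads) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / threads);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Assumed fractions, chosen only to show how quickly the ideal N-fold gain erodes
        for (double p : new double[] {1.0, 0.95, 0.80}) {
            System.out.printf("p=%.2f, threads=%d -> estimated speedup %.2fx%n",
                    p, cpus, speedup(p, cpus));
        }
    }
}
```

Only p = 1.0 yields the full N-fold speedup. The number of GC threads HotSpot uses can be set with -XX:ParallelGCThreads.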
# JVM startup parameters (shell command; these flags apply to JDK 8-era releases, CMS was removed in JDK 14)
java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmx2g MyApplication

// Listing the heap's memory pools from inside the application (Java)
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.List;

List<MemoryPoolMXBean> pools = ManagementFactory.getMemoryPoolMXBeans();
for (MemoryPoolMXBean pool : pools) {
    System.out.println("Pool name: " + pool.getName() + ", Type: " + pool.getType());
}
Parallel Scavenge Collector (UseParallelGC)
The parallel scavenge collector enabled by -XX:+UseParallelGC represents another parallel young generation collection algorithm, but with fundamentally different design objectives and technical implementations compared to the parallel copying collector.
This collector is tuned for large heaps (the figure usually quoted is over 10GB) on multi-CPU machines. Its primary design goal is maximum throughput, that is, the largest possible share of total run time spent in application code rather than in garbage collection. Pause time is a secondary goal: a target can be requested with -XX:MaxGCPauseMillis, but the collector does not guarantee it and may trade throughput to approach it.
An important feature is support for JVM's adaptive tuning strategy (Ergonomics). The JVM can automatically adjust heap space sizes, Survivor space ratios, and other parameters based on runtime data to achieve optimal performance. This adaptive capability proves particularly valuable in long-running server applications.
However, this collector is restricted in its old generation pairings: it works with the serial mark-sweep-compact collector or, when -XX:+UseParallelOldGC is enabled, with the parallel old collector, but it cannot be combined with the Concurrent Mark-Sweep collector (CMS). G1 is not an option either, because G1 manages the whole heap with its own young generation collection. Collector combinations therefore need to be chosen with the application's requirements in mind.
# Enabling the parallel scavenge collector (shell command); the adaptive size policy is
# on by default with this collector, so the second flag only makes the choice explicit
java -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -Xmx12g MyApplication

// Printing per-collector collection counts and accumulated times (Java)
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

List<GarbageCollectorMXBean> gcBeans = ManagementFactory.getGarbageCollectorMXBeans();
for (GarbageCollectorMXBean gcBean : gcBeans) {
    System.out.println("GC Name: " + gcBean.getName() +
            ", Collections: " + gcBean.getCollectionCount() +
            ", Time: " + gcBean.getCollectionTime() + "ms");
}
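The ergonomics settings discussed in this section can also be inspected at run time. The following is a minimal sketch that assumes a HotSpot-based JVM exposing com.sun.management.HotSpotDiagnosticMXBean; the class ErgonomicsFlags and its helper methods are hypothetical names. It also shows how the throughput goal is expressed: with -XX:GCTimeRatio=n, the collector aims to spend at most 1/(1+n) of total time in garbage collection.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class ErgonomicsFlags {
    /** Returns the current value of a HotSpot flag, or "n/a" if this JVM does not expose it. */
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? "n/a" : bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown flag, or the diagnostic bean is unavailable on this JVM
            return "n/a";
        }
    }

    /** Target fraction of total time spent in GC for a given GCTimeRatio value. */
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        String[] flags = {"UseParallelGC", "UseAdaptiveSizePolicy", "MaxGCPauseMillis", "GCTimeRatio"};
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
        System.out.printf("GCTimeRatio=99 targets %.1f%% of time in GC%n",
                100 * gcTimeFraction(99));
    }
}
```

Running this under different collector flags shows which settings the JVM actually selected, which is useful because ergonomics may choose defaults based on the machine.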
Core Differences Comparison
Through comparative analysis, we can summarize the key differences between the two algorithms:
- Design Objective Differences: UseParNewGC primarily optimizes copying collection efficiency in multi-CPU environments, while UseParallelGC is tuned for large memory heaps and prioritizes throughput, treating pause time as a secondary goal.
- Compatibility Differences: UseParNewGC can cooperate with the Concurrent Mark-Sweep collector (CMS), while UseParallelGC pairs only with the serial mark-sweep-compact collector or the parallel old collector (-XX:+UseParallelOldGC) and cannot be combined with CMS.
- Adaptive Capabilities: UseParallelGC supports JVM adaptive tuning strategies, automatically adjusting parameters based on runtime conditions; UseParNewGC lacks this capability.
- Applicable Scenarios: UseParNewGC suits scenarios requiring low latency and CMS compatibility; UseParallelGC suits server applications with large memory and high throughput requirements.
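To check which of these collectors a running JVM is actually using, the names reported by the GarbageCollectorMXBean instances can be mapped back to the collectors discussed here. The sketch below is an assumption-laden illustration: the class CollectorNames is hypothetical, and the bean names are those commonly reported by HotSpot builds from the JDK 8 era, which may differ on other JVMs or versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class CollectorNames {
    /** Maps a GC bean name (as commonly reported by HotSpot) to a short description. */
    static String describe(String beanName) {
        switch (beanName) {
            case "Copy":
                return "serial copying collector (young)";
            case "ParNew":
                return "parallel copying collector, -XX:+UseParNewGC (young)";
            case "PS Scavenge":
                return "parallel scavenge collector, -XX:+UseParallelGC (young)";
            case "MarkSweepCompact":
                return "serial mark-sweep-compact collector (old)";
            case "PS MarkSweep":
                return "old generation collector paired with parallel scavenge";
            case "ConcurrentMarkSweep":
                return "Concurrent Mark-Sweep collector (old)";
            default:
                return "other collector";
        }
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + describe(gc.getName()));
        }
    }
}
```

On a JVM whose default is G1 or another newer collector, every bean falls into the "other collector" branch, which is itself a quick signal that neither algorithm compared here is in use.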
Performance Tuning Recommendations
In practice, choosing between the two young generation collection algorithms requires weighing several factors:
- Hardware Configuration: On multi-CPU systems, both parallel algorithms provide better performance than single-threaded collectors. However, on single-CPU systems, parallelization may introduce additional overhead.
- Memory Scale: For large memory heaps exceeding 10GB, UseParallelGC's optimized algorithms may prove more effective.
- Latency Requirements: If applications are sensitive to pause times and require CMS compatibility, UseParNewGC represents a more suitable choice.
- Throughput Priority: If applications prioritize maximum throughput and can tolerate longer young generation collection pauses, UseParallelGC may be superior.
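When weighing throughput against latency, it helps to measure how much of the run time actually goes to garbage collection. A minimal sketch, assuming the standard java.lang.management beans are available: the class GcThroughput and its methods are hypothetical names, and getCollectionTime() reports only an approximate accumulated time, so the result is an estimate.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcThroughput {
    /** Fraction of elapsed time not spent in GC; returns 1.0 when no time has elapsed. */
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0;
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    /** Sums the approximate accumulated collection time over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime, throughput about %.2f%%%n",
                gcTime, uptime, 100 * throughput(gcTime, uptime));
    }
}
```

Calling this at the end of a representative workload, once per collector configuration, gives a throughput figure to set beside the observed pause times.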
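The following test harness does not compare collectors by itself; run it once under each collector combination and compare the reported times:
import java.util.ArrayList;
import java.util.List;

// Run once per collector configuration, for example:
//   java -XX:+UseParallelGC -Xmx4g GCPerformanceTest
public class GCPerformanceTest {
    private static final int OBJECT_COUNT = 1000000;
    private static final int LOOP_COUNT = 100;

    public static void main(String[] args) {
        // Each iteration retains about 1GB until the pool is cleared, so the heap must be larger than that
        List<Object> objectPool = new ArrayList<>();
        long totalNanos = 0;
        for (int i = 0; i < LOOP_COUNT; i++) {
            long startTime = System.nanoTime(); // monotonic clock, better suited to intervals
            // Allocate many 1KB objects; young generation collections run as Eden fills up
            for (int j = 0; j < OBJECT_COUNT; j++) {
                objectPool.add(new byte[1024]);
            }
            // Drop half of the references, simulating object death
            for (int j = 0; j < OBJECT_COUNT / 2; j++) {
                objectPool.set(j, null);
            }
            // Request a collection: System.gc() is only a hint and normally triggers a full GC,
            // not just a young generation collection
            System.gc();
            totalNanos += System.nanoTime() - startTime;
            // Release the remaining objects before the next iteration
            objectPool.clear();
        }
        System.out.println("Average time per iteration: " +
                (totalNanos / LOOP_COUNT / 1_000_000) + "ms");
    }
}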
Conclusion
UseParNewGC and UseParallelGC represent two different design philosophies for JVM young generation parallel collection. UseParNewGC, through tight integration with the CMS collector, provides an effective young generation collection solution for applications requiring low latency. UseParallelGC, through optimization for large memory heaps and adaptive tuning capabilities, offers powerful performance support for high-throughput applications.
In practice, developers should weigh the characteristics of these two algorithms against the specific hardware environment, memory scale, performance requirements, and choice of old generation collector. Proper collector selection combined with reasonable parameter tuning can significantly improve Java application performance and stability. JVM technology continues to evolve: newer collectors such as G1 and ZGC provide additional options, and CMS itself was removed in JDK 14. Even so, understanding the principles and differences of these traditional algorithms remains fundamental to effective performance tuning.