Java Parallel Stream
Parallel streams, introduced in Java 8, process collection data on multiple threads to speed up computation.
The following is a detailed guide to their core concepts, usage, and precautions:
1. Core concepts and principles
- Parallel processing mechanism: The data is split into multiple chunks, which the Fork/Join framework processes in parallel on multiple threads before merging the results.
- Default thread pool: Uses ForkJoinPool.commonPool(). Its default parallelism is the number of CPU cores minus one, because the thread that invokes the stream also does work (adjustable through a system property).
- Applicable scenarios: Large data sets and compute-intensive tasks (such as mathematical operations and batch conversion).
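To make the mechanism above more concrete, the following minimal sketch prints the common pool's parallelism and the names of the threads that actually handled a parallel range. The class name CommonPoolDemo and the helper workerThreadNames are made up for this illustration; the exact thread names depend on the JVM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CommonPoolDemo {
    // Hypothetical helper: records which threads processed the elements of a parallel range.
    static Set<String> workerThreadNames(int elements) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream.range(0, elements)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("CPU cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Common pool parallelism: " + ForkJoinPool.commonPool().getParallelism());
        // Usually shows the calling thread plus some ForkJoinPool.commonPool-worker threads.
        System.out.println("Threads used: " + workerThreadNames(100_000));
    }
}
```

On most machines the output lists the calling thread (for example main) alongside common pool workers, which is why the pool itself needs one thread fewer than the number of cores.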
2. How to create parallel streams
- Generate directly: Call parallelStream() on a collection.
- Convert a sequential stream: Call parallel() on an existing stream.
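import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

List<Integer> list = Arrays.asList(1, 2, 3, 4);
// Method 1: create a parallel stream directly from the collection
Stream<Integer> parallelStream1 = list.parallelStream();
// Method 2: convert a sequential stream to a parallel one
Stream<Integer> parallelStream2 = list.stream().parallel();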
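As a quick check of the two creation methods, the sketch below asks each stream whether it is parallel via isParallel(), and also shows that sequential() switches a stream back. The class and method names are chosen for this example only.

```java
import java.util.Arrays;
import java.util.List;

class CreateParallelDemo {
    // Returns the parallel flag for: stream(), parallelStream(),
    // stream().parallel(), and parallelStream().sequential().
    static boolean[] parallelFlags(List<Integer> list) {
        return new boolean[] {
            list.stream().isParallel(),
            list.parallelStream().isParallel(),
            list.stream().parallel().isParallel(),
            list.parallelStream().sequential().isParallel()
        };
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4);
        System.out.println(Arrays.toString(parallelFlags(list))); // [false, true, true, false]
    }
}
```

Note that parallel() and sequential() apply to the whole pipeline: the last call before the terminal operation decides how the pipeline runs.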
3. Applicable scenarios and performance optimization
Recommended scenarios:
- Large data volumes: Such as filtering and mapping millions of elements.
- Complex computation: Such as matrix operations and image processing.
- Stateless operations: Such as map, filter, and reduce (which do not depend on processing order or external variables).
Performance traps:
- Small data sets: The overhead of parallelization (thread scheduling, data splitting) may cancel out the benefit.
- Cheap operations: For trivial work such as simple addition and subtraction, the parallel version may be slower.
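One way to act on the trade-off above is to go parallel only when the input is large enough. The sketch below assumes a cut-off of 100,000 elements; that number is a placeholder picked for illustration, not a recommendation, and the right value has to be measured for each workload.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ThresholdDemo {
    // Hypothetical cut-off; tune it by measuring your own workload.
    static final int PARALLEL_THRESHOLD = 100_000;

    // Stateless pipeline: each element is squared independently, then summed.
    static long sumOfSquares(List<Integer> data) {
        Stream<Integer> stream = data.size() >= PARALLEL_THRESHOLD
                ? data.parallelStream()  // large input: splitting cost is likely worth it
                : data.stream();         // small input: stay sequential
        return stream.mapToLong(n -> (long) n * n).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(Arrays.asList(1, 2, 3))); // 14
    }
}
```

Because the pipeline is stateless, the result is the same whichever branch is taken; only the running time differs.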
4. Precautions and best practices
Avoid shared mutable state
Modifying shared variables inside a parallel operation causes thread-safety problems. Use stateless operations instead, or add synchronization.
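import java.util.Arrays;
import java.util.List;

// Wrong: thread-unsafe accumulation
List<Integer> nums = Arrays.asList(1, 2, 3);
int[] sum = {0};
nums.parallelStream().forEach(n -> sum[0] += n); // unsynchronized update, the result may be wrong
// Correct: use a reduction
int safeSum = nums.parallelStream().reduce(0, Integer::sum);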
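Besides reduce, two other common ways to stay thread-safe are letting a collector build the result and, when a side effect really is needed, using a concurrent accumulator. The sketch below shows both; the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

class SafeAccumulationDemo {
    // Let the collector merge per-thread partial results instead of
    // calling add() on a shared ArrayList from forEach.
    static List<Integer> evenSquares(List<Integer> nums) {
        return nums.parallelStream()
                   .filter(n -> n % 2 == 0)
                   .map(n -> n * n)
                   .collect(Collectors.toList());
    }

    // If a shared accumulator is unavoidable, use a thread-safe one.
    static long sumWithAdder(List<Integer> nums) {
        LongAdder adder = new LongAdder();
        nums.parallelStream().forEach(n -> adder.add(n));
        return adder.sum();
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3, 4);
        System.out.println(evenSquares(nums));  // [4, 16]
        System.out.println(sumWithAdder(nums)); // 10
    }
}
```

The collector version is usually preferable, since it has no side effects at all and keeps the encounter order of the source list.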
Use stateful operations with caution
Stateful operations such as sorted() and distinct() may take longer in a parallel stream, because the partial results of every thread have to be merged.
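// Parallel sorting (may be slower than the sequential stream)
// nums is the list from the previous example; Stream.toList() requires Java 16+
List<Integer> sortedList = nums.parallelStream().sorted().toList();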
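The sketch below shows that stateful operations still give correct results in parallel, and one possible way to reduce their cost: when encounter order does not matter, unordered() drops the ordering constraint, which the Stream Javadoc notes can make a parallel distinct() considerably cheaper. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StatefulOpsDemo {
    // Correct in parallel, but distinct() and sorted() must see and merge all elements.
    static List<Integer> sortedDistinct(List<Integer> data) {
        return data.parallelStream()
                   .distinct()
                   .sorted()
                   .collect(Collectors.toList());
    }

    // When order is irrelevant, say so explicitly before distinct().
    static Set<Integer> distinctUnordered(List<Integer> data) {
        return data.parallelStream()
                   .unordered()
                   .distinct()
                   .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, 1, 2, 3, 1);
        System.out.println(sortedDistinct(data)); // [1, 2, 3]
        System.out.println(distinctUnordered(data));
    }
}
```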
Data source splittability
- Efficient structures: ArrayList, arrays (they support fast random access and are easy to split).
- Inefficient structures: LinkedList, TreeSet (splitting is costly).
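The difference can be observed directly through each collection's Spliterator, which is what the stream framework uses to divide the work. The following sketch splits each source once and reports the size of the two parts; SplitDemo and splitOnce are names made up for the example, and the exact LinkedList batch size is a JDK implementation detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;

class SplitDemo {
    // Splits the source once; returns {size of the split-off part, size of the rest}.
    static long[] splitOnce(List<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        Spliterator<Integer> prefix = rest.trySplit(); // may be null if the source cannot split
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) data.add(i);

        // ArrayList can compute the midpoint by index, so it splits into even halves.
        System.out.println("ArrayList:  " + Arrays.toString(splitOnce(data)));
        // LinkedList has to walk its nodes to split off a batch, so the parts are uneven.
        System.out.println("LinkedList: " + Arrays.toString(splitOnce(new LinkedList<>(data))));
    }
}
```

Even halves produce a balanced task tree, whereas uneven, copy-based splits leave some threads with much more work than others.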
Order-sensitive operations
Use forEachOrdered to guarantee that elements are processed in encounter order, at the cost of some performance.
// Output in encounter order (slower than the unordered forEach)
// nums is the list from the earlier example
nums.parallelStream().forEachOrdered(System.out::println);
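To see the difference between the two terminal operations, the sketch below records the order in which elements arrive. With forEachOrdered the recorded order matches the source; with forEach it is whatever the threads happened to produce. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class OrderDemo {
    // forEachOrdered: elements are handed over one at a time in encounter order.
    static List<Integer> collectOrdered(List<Integer> data) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        data.parallelStream().forEachOrdered(out::add);
        return out;
    }

    // forEach: arrival order is not defined in a parallel stream.
    static List<Integer> collectUnordered(List<Integer> data) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        data.parallelStream().forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 20; i++) data.add(i);
        System.out.println("forEachOrdered: " + collectOrdered(data));
        System.out.println("forEach:        " + collectUnordered(data));
    }
}
```

The synchronized list is used here only to make the recording itself safe; in real code a collector is the better way to build a result list.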
Configure thread pool
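Default parallelism of the common pool: Runtime.getRuntime().availableProcessors() - 1 (the thread that invokes the stream also does work)
Change the global parallelism:
# JVM startup parameter
-Djava.util.concurrent.ForkJoinPool.common.parallelism=8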
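The startup parameter affects every parallel stream in the JVM. A widely used workaround for limiting a single pipeline is to start it from inside a dedicated ForkJoinPool, as sketched below. Be aware that this relies on implementation behavior of the JDK rather than on anything the Stream API documents, so treat it as an assumption to verify on your Java version. CustomPoolDemo and sumInPool are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class CustomPoolDemo {
    // Runs a parallel sum of 1..upTo inside a pool with the given parallelism.
    static long sumInPool(int parallelism, long upTo) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Long> task = () -> LongStream.rangeClosed(1, upTo).parallel().sum();
            // The pipeline's subtasks are forked from a worker of this pool,
            // so in practice they run here instead of in the common pool.
            return pool.submit(task).join();
        } finally {
            pool.shutdown(); // custom pools are not cleaned up automatically
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInPool(4, 1_000)); // 500500
    }
}
```

This keeps a long-running pipeline from occupying the common pool that other parallel streams in the same application share.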
5. Performance comparison example
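// Sequential stream vs parallel stream (processing 10 million elements)
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

List<Long> numbers = LongStream.rangeClosed(1, 10_000_000)
        .boxed().collect(Collectors.toList());
// Time the sequential stream
long start = System.currentTimeMillis();
long seqSum = numbers.stream().mapToLong(n -> n * 2).sum();
System.out.println("Sequential stream took: " + (System.currentTimeMillis() - start) + "ms");
// Time the parallel stream
start = System.currentTimeMillis();
long parSum = numbers.parallelStream().mapToLong(n -> n * 2).sum();
System.out.println("Parallel stream took: " + (System.currentTimeMillis() - start) + "ms");
Typical results (8-core CPU):
Sequential stream took: 120ms
Parallel stream took: 35ms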
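A single timed run like the one above is sensitive to JIT warm-up and to whichever pipeline happens to run first. As a rough improvement, the sketch below runs each task a few times untimed and then reports the best of several timed runs. It is still only an approximation (a benchmark harness such as JMH is more reliable), and the helper names and run counts are arbitrary choices for the example.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

class TimingSketch {
    static long seqSum(long n) {
        return LongStream.rangeClosed(1, n).map(x -> x * 2).sum();
    }

    static long parSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().map(x -> x * 2).sum();
    }

    // Runs the task `warmups` times untimed, then returns the best of `runs` timed runs in ms.
    static long bestMillis(LongSupplier task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.getAsLong();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            task.getAsLong();
            best = Math.min(best, System.nanoTime() - t0);
        }
        return best / 1_000_000;
    }

    public static void main(String[] args) {
        long n = 10_000_000L;
        System.out.println("Sequential: " + bestMillis(() -> seqSum(n), 5, 5) + "ms");
        System.out.println("Parallel:   " + bestMillis(() -> parSum(n), 5, 5) + "ms");
    }
}
```

This version also works on a primitive LongStream, so the measurement is not dominated by boxing and unboxing Long values.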
Summary
Advantages: Simplifies multi-threaded programming and speeds up the processing of large data sets.
Limitations: Not suitable for small data sets, order-sensitive tasks, or cheap computations.
Best Practices:
- Reserve parallel streams for large-scale data processing.
- Avoid operating on shared variables.
- Measure to verify that the performance actually improves.
- Use forEach instead of forEachOrdered unless order must be guaranteed.
Used sensibly, parallel streams can significantly improve program performance without adding complex code, but the pros and cons must be weighed for each scenario.