How to use GroupBy in Java Stream
1. Preface
When processing collection data, we often need to group the data according to a specific condition. For example, in a student list, students may need to be classified by class, gender, or other attributes.
The Java Stream API provides powerful features for this, and Collectors.groupingBy() is one of the most commonly used tools.
2. Basic concepts
What is GroupBy?
GroupBy is a data processing operation that divides the elements of a data set into groups according to a specified condition.
Elements in each group share a common attribute or satisfy a specific condition. This is very useful in data analysis, statistics and report generation.
GroupBy in Stream API
Java 8 introduced the Stream API, which provides an efficient and concise way to process collection data.
The groupingBy collector (Collectors.groupingBy()) is part of the Stream API and lets developers group data and perform further operations on each group.
3. Basic usage
3.1 Grouping basis
When using groupingBy, you first need to decide which condition to group by.
This is usually a function that extracts a key value from each element (such as the value of a certain attribute) and divides the elements into different groups based on this key value.
Example: Grouped by class
Suppose we have a list of students:
// Requires java.util.Arrays and java.util.List
List<Student> students = Arrays.asList(
    new Student("Alice", 20, "Class A"),
    new Student("Bob", 21, "Class B"),
    new Student("Charlie", 20, "Class A"),
    new Student("David", 22, "Class C")
);
We want to group these students by class. Each student's className attribute will be used as the basis for grouping.
3.2 Using groupingBy()
In the Stream API, grouping is done with the Collectors.groupingBy() method.
This method takes a classifier function, which extracts the grouping key from each element.
Sample code:
Map<String, List<Student>> groupedStudents = students.stream()
    .collect(Collectors.groupingBy(student -> student.getClassName()));
Explanation:
- students.stream(): Converts the student list to a Stream.
- .collect(Collectors.groupingBy(...)): Groups the elements with Collectors.groupingBy(). The lambda expression inside the parentheses extracts className from each Student object as the grouping key.
- Return value: A Map<String, List<Student>>, where the key is the class name (such as "Class A" or "Class B") and the value is the list of students belonging to that class.
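The steps above can be put together into a small runnable program. This is a minimal sketch: the Student class below is hypothetical (the article does not show its definition), and the wrapper class name GroupByClassDemo is only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByClassDemo {
    // Hypothetical Student class; the article does not show its definition.
    static class Student {
        private final String name;
        private final int age;
        private final String className;

        Student(String name, int age, String className) {
            this.name = name;
            this.age = age;
            this.className = className;
        }

        String getName() { return name; }
        int getAge() { return age; }
        String getClassName() { return className; }
    }

    // Groups students by class name using the one-argument groupingBy overload.
    static Map<String, List<Student>> groupByClass(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(Student::getClassName));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Alice", 20, "Class A"),
                new Student("Bob", 21, "Class B"),
                new Student("Charlie", 20, "Class A"),
                new Student("David", 22, "Class C"));

        Map<String, List<Student>> grouped = groupByClass(students);

        // Print each class followed by the names of its students.
        grouped.forEach((className, members) -> System.out.println(
                className + " -> " + members.stream()
                        .map(Student::getName)
                        .collect(Collectors.toList())));
    }
}
```

The order in which the classes are printed is not guaranteed, because the one-argument groupingBy() makes no promise about the type of Map it returns.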
3.3 Operation after grouping
Once the data is grouped, various operations can be performed on each group, such as counting the number of elements in the group, calculating the average value, etc.
This is usually done with the other methods of Collectors, passed to groupingBy() as a downstream collector.
Example: Counting students by class
Map<String, Long> classCount = students.stream()
    .collect(Collectors.groupingBy(
        Student::getClassName,
        Collectors.counting()
    ));
Explanation:
- Student::getClassName: A method reference used as the grouping key extraction function.
- Collectors.counting(): Counts the number of elements within each group.
Result:
A Map<String, Long>, where the key is the class name and the value is the number of students in that class. For example:
{ "Class A": 2, "Class B": 1, "Class C": 1 }
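To reproduce this result with predictable key order, one option is the three-argument overload of groupingBy(), which accepts a map factory. The sketch below is not from the original example: it assumes a hypothetical, trimmed-down Student class and uses TreeMap::new so that the keys come out sorted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CountByClassDemo {
    // Hypothetical Student class reduced to the fields this example needs.
    static class Student {
        private final String name;
        private final String className;

        Student(String name, String className) {
            this.name = name;
            this.className = className;
        }

        String getName() { return name; }
        String getClassName() { return className; }
    }

    // Counts students per class; TreeMap::new keeps the keys in sorted order.
    static Map<String, Long> countByClass(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        Student::getClassName,
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Alice", "Class A"),
                new Student("Bob", "Class B"),
                new Student("Charlie", "Class A"),
                new Student("David", "Class C"));

        // Prints {Class A=2, Class B=1, Class C=1}
        System.out.println(countByClass(students));
    }
}
```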
4. Advanced usage
4.1 Custom grouping logic
In some cases, more complex grouping conditions may be required.
For example, in addition to grouping by class, students can also be grouped according to age range.
Example: Grouped by age interval
Suppose we want to group students by age group (such as “Under 20,” “20-22,” “Over 22”).
Map<String, List<Student>> ageGroupedStudents = students.stream()
    .collect(Collectors.groupingBy(student -> {
        if (student.getAge() < 20) {
            return "Under 20";
        } else if (student.getAge() <= 22) {
            return "20-22";
        } else {
            return "Over 22";
        }
    }));
Explanation:
- Lambda expression: Defines custom grouping logic that returns a different range label according to the student's age.
- Result: A Map<String, List<Student>>, where the key is the age range and the value is the list of students in that range.
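When the grouping logic grows beyond a line or two, it can be easier to test if it lives in a named method instead of an inline lambda. The following is one possible sketch, assuming a hypothetical Student class with name and age; the helper ageRange() and the use of Collectors.mapping() to keep only the names are illustrative choices, not part of the original example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AgeRangeDemo {
    // Hypothetical Student class reduced to the fields this example needs.
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Maps an age to the same range labels used in the article.
    static String ageRange(int age) {
        if (age < 20) {
            return "Under 20";
        } else if (age <= 22) {
            return "20-22";
        }
        return "Over 22";
    }

    // Groups student names by age range; mapping() converts each Student to its name.
    static Map<String, List<String>> namesByAgeRange(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        student -> ageRange(student.getAge()),
                        Collectors.mapping(Student::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Eve", 19),
                new Student("Alice", 20),
                new Student("David", 22),
                new Student("Frank", 23));

        namesByAgeRange(students).forEach(
                (range, names) -> System.out.println(range + " -> " + names));
    }
}
```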
4.2 Multi-level grouping
Sometimes it is necessary to group according to multiple conditions.
For example, first group by class, then group by gender within each class. This can be done by nesting Collectors.groupingBy() calls.
Example: Grouped by class and gender
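// getGender() is assumed here; the Student objects shown earlier do not include a gender
Map<String, Map<String, List<Student>>> groupedByClassAndGender = students.stream()
    .collect(Collectors.groupingBy(
        Student::getClassName,
        Collectors.groupingBy(student -> student.getGender())
    ));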
Explanation:
- Outer groupingBy: Groups by class.
- Inner groupingBy: Within each class, groups by gender.
Result structure:
{ "Class A": { "Male": [...], "Female": [...] }, "Class B": { "Male": [...], ... }, ... }
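A runnable sketch of the nested structure follows. It assumes a hypothetical Student class with a gender field stored as a String and a getGender() accessor, neither of which appears in the article's earlier Student examples.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MultiLevelGroupingDemo {
    // Hypothetical Student class; the gender field is an assumption for this example.
    static class Student {
        private final String name;
        private final String className;
        private final String gender;

        Student(String name, String className, String gender) {
            this.name = name;
            this.className = className;
            this.gender = gender;
        }

        String getName() { return name; }
        String getClassName() { return className; }
        String getGender() { return gender; }
    }

    // Outer groupingBy keys by class; the inner groupingBy is its downstream collector.
    static Map<String, Map<String, List<Student>>> groupByClassAndGender(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        Student::getClassName,
                        Collectors.groupingBy(Student::getGender)));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Alice", "Class A", "Female"),
                new Student("Bob", "Class B", "Male"),
                new Student("Charlie", "Class A", "Male"),
                new Student("Diana", "Class A", "Female"));

        // Walk both levels of the nested map.
        groupByClassAndGender(students).forEach((className, byGender) ->
                byGender.forEach((gender, members) -> System.out.println(
                        className + " / " + gender + " -> " + members.size())));
    }
}
```

A class only gets an inner key for a gender that actually occurs in it, so code that reads the result should not assume every inner map contains both keys.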
4.3 Statistics and aggregation operations
In addition to grouping, data within each group can also be counted and aggregated. For example, calculate the average age for each class.
Example: Calculate the average age by class
Map<String, Double> averageAgeByClass = students.stream()
    .collect(Collectors.groupingBy(
        Student::getClassName,
        Collectors.averagingInt(Student::getAge)
    ));
Explanation:
- Collectors.averagingInt(): Calculates the average of an integer attribute within each group.
- Result: A Map<String, Double>, where the key is the class name and the value is the average age of students in that class.
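The sketch below runs the average calculation end to end and, as an optional extension not covered in the article, also shows Collectors.summarizingInt(), which collects count, minimum, maximum, and average in a single pass. The Student class here is hypothetical and the sample ages are made up for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AverageAgeDemo {
    // Hypothetical Student class reduced to the fields this example needs.
    static class Student {
        private final String className;
        private final int age;

        Student(String className, int age) {
            this.className = className;
            this.age = age;
        }

        String getClassName() { return className; }
        int getAge() { return age; }
    }

    // Average age per class.
    static Map<String, Double> averageAgeByClass(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        Student::getClassName,
                        Collectors.averagingInt(Student::getAge)));
    }

    // Count, min, max, and average per class in one pass.
    static Map<String, IntSummaryStatistics> ageStatsByClass(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        Student::getClassName,
                        Collectors.summarizingInt(Student::getAge)));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Class A", 20),
                new Student("Class A", 22),
                new Student("Class B", 21));

        System.out.println(averageAgeByClass(students).get("Class A")); // 21.0
        IntSummaryStatistics stats = ageStatsByClass(students).get("Class A");
        System.out.println(stats.getMin() + " to " + stats.getMax()); // 20 to 22
    }
}
```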
5. Common application scenarios
5.1 Counting orders by region
Suppose an e-commerce platform needs to count the number of orders in each region.
List<Order> orders = ...; // Order list
Map<String, Long> orderCountByRegion = orders.stream()
    .collect(Collectors.groupingBy(
        Order::getRegion,
        Collectors.counting()
    ));
5.2 Calculating sales by product category
Suppose we need the total sales for each product category.
List<ProductSale> sales = ...; // Sales record list
Map<String, Double> totalSalesByCategory = sales.stream()
    .collect(Collectors.groupingBy(
        ProductSale::getCategory,
        Collectors.summingDouble(ProductSale::getAmount)
    ));
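For a version that can be run directly, here is a minimal sketch. The ProductSale class is hypothetical: the article only implies getCategory() and getAmount(), and the double amount type is assumed from the Map<String, Double> result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SalesByCategoryDemo {
    // Hypothetical ProductSale class; only the two accessors used above are modeled.
    static class ProductSale {
        private final String category;
        private final double amount;

        ProductSale(String category, double amount) {
            this.category = category;
            this.amount = amount;
        }

        String getCategory() { return category; }
        double getAmount() { return amount; }
    }

    // Sums the sale amounts within each category.
    static Map<String, Double> totalByCategory(List<ProductSale> sales) {
        return sales.stream()
                .collect(Collectors.groupingBy(
                        ProductSale::getCategory,
                        Collectors.summingDouble(ProductSale::getAmount)));
    }

    public static void main(String[] args) {
        List<ProductSale> sales = Arrays.asList(
                new ProductSale("Electronics", 100.5),
                new ProductSale("Books", 20.0),
                new ProductSale("Electronics", 200.25));

        totalByCategory(sales).forEach(
                (category, total) -> System.out.println(category + " -> " + total));
    }
}
```

Because double arithmetic can introduce rounding error, real monetary totals are often kept in BigDecimal instead; that variant is outside the scope of this sketch.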
5.3 Analyzing user behavior by time period
Suppose we need to analyze how website visits are distributed over the day.
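List<UserVisit> visits = ...; // User access record list
Map<String, List<UserVisit>> visitsByTimeSlot = visits.stream()
    .collect(Collectors.groupingBy(visit -> {
        // The accessor name getVisitTime() is assumed; it was lost from the source text
        LocalTime time = visit.getVisitTime();
        if (time.isBefore(LocalTime.of(12, 0))) {
            return "Morning";
        } else if (time.isBefore(LocalTime.of(18, 0))) {
            return "Afternoon";
        } else {
            return "Evening";
        }
    }));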
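The boundaries of the time slots are easy to get wrong, so it can help to pull the slot logic into its own method and check the edge cases. The sketch below assumes a hypothetical UserVisit class holding a LocalTime; the accessor name getVisitTime() and the helper timeSlot() are illustrative.

```java
import java.time.LocalTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class VisitTimeSlotDemo {
    // Hypothetical UserVisit class; the article does not show its definition.
    static class UserVisit {
        private final String userId;
        private final LocalTime visitTime;

        UserVisit(String userId, LocalTime visitTime) {
            this.userId = userId;
            this.visitTime = visitTime;
        }

        String getUserId() { return userId; }
        LocalTime getVisitTime() { return visitTime; }
    }

    // Before 12:00 is "Morning", 12:00 up to 17:59 is "Afternoon", 18:00 onward is "Evening".
    static String timeSlot(LocalTime time) {
        if (time.isBefore(LocalTime.of(12, 0))) {
            return "Morning";
        } else if (time.isBefore(LocalTime.of(18, 0))) {
            return "Afternoon";
        }
        return "Evening";
    }

    // Groups visits by the slot their time falls into.
    static Map<String, List<UserVisit>> groupByTimeSlot(List<UserVisit> visits) {
        return visits.stream()
                .collect(Collectors.groupingBy(visit -> timeSlot(visit.getVisitTime())));
    }

    public static void main(String[] args) {
        List<UserVisit> visits = Arrays.asList(
                new UserVisit("u1", LocalTime.of(9, 30)),
                new UserVisit("u2", LocalTime.of(12, 0)),
                new UserVisit("u3", LocalTime.of(20, 15)));

        groupByTimeSlot(visits).forEach(
                (slot, members) -> System.out.println(slot + " -> " + members.size()));
    }
}
```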
6. Things to note
6.1 Null value processing
If the grouping key of an element is null, Collectors.groupingBy() does not place it in a special "null" group. Instead, it throws a NullPointerException.
To avoid this, provide custom null handling in the classifier function when grouping.
Example: Handling null grouping keys
Map<String, List<Student>> groupedStudents = students.stream()
    .collect(Collectors.groupingBy(student -> {
        String className = student.getClassName();
        return className != null ? className : "Unknown Class";
    }));
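To see the difference in practice, the following sketch compares a classifier that may return null with the null-safe version. In the JDK implementation, groupingBy() rejects a null key with a NullPointerException. The Student class and the method names groupUnsafe() and groupSafe() are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NullKeyDemo {
    // Hypothetical Student class whose className may be null.
    static class Student {
        private final String name;
        private final String className;

        Student(String name, String className) {
            this.name = name;
            this.className = className;
        }

        String getName() { return name; }
        String getClassName() { return className; }
    }

    // Unsafe: throws NullPointerException if any student has a null className.
    static Map<String, List<Student>> groupUnsafe(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(Student::getClassName));
    }

    // Safe: null class names are mapped to a placeholder key.
    static Map<String, List<Student>> groupSafe(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(student ->
                        student.getClassName() != null ? student.getClassName() : "Unknown Class"));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Alice", "Class A"),
                new Student("Eve", null));

        try {
            groupUnsafe(students);
        } catch (NullPointerException e) {
            System.out.println("Grouping failed on a null key");
        }

        System.out.println(groupSafe(students).keySet());
    }
}
```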
6.2 Performance considerations
For large data sets, grouping operations may consume more memory and computing resources. Therefore, when processing large-scale data, you need to pay attention to performance optimization.
- Avoid complex grouping logic: Try to use simple and efficient grouping key extraction functions.
- Parallel streams: On multi-core hardware, consider using a parallel stream to improve processing speed. Whether this helps depends on the data size and the cost of the grouping logic, so measure before relying on it. For example:
Map<String, List<Student>> groupedStudents = students.parallelStream()
    .collect(Collectors.groupingBy(student -> student.getClassName()));
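With parallel streams, a related option that the article does not mention is Collectors.groupingByConcurrent(), which accumulates into a single ConcurrentMap instead of merging per-thread maps. The sketch below shows both variants side by side on a hypothetical Student class; it makes no claim about which one is faster for a given workload.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelGroupingDemo {
    // Hypothetical Student class reduced to the field this example needs.
    static class Student {
        private final String className;

        Student(String className) {
            this.className = className;
        }

        String getClassName() { return className; }
    }

    // Parallel stream with the regular groupingBy collector.
    static Map<String, Long> countParallel(List<Student> students) {
        return students.parallelStream()
                .collect(Collectors.groupingBy(
                        Student::getClassName,
                        Collectors.counting()));
    }

    // Parallel stream with groupingByConcurrent, which fills one shared ConcurrentMap.
    static ConcurrentMap<String, Long> countConcurrent(List<Student> students) {
        return students.parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        Student::getClassName,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Generate sample students spread across Class A, Class B, and Class C.
        List<Student> students = IntStream.range(0, 100_000)
                .mapToObj(i -> new Student("Class " + (char) ('A' + i % 3)))
                .collect(Collectors.toList());

        System.out.println(countParallel(students).equals(countConcurrent(students))); // true
    }
}
```

Note that groupingByConcurrent() does not preserve the encounter order of elements within a group, which matters when the downstream collector builds a list.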
7. Summary
Through this tutorial, you should now know how to use Collectors.groupingBy() with the Java Stream API to group and aggregate data. Whether for simple classification or complex multi-level grouping, the Stream API provides efficient and concise solutions.
Hopefully this knowledge can help you better handle data grouping requirements in actual development!
Want to go deeper?
If you want to further improve your Java skills, consider learning the following:
- New Java 8+ features: Master lambda expressions, functional interfaces, and so on.
- Advanced stream techniques: Learn the various uses of Collectors and how to optimize performance.
- Data processing frameworks: Such as Apache Flink and Spark, which are used to process larger data sets.