Four uses of Stream Collectors.groupingBy: grouped statistics (count, sum, average, etc.), range statistics, group merging, and custom mapping of group results

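Preface
Recently, for business reasons, I needed to compute some simple page metrics. Writing a separate SQL query for every metric felt tedious, so I chose to use Stream's grouping support instead. For simple metrics like these, grouping with Stream is more flexible: you fetch the data to be analyzed once and can then process it however you like, without writing a different SQL query for each metric.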

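This article summarizes the Collectors.groupingBy methods and techniques I have used in my work. Based on how I use them day to day, I divide the features of Collectors.groupingBy roughly into four kinds. The boundaries are fuzzy rather than absolute, and the features can be freely combined; the split is only meant to make the usage rules of the various Collectors.groupingBy methods easier to understand.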

The four grouping features are:

Basic grouping
Grouped statistics
Group merging
Custom mapping of group results

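For other uses of Stream, see this article:

A detailed guide to Java 8 Stream: filtering, sorting, maximum, minimum, counting, summing and averaging, grouping, merging, mapping, deduplication, and more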

Syntax
Basic syntax
Collector<T, ?, Map<K, List<T>>> groupingBy(Function<? super T, ? extends K> classifier)

Collector<T, ?, Map<K, D>> groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream)

Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream)

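classifier: the key mapping. Its return value becomes the key of each map entry.
mapFactory: a no-argument supplier for the result type. It creates the new, empty Map container that holds the key-value pairs.
downstream: the value mapping. It aggregates the elements that share a key into the specified type; what it returns becomes the value of each map entry.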
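
The roles of the three parameters can be seen in a small sketch. The Student class, the English course names, and the class name GroupingByParametersDemo are assumptions made for this illustration; TreeMap is chosen only to make the effect of mapFactory visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupingByParametersDemo {
    // Hypothetical stand-in for the article's Student, reduced to the one field used here.
    public static final class Student {
        private final String course;
        public Student(String course) { this.course = course; }
        public String getCourse() { return course; }
    }

    // Course column of the sample data, with English course names (an assumption for readability).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history"), new Student("math"), new Student("geography"),
            new Student("physics"), new Student("math"), new Student("english"),
            new Student("physics"));
    }

    public static TreeMap<String, Long> countByCourse() {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse,      // classifier: produces the key
            TreeMap::new,            // mapFactory: creates the result map (sorted by key here)
            Collectors.counting())); // downstream: reduces each group to the map value
    }

    public static void main(String[] args) {
        // prints {english=1, geography=1, history=1, math=2, physics=2}
        System.out.println(countByCourse());
    }
}
```

Because the map factory is TreeMap::new, the keys come back sorted; with the one- or two-argument overloads the map type is chosen by the library.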

Sample data
List<Student> students = Stream.of(
Student.builder().name("小張").age(16).clazz("高一1班").course("歷史").score(88).build(),
Student.builder().name("小李").age(16).clazz("高一3班").course("數學").score(12).build(),
Student.builder().name("小王").age(17).clazz("高二1班").course("地理").score(44).build(),
Student.builder().name("小紅").age(18).clazz("高二1班").course("物理").score(67).build(),
Student.builder().name("李華").age(15).clazz("高二2班").course("數學").score(99).build(),
Student.builder().name("小潘").age(19).clazz("高三4班").course("英語").score(100).build(),
Student.builder().name("小聶").age(20).clazz("高三4班").course("物理").score(32).build()
).collect(Collectors.toList());

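The four ways to use grouping

Basic grouping
Description: the basic feature groups the elements and returns a Map. The value produced by a user-supplied function becomes the key, and the elements that share a key are stored in a List as the value.

Collectors.groupingBy: basic grouping
The following forms are equivalent:

// Group students by course
Map<String, List<Student>> groupByCourse = students.stream().collect(Collectors.groupingBy(Student::getCourse));
Map<String, List<Student>> groupByCourse1 = students.stream().collect(Collectors.groupingBy(Student::getCourse, Collectors.toList()));
// In the two forms above, the container type and the value type are chosen by the library.
// Current JDK implementations return a HashMap of ArrayList values, although the Javadoc makes no guarantee about either type.
// The form below lets you choose the map type and the value collector yourself
Map<String, List<Student>> groupByCourse2 = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, HashMap::new, Collectors.toList()));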
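The third form lets you customize the key type, the container type, and the value type.
If the groups need to keep the order in which their keys first appear in students, set the container type to LinkedHashMap.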

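The other three features described below all rely on the downstream parameter, Collector<? super T, A, D> downstream. In other words, each of them supplies a Collector implementation that carries out the downstream operation.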

Grouped statistics
Description: after grouping, run a calculation over the elements of each group: count, average, sum, maximum and minimum, or statistics over a range.

Collectors.counting: count
Counting syntax:
Collector<T, ?, Long> counting()

// Count
Map<String, Long> groupCount = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.counting()));

Collectors.summingInt: sum
Summation syntax:
Collector<T, ?, Integer> summingInt(ToIntFunction<? super T> mapper)
Collector<T, ?, Long> summingLong(ToLongFunction<? super T> mapper)
Collector<T, ?, Double> summingDouble(ToDoubleFunction<? super T> mapper)

Summation comes in three variants for different element types: Int, Long, and Double. The variant, and therefore the result type, must match the type of the values being summed.

// Sum
Map<String, Integer> groupSum = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.summingInt(Student::getScore)));

Collectors.averagingInt: average
Average syntax:
Collector<T, ?, Double> averagingInt(ToIntFunction<? super T> mapper)
Collector<T, ?, Double> averagingLong(ToLongFunction<? super T> mapper)
Collector<T, ?, Double> averagingDouble(ToDoubleFunction<? super T> mapper)

Points to note about averaging:

Averaging has three variants: Int, Long, and Double.
The variant only affects the precision of the result.
The result is always a Double.
// Average
Map<String, Double> groupAverage = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.averagingInt(Student::getScore)));
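
The difference in result types between summing and averaging is easy to check with a small sketch. The Student class, the English course names, and the class name SumAndAverageDemo are assumptions made for this illustration; the scores are the ones from the sample data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SumAndAverageDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String course;
        private final int score;
        public Student(String course, int score) { this.course = course; this.score = score; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Courses and scores of the sample data, with English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history", 88), new Student("math", 12), new Student("geography", 44),
            new Student("physics", 67), new Student("math", 99), new Student("english", 100),
            new Student("physics", 32));
    }

    // summingInt keeps the int domain: the map value is an Integer.
    public static Map<String, Integer> totalByCourse() {
        return students().stream()
            .collect(Collectors.groupingBy(Student::getCourse, Collectors.summingInt(Student::getScore)));
    }

    // averagingInt reads ints but always produces a Double.
    public static Map<String, Double> averageByCourse() {
        return students().stream()
            .collect(Collectors.groupingBy(Student::getCourse, Collectors.averagingInt(Student::getScore)));
    }

    public static void main(String[] args) {
        System.out.println(totalByCourse().get("math"));   // prints 111
        System.out.println(averageByCourse().get("math")); // prints 55.5
    }
}
```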

Collectors.minBy / Collectors.maxBy: minimum and maximum
Minimum and maximum syntax:
Collector<T, ?, Optional<T>> minBy(Comparator<? super T> comparator)
Collector<T, ?, Optional<T>> maxBy(Comparator<? super T> comparator)

Collectors.collectingAndThen syntax:
Collector<T,A,RR> collectingAndThen(Collector<T,A,R> downstream, Function<R,RR> finisher)

Function<R,RR>: a function that takes a parameter of type R and returns a result of type RR.
Collectors.minBy returns an Optional<T>, so you have to check whether the Optional is empty when reading the value.

That check can instead be folded into the collector with Collectors.collectingAndThen, which then returns the unwrapped result. The job of Collectors.collectingAndThen is to post-process the result of an aggregation after the aggregation has run.

// Minimum within each group: the student with the lowest score.
// The comparator must use the score; every student in a group shares the same course, so comparing by course cannot tell them apart.
Map<String, Optional<Student>> groupMin = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.minBy(Comparator.comparing(Student::getScore))));
// Use Collectors.collectingAndThen to unwrap the Optional
Map<String, Student> groupMin2 = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse,
        Collectors.collectingAndThen(Collectors.minBy(Comparator.comparing(Student::getScore)), op -> op.orElse(null))));
// Maximum within each group: the student with the highest score
Map<String, Optional<Student>> groupMax = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.maxBy(Comparator.comparing(Student::getScore))));
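
A variation worth knowing: groupingBy only creates a group when at least one element maps to it, so the Optional produced by minBy inside a group always holds a value, and Optional::get can serve as the finisher. The sketch below assumes a hypothetical Student class, romanized names, and English course names in place of the sample data.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class LowestScoreDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String name;
        private final String course;
        private final int score;
        public Student(String name, String course, int score) {
            this.name = name; this.course = course; this.score = score;
        }
        public String getName() { return name; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // The sample data with romanized names and English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("Zhang", "history", 88), new Student("Li", "math", 12),
            new Student("Wang", "geography", 44), new Student("Hong", "physics", 67),
            new Student("Hua", "math", 99), new Student("Pan", "english", 100),
            new Student("Nie", "physics", 32));
    }

    // minBy compares by score; collectingAndThen unwraps the Optional of each (non-empty) group.
    public static Map<String, Student> lowestByCourse() {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse,
            Collectors.collectingAndThen(
                Collectors.minBy(Comparator.comparingInt(Student::getScore)),
                Optional::get)));
    }

    public static void main(String[] args) {
        System.out.println(lowestByCourse().get("physics").getName()); // prints Nie
        System.out.println(lowestByCourse().get("math").getName());    // prints Li
    }
}
```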

Collectors.summarizingInt: full statistics (all of the results above at once)
Full statistics syntax:
Collector<T, ?, IntSummaryStatistics> summarizingInt(ToIntFunction<? super T> mapper)
Collector<T, ?, LongSummaryStatistics> summarizingLong(ToLongFunction<? super T> mapper)
Collector<T, ?, DoubleSummaryStatistics> summarizingDouble(ToDoubleFunction<? super T> mapper)

The statistics collectors come in three variants: Int, Long, and Double. Each converts the input elements to its primitive type and then performs the calculation. The Collectors.summarizingXXX methods compute every result that ordinary statistics need.

Narrowing conversions are not possible; for example, a Long cannot be converted to an Int.

The result types depend on which variant is used.

// Collect the maximum, minimum, count, sum, and average of each group in one pass
HashMap<String, IntSummaryStatistics> groupStat = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, HashMap::new, Collectors.summarizingInt(Student::getScore)));
groupStat.forEach((k, v) -> {
    // The result types depend on which variant is used
    v.getAverage();
    v.getCount();
    v.getMax();
    v.getMin();
    v.getSum();
});
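
To make the remark about result types concrete, the sketch below reads each getter of IntSummaryStatistics into a variable of its declared type. The Student class, the English course names, and the class name SummaryStatisticsDemo are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SummaryStatisticsDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String course;
        private final int score;
        public Student(String course, int score) { this.course = course; this.score = score; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Courses and scores of the sample data, with English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history", 88), new Student("math", 12), new Student("geography", 44),
            new Student("physics", 67), new Student("math", 99), new Student("english", 100),
            new Student("physics", 32));
    }

    public static Map<String, IntSummaryStatistics> statsByCourse() {
        return students().stream()
            .collect(Collectors.groupingBy(Student::getCourse, Collectors.summarizingInt(Student::getScore)));
    }

    public static void main(String[] args) {
        IntSummaryStatistics physics = statsByCourse().get("physics");
        long count = physics.getCount();       // long in every variant
        long sum = physics.getSum();           // long, even though the inputs are int
        int min = physics.getMin();            // int for the Int variant
        int max = physics.getMax();            // int for the Int variant
        double average = physics.getAverage(); // double in every variant
        System.out.println(count + " " + sum + " " + min + " " + max + " " + average); // prints 2 99 32 67 49.5
    }
}
```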

Collectors.partitioningBy: range statistics
Collectors.partitioningBy syntax:
Collector<T, ?, Map<Boolean, List<T>>> partitioningBy(Predicate<? super T> predicate)
Collector<T, ?, Map<Boolean, D>> partitioningBy(Predicate<? super T> predicate, Collector<? super T, A, D> downstream)

predicate: the condition that splits each group's result into two ranges.
All of the statistics above are based on a single metric. If we need statistics over ranges, for example the students who scored above 60 and those who did not, we can use Collectors.partitioningBy to split the mapped result further.

// Split each group into students who scored above 60 (true) and those who scored 60 or below (false)
Map<String, Map<Boolean, List<Student>>> groupPartition = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.partitioningBy(s -> s.getScore() > 60)));
// Likewise, we can count the students in each of the two partitions
Map<String, Map<Boolean, Long>> groupPartitionCount = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.partitioningBy(s -> s.getScore() > 60, Collectors.counting())));
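
One property of partitioningBy that matters when reading the result: the inner map always contains both the true and the false key, even when one side is empty. A minimal sketch, assuming a hypothetical Student class and English course names in place of the sample data:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionKeysDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String course;
        private final int score;
        public Student(String course, int score) { this.course = course; this.score = score; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Courses and scores of the sample data, with English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history", 88), new Student("math", 12), new Student("geography", 44),
            new Student("physics", 67), new Student("math", 99), new Student("english", 100),
            new Student("physics", 32));
    }

    // true = scored above 60, false = scored 60 or below (a score of exactly 60 lands in false).
    public static Map<String, Map<Boolean, Long>> passCounts() {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse,
            Collectors.partitioningBy(s -> s.getScore() > 60, Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<Boolean, Long> history = passCounts().get("history");
        // The only history score is 88, yet the false key is still present with a count of 0.
        System.out.println(history.get(false) + " " + history.get(true)); // prints 0 1
    }
}
```

This means callers can read both keys without a null check, which is not true of a nested groupingBy.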

Collectors.partitioningBy can only split data into two ranges. If you need more, you can nest Collectors.partitioningBy calls, although you then have to discard the unneeded data yourself afterwards. Alternatively, take the result of the first Collectors.partitioningBy and run a separate range statistic on each part.

Map<String, Map<Boolean, Map<Boolean, List<Student>>>> groupAndPartition = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.partitioningBy(s -> s.getScore() > 60,
        Collectors.partitioningBy(s -> s.getScore() > 90))));
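
When more than two ranges are needed, a different technique from nested partitioning is often simpler: a nested groupingBy whose classifier returns a range label. This is not the approach shown above, only an alternative sketch; the band names, the band method, the Student class, and the English course names are all assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ScoreBandsDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String course;
        private final int score;
        public Student(String course, int score) { this.course = course; this.score = score; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Courses and scores of the sample data, with English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history", 88), new Student("math", 12), new Student("geography", 44),
            new Student("physics", 67), new Student("math", 99), new Student("english", 100),
            new Student("physics", 32));
    }

    // Hypothetical band labels using the same thresholds (60 and 90) as the nested partitioning.
    public static String band(Student s) {
        if (s.getScore() > 90) return "excellent";
        if (s.getScore() > 60) return "pass";
        return "fail";
    }

    // Outer grouping by course, inner grouping by band, counting the students in each band.
    public static Map<String, Map<String, Long>> bandCountsByCourse() {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse,
            Collectors.groupingBy(ScoreBandsDemo::band, Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<String, Long> math = bandCountsByCourse().get("math");
        System.out.println(math.get("fail") + " " + math.get("excellent")); // prints 1 1
        System.out.println(math.containsKey("pass"));                       // prints false
    }
}
```

Unlike partitioningBy, a band with no students simply has no key, so there is no leftover data to discard, but lookups for a missing band return null.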

Group merging
Description: merge the values under the same key into a single result, using one of several methods.

Collectors.reducing: merge a group's elements
Collectors.reducing syntax:
Collector<T, ?, Optional<T>> reducing(BinaryOperator<T> op)
Collector<T, ?, T> reducing(T identity, BinaryOperator<T> op)
Collector<T, ?, U> reducing(U identity, Function<? super T, ? extends U> mapper, BinaryOperator<U> op)

identity: the identity value for the merge. It takes part in both the accumulation and the combining step: it is the value returned when there are no elements, and otherwise it is the starting value that the elements are merged into.
mapper: maps each stream element to the value that will be merged.
op: the merge function. It merges the mapped values pairwise; the first value is merged with the identity.
// Merge: total score per course
Map<String, Integer> groupCalcSum = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.reducing(0, Student::getScore, Integer::sum)));
// Merge: the student with the highest score in each course
Map<String, Optional<Student>> groupCourseMax = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.reducing(BinaryOperator.maxBy(Comparator.comparing(Student::getScore)))));
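
The role of identity as the starting value of every group can be observed by deliberately passing a value that is not neutral for the operation. The sketch below assumes a hypothetical Student class and English course names; the seed parameter and the method names exist only for this illustration, and the stream is sequential.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ReducingIdentityDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String course;
        private final int score;
        public Student(String course, int score) { this.course = course; this.score = score; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Courses and scores of the sample data, with English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history", 88), new Student("math", 12), new Student("geography", 44),
            new Student("physics", 67), new Student("math", 99), new Student("english", 100),
            new Student("physics", 32));
    }

    // Each group's total starts from seed, then every mapped score is merged in with Integer::sum.
    public static Map<String, Integer> totalWithSeed(int seed) {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse, Collectors.reducing(seed, Student::getScore, Integer::sum)));
    }

    // Reference result computed with the dedicated summing collector.
    public static Map<String, Integer> totalBySumming() {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse, Collectors.summingInt(Student::getScore)));
    }

    public static void main(String[] args) {
        System.out.println(totalWithSeed(0).equals(totalBySumming())); // prints true
        System.out.println(totalWithSeed(0).get("math"));              // prints 111
        System.out.println(totalWithSeed(10).get("math"));             // prints 121: the seed was added once
    }
}
```

With 0, the true identity for addition, reducing gives the same totals as summingInt. A non-neutral seed shifts every group, which is why the identity should be a value that leaves op unchanged.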

Collectors.joining: concatenate strings
Collectors.joining syntax:
Collector<CharSequence, ?, String> joining()
Collector<CharSequence, ?, String> joining(CharSequence delimiter)
Collector<CharSequence, ?, String> joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix)

delimiter: the separator placed between elements
prefix: a string placed once at the start of the joined result
suffix: a string placed once at the end of the joined result
Collectors.joining only accepts CharSequence elements, so it is usually combined with another downstream method such as Collectors.mapping.

// List the names of the students in each course
Map<String, String> groupCourseSelectSimpleStudent = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.mapping(Student::getName, Collectors.joining(","))));
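
The three-argument form is worth a quick sketch, because prefix and suffix wrap the whole joined string rather than each element. The Student class, the romanized names, and the English course names are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class JoiningAffixDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String name;
        private final String course;
        public Student(String name, String course) { this.name = name; this.course = course; }
        public String getName() { return name; }
        public String getCourse() { return course; }
    }

    // Names and courses of the sample data, romanized and in English (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("Zhang", "history"), new Student("Li", "math"),
            new Student("Wang", "geography"), new Student("Hong", "physics"),
            new Student("Hua", "math"), new Student("Pan", "english"),
            new Student("Nie", "physics"));
    }

    // mapping turns each Student into a String first, because joining only accepts CharSequence.
    public static Map<String, String> namesByCourse() {
        return students().stream().collect(Collectors.groupingBy(
            Student::getCourse,
            Collectors.mapping(Student::getName, Collectors.joining(", ", "[", "]"))));
    }

    public static void main(String[] args) {
        System.out.println(namesByCourse().get("physics")); // prints [Hong, Nie]
        System.out.println(namesByCourse().get("history")); // prints [Zhang]
    }
}
```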

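Custom mapping of group results
Description: the downstream parameter of Collectors.groupingBy is really just a way of mapping a group's elements to a different value, and every feature above is built on it. This section introduces some methods for producing custom values.

Collectors.toXXX: map the result to a collection object
Map the result to a List (an ArrayList in current JDK implementations):
Collector<T, ?, List<T>> toList()

Map the result to a Set (a HashSet in current JDK implementations):
Collector<T, ?, Set<T>> toSet()

Map the result to a HashMap or another Map class:
Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper)
Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction)
Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction, Supplier<M> mapSupplier)

keyMapper: maps an element to the key
valueMapper: maps an element to the value
mergeFunction: how to merge the values when the stream contains duplicate keys. Without one, a duplicate key throws IllegalStateException.
mapSupplier: a no-argument supplier for the Map container, which lets you choose the Map type that is returned.
Collectors.toConcurrentMap has the same syntax as Collectors.toMap, but there are some differences:

The former returns a ConcurrentHashMap by default, the latter a HashMap.
They differ on parallel streams: toMap calls mapSupplier several times, producing several map containers that are finally merged together with Map.merge(), whereas toConcurrentMap calls it only once, and that single container accepts key-value pairs from all of the worker threads. On a parallel stream, merging the toMap containers is therefore generally more expensive than sharing one toConcurrentMap container.
Map<String, Map<String, Integer>> courseWithStudentScore = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.toMap(Student::getName, Student::getScore)));
// (v1, v2) -> v2 keeps the later value when two students in a course share a name
Map<String, LinkedHashMap<String, Integer>> courseWithStudentScore2 = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.toMap(Student::getName, Student::getScore, (v1, v2) -> v2, LinkedHashMap::new)));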
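
The duplicate-key behaviour of toMap can be shown with a small sketch. The sample data has no duplicate names within a course, so the extra "retake" record below is invented for this illustration, as are the Student class, the romanized names, and the choice of Integer::max as the merge function.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapDuplicateKeyDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String name;
        private final String course;
        private final int score;
        public Student(String name, String course, int score) {
            this.name = name; this.course = course; this.score = score;
        }
        public String getName() { return name; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Two math students from the sample data plus an invented retake by Li (same name, same course).
    public static List<Student> studentsWithRetake() {
        return Arrays.asList(
            new Student("Li", "math", 12), new Student("Hua", "math", 99),
            new Student("Li", "math", 75));
    }

    // Without a merge function, the second "Li" key in the math group is rejected.
    public static boolean plainToMapThrows() {
        try {
            Map<String, Map<String, Integer>> ignored = studentsWithRetake().stream()
                .collect(Collectors.groupingBy(Student::getCourse,
                    Collectors.toMap(Student::getName, Student::getScore)));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    // With a merge function, duplicate keys are resolved; here the higher score wins.
    public static Map<String, Map<String, Integer>> bestScores() {
        return studentsWithRetake().stream()
            .collect(Collectors.groupingBy(Student::getCourse,
                Collectors.toMap(Student::getName, Student::getScore, Integer::max)));
    }

    public static void main(String[] args) {
        System.out.println(plainToMapThrows());                 // prints true
        System.out.println(bestScores().get("math").get("Li")); // prints 75
    }
}
```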

Collectors.mapping: custom mapping result
Collectors.mapping syntax:
Collector<T, ?, R> mapping(Function<? super T, ? extends U> mapper, Collector<? super U, A, R> downstream)

Collectors.mapping is quite versatile: besides mapping the grouped elements to whatever value you want, it can be combined with all of the downstream methods mentioned above.

Map the result to a specific field:

Map<String, List<String>> groupMapping = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.mapping(Student::getName, Collectors.toList())));
Convert to another bean type:

Map<String, List<OutstandingStudent>> groupMapping2 = students.stream()
    .filter(s -> s.getScore() > 60)
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.mapping(s -> BeanUtil.copyProperties(s, OutstandingStudent.class), Collectors.toList())));
Combine with joining:

// Combine with joining
Map<String, String> groupMapperThenJoin = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse, Collectors.mapping(Student::getName, Collectors.joining(","))));
// Use collectingAndThen to post-process the joined result; the label "學生名單：" means "Student list: "
Map<String, String> groupMapperThenLink = students.stream()
    .collect(Collectors.groupingBy(Student::getCourse,
        Collectors.collectingAndThen(Collectors.mapping(Student::getName, Collectors.joining("，")), s -> "學生名單：" + s)));

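Collector: custom downstream
For reference, see: [Java8 Stream]: Exploring the core of the Stream implementation: Collector, and simulating the implementation of Stream

The meaning of the type parameters of Collector<T, A, R>:

<T>: the type of the input elements of the reduction operation
<A>: the mutable accumulation type of the reduction operation, that is, the intermediate container. It can be any kind of collection, or a custom object with an accumulating operation such as add.
<R>: the final result type, returned after the accumulated result has been transformed
The methods defined in Collector are listed below; the return value of each can be treated as a function:

Supplier<A> supplier(): creates and returns a new container object.
BiConsumer<A, T> accumulator(): folds an element into the container object. It mutates the container and returns nothing.
BinaryOperator<A> combiner(): merges two containers, each holding a partial result of processing the stream. It may return one of the two containers or a new one.
Function<A, R> finisher(): performs the final transformation, turning the fully combined result A into R.
Set<Characteristics> characteristics(): returns the set of characteristics of this Collector. These characteristics affect how the functions above are applied.
The signatures of the functional interfaces above:

Supplier<T>#T get(): calls a no-argument method and returns a result. Usually a constructor reference.
BiConsumer<T, U>#void accept(T t, U u): performs an operation on the two given arguments.
BinaryOperator<T> extends BiFunction<T,T,T>#T apply(T t, T u): merges t and u, returning one of them or creating and returning a new object.
Function<T, R>#R apply(T t): processes the given argument and returns a new value.
public interface Collector<T, A, R> {

    Supplier<A> supplier();

    BiConsumer<A, T> accumulator();

    BinaryOperator<A> combiner();

    Function<A, R> finisher();

    Set<Characteristics> characteristics();

}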
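
As an illustration of how these four functions fit together, here is a minimal sketch of a custom downstream built with the Collector.of factory. The collector itself (scoreSpread, the gap between the highest and lowest score in a group) is invented for this example, as are the Student class and the English course names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class CustomCollectorDemo {
    // Hypothetical stand-in for the article's Student, reduced to the fields used here.
    public static final class Student {
        private final String course;
        private final int score;
        public Student(String course, int score) { this.course = course; this.score = score; }
        public String getCourse() { return course; }
        public int getScore() { return score; }
    }

    // Courses and scores of the sample data, with English course names (an assumption).
    public static List<Student> students() {
        return Arrays.asList(
            new Student("history", 88), new Student("math", 12), new Student("geography", 44),
            new Student("physics", 67), new Student("math", 99), new Student("english", 100),
            new Student("physics", 32));
    }

    // T = Student, A = int[]{min, max}, R = Integer (max - min).
    public static Collector<Student, int[], Integer> scoreSpread() {
        return Collector.of(
            () -> new int[] {Integer.MAX_VALUE, Integer.MIN_VALUE}, // supplier: fresh container
            (acc, s) -> {                                           // accumulator: fold one element in
                acc[0] = Math.min(acc[0], s.getScore());
                acc[1] = Math.max(acc[1], s.getScore());
            },
            (a, b) -> {                                             // combiner: merge two partial results
                a[0] = Math.min(a[0], b[0]);
                a[1] = Math.max(a[1], b[1]);
                return a;
            },
            acc -> acc[1] - acc[0]);                                // finisher: A -> R
    }

    public static Map<String, Integer> spreadByCourse() {
        return students().stream()
            .collect(Collectors.groupingBy(Student::getCourse, scoreSpread()));
    }

    public static void main(String[] args) {
        System.out.println(spreadByCourse().get("physics")); // prints 35
        System.out.println(spreadByCourse().get("history")); // prints 0
    }
}
```

The finisher assumes a non-empty group, which holds when the collector is used as a groupingBy downstream; applied directly to an empty stream, the sentinel values would overflow.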
