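Mastering Java 8 Stream (Part 3: Collectors)

As mentioned in earlier articles, the core of Stream is Collectors, that is, collecting the data once it has been processed. Collectors provides a large number of powerful APIs that can gather the final data into a List, a Set, a Map, or even more complex structures (nested combinations of those three).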

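Collectors offers a lot of APIs, many of which are just overloads of the same function. I personally group them into three broad categories:

Data collection: set, map, list

Aggregation and reduction: counting, summing, min/max, averaging, string joining, reduction

Pre/post processing: partitioning, grouping, custom operations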

API usage

This section covers some commonly used APIs rather than all of them: there really are too many, and combining them can get frighteningly complicated.

Data collection

Collectors.toCollection() collects the data into a Collection. Any Collection implementation will do, for example ArrayList or HashSet. The method takes a supplier of a Collection implementation, in other words a Collection factory.
Example:
```
                 //List
        Stream.of(1,2,3,4,5,6,8,9,0)
                .collect(Collectors.toCollection(ArrayList::new));
        
        //Set
        Stream.of(1,2,3,4,5,6,8,9,0)
                .collect(Collectors.toCollection(HashSet::new));
```
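
Because the argument is just a factory, the choice of collection decides the behaviour of the result. A minimal sketch, assuming nothing beyond the JDK (the class name and sample data are made up for illustration):

```java
import java.util.LinkedList;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToCollectionSketch {

    // TreeSet keeps elements sorted and drops duplicates.
    static TreeSet<Integer> sortedDistinct() {
        return Stream.of(5, 3, 9, 3, 1)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // LinkedList keeps encounter order and duplicates.
    static LinkedList<Integer> linked() {
        return Stream.of(5, 3, 9, 3, 1)
                .collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        System.out.println(sortedDistinct()); // [1, 3, 5, 9]
        System.out.println(linked());         // [5, 3, 9, 3, 1]
    }
}
```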

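Collectors.toList() and Collectors.toSet() are much the same as Collectors.toCollection(), except that the container type is fixed: in the JDK 8 implementation they use ArrayList and HashSet. I had assumed these two methods would call Collectors.toCollection() internally, but they don't; each one creates a new CollectorImpl instead.

Expected: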

     public static <T>
    Collector<T, ?, List<T>> toList() {
        return toCollection(ArrayList::new);
    }

   
    public static <T>
    Collector<T, ?, Set<T>> toSet() {
        return toCollection(HashSet::new);
    }

Actual:

     public static <T>
    Collector<T, ?, List<T>> toList() {
        return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                                   (left, right) -> { left.addAll(right); return left; },
                                   CH_ID);
    }
    
    public static <T>
    Collector<T, ?, Set<T>> toSet() {
        return new CollectorImpl<>((Supplier<Set<T>>) HashSet::new, Set::add,
                                   (left, right) -> { left.addAll(right); return left; },
                                   CH_UNORDERED_ID);
    }

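At first I really couldn't tell what the authors had in mind. Then I noticed that CollectorImpl needs a Set<Collector.Characteristics> (a set of characteristics). Since a Set is unordered, the toSet() implementation passes CH_UNORDERED_ID, whereas toCollection() always uses CH_ID. Does that mean passing a Set type to toCollection() is discouraged? If anyone knows, please tell me.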
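
One likely answer, offered as my own reading rather than anything stated by the JDK authors: UNORDERED is only an optimization hint. toSet() knows its result never preserves encounter order, so it can declare that; toCollection() cannot know what kind of collection the factory produces, so it stays conservative. Passing a Set factory to toCollection() is still perfectly valid. A small sketch for inspecting the two (the class name is hypothetical):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsSketch {

    // Characteristics reported by Collectors.toSet().
    static Set<Collector.Characteristics> ofToSet() {
        return Collectors.<Integer>toSet().characteristics();
    }

    // Characteristics reported by Collectors.toCollection(HashSet::new).
    static Set<Collector.Characteristics> ofToCollection() {
        return Collectors.<Integer, HashSet<Integer>>toCollection(HashSet::new)
                .characteristics();
    }

    public static void main(String[] args) {
        // Prints whether each collector declares itself UNORDERED.
        System.out.println(ofToSet().contains(Collector.Characteristics.UNORDERED));
        System.out.println(ofToCollection().contains(Collector.Characteristics.UNORDERED));
    }
}
```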

Example:

             //List
        Stream.of(1,2,3,4,5,6,8,9,0)
                .collect(Collectors.toList());

        //Set
        Stream.of(1,2,3,4,5,6,8,9,0)
                .collect(Collectors.toSet());

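Collectors.toMap() and Collectors.toConcurrentMap() do what their names say: they collect into a Map and a ConcurrentMap, backed by HashMap and ConcurrentHashMap by default. toConcurrentMap() supports concurrent collection on parallel streams. Each of the two has three overloads. Whether the target is a Map or a ConcurrentMap, the difference from a Collection is that a Map holds key-value pairs, so you have to say what the key is when collecting. At minimum, toMap() and toConcurrentMap() take two arguments: a function that extracts the key and a function that produces the value to store.

Example: here I use a Student structure with an id and a name.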

public class Student{

        // unique
        private String id;

        private String name;

        public Student() {
        }

        public Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        public String getId() {
            return id;
        }

        public void setId(String id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

Note: here the key is the id, and the value can be either the object itself or one of its fields. As you can see, collecting into a map is highly customizable.

                 
        Student studentA = new Student("20190001","小明");
        Student studentB = new Student("20190002","小紅");
        Student studentC = new Student("20190003","小丁");


        //Function.identity() returns the object itself, so the result is Map<String,Student>, i.e. id->student
        //sequential collection
        Stream.of(studentA,studentB,studentC)
                .collect(Collectors.toMap(Student::getId,Function.identity()));

        //concurrent collection
        Stream.of(studentA,studentB,studentC)
                .parallel()
                .collect(Collectors.toConcurrentMap(Student::getId,Function.identity()));

        //================================================================================

        //Map<String,String>, i.e. id->name
        //sequential collection
        Stream.of(studentA,studentB,studentC)
                .collect(Collectors.toMap(Student::getId,Student::getName));

        //concurrent collection
        Stream.of(studentA,studentB,studentC)
                .parallel()
                .collect(Collectors.toConcurrentMap(Student::getId,Student::getName));

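So what happens when keys are duplicated? With the two-argument overloads a duplicate key throws an IllegalStateException, so you pass a merge function as the third argument. Suppose two Students share the same id: when converting to a Map, we keep the one with the greater name and discard the other.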

                 //Map<String,Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors
                        .toMap(Student::getId,
                                Function.identity(),
                                BinaryOperator
                                        .maxBy(Comparator.comparing(Student::getName))));

        
        //The version above may look dense, so here is an imperative equivalent
        //Map<String,Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors
                        .toMap(Student::getId,
                                Function.identity(),
                                (s1, s2) -> {

                                    //compareTo returns a negative number if s1 < s2, 0 if they are equal, and a positive number otherwise
                                    if (s1.getName().compareTo(s2.getName()) < 0) {
                                        return s2;
                                    } else {
                                        return s1;
                                    }
                                }));
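
The three sample students above all have different ids, so the merge function never actually runs. To see it in action, here is a minimal sketch with made-up data in which two students share an id; the nested Student class is a cut-down stand-in for the one in this article, and the class and method names are hypothetical:

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DuplicateKeySketch {

    // Minimal stand-in for the article's Student (id and name only).
    static class Student {
        private final String id;
        private final String name;

        Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String getId() { return id; }
        String getName() { return name; }
    }

    // Hypothetical data: the first two students share an id.
    static Stream<Student> students() {
        return Stream.of(
                new Student("20190001", "Amy"),
                new Student("20190001", "Bob"),
                new Student("20190002", "Cid"));
    }

    // Two-argument toMap: a duplicate key makes collect() throw.
    static boolean twoArgToMapThrows() {
        try {
            students().collect(Collectors.toMap(Student::getId, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Three-argument toMap: on a key clash, keep the student with the greater name.
    static Map<String, Student> keepGreaterName() {
        return students().collect(Collectors.toMap(
                Student::getId,
                Function.identity(),
                (s1, s2) -> s1.getName().compareTo(s2.getName()) >= 0 ? s1 : s2));
    }

    public static void main(String[] args) {
        System.out.println(twoArgToMapThrows());                          // true
        System.out.println(keepGreaterName().get("20190001").getName()); // Bob
    }
}
```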

If you don't want the default HashMap or ConcurrentHashMap, the third overload also accepts a custom Map supplier (a Map factory).

        //custom LinkedHashMap
        //Map<String,Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors
                        .toMap(Student::getId,
                                Function.identity(),
                                BinaryOperator
                                        .maxBy(Comparator.comparing(Student::getName)),
                                LinkedHashMap::new));
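
The map factory mostly matters for iteration order. A small sketch comparing two factories; the helper method, class name and sample words are assumptions of mine, not part of the original example:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapFactorySketch {

    // Collects word -> length into whatever map the factory creates.
    static Map<String, Integer> collectInto(Supplier<Map<String, Integer>> factory) {
        return Stream.of("pear", "fig", "apple")
                .collect(Collectors.toMap(
                        Function.identity(),
                        String::length,
                        (a, b) -> a,      // keep the first value on a key clash
                        factory));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion (encounter) order.
        System.out.println(collectInto(LinkedHashMap::new)); // {pear=4, fig=3, apple=5}
        // TreeMap sorts by key.
        System.out.println(collectInto(TreeMap::new));       // {apple=5, fig=3, pear=4}
    }
}
```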

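Aggregation and reduction

Collectors.joining() concatenates strings and has three overloads. Under the hood (in JDK 8) the no-argument version appends to a StringBuilder, while the versions that take a delimiter use a StringJoiner. You can supply a delimiter (I find this really useful: turning a list into a single String comes up all the time, and specifying a delimiter is all it takes), a prefix and a suffix.

Example: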

     
                 Student studentA = new Student("20190001", "小明");
        Student studentB = new Student("20190002", "小紅");
        Student studentC = new Student("20190003", "小丁");

        //no delimiter: 201900012019000220190003
        Stream.of(studentA, studentB, studentC)
                .map(Student::getId)
                .collect(Collectors.joining());

        //with ^_^ as the delimiter
        //20190001^_^20190002^_^20190003
        Stream.of(studentA, studentB, studentC)
                .map(Student::getId)
                .collect(Collectors.joining("^_^"));

        //with ^_^ as the delimiter
        //and [] as the prefix and suffix
        //[20190001^_^20190002^_^20190003]
        Stream.of(studentA, studentB, studentC)
                .map(Student::getId)
                .collect(Collectors.joining("^_^", "[", "]"));

Collectors.counting() counts the elements. It does the same job as Stream.count(); one returns a boxed Long and the other a primitive long. Their use cases do differ, though, which I'll come back to later.

Example:

             // Long 8
        Stream.of(1,0,-10,9,8,100,200,-80)
                .collect(Collectors.counting());
        
        //if all you need is a count, there's no need for Collectors; that only costs more
        // long 8
        Stream.of(1,0,-10,9,8,100,200,-80)
                .count();
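
Where counting() earns its keep is as a downstream collector, for example counting per group with groupingBy (covered further down). A hedged sketch with hypothetical class and method names:

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CountingSketch {

    // Counts how many numbers fall into each group.
    static Map<String, Long> countBySign() {
        return Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.groupingBy(
                        n -> n < 0 ? "negative" : "non-negative",
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countBySign();
        System.out.println(counts.get("negative"));     // 2
        System.out.println(counts.get("non-negative")); // 6
    }
}
```

Stream.count() cannot do this, because it only ever sees the whole stream.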

Collectors.minBy() and Collectors.maxBy() likewise do the same job as Stream.min() and Stream.max(); the Collectors versions are meant for more advanced scenarios, typically as a downstream collector.

Example:

             // maxBy 200
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.maxBy(Integer::compareTo)).ifPresent(System.out::println);

        // max 200
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .max(Integer::compareTo).ifPresent(System.out::println);

        // minBy -80
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.minBy(Integer::compareTo)).ifPresent(System.out::println);

        // min -80
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .min(Integer::compareTo).ifPresent(System.out::println);
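
As one example of such an advanced scenario, maxBy() can pick the largest element of each group. This is a sketch under my own assumptions (grouping by parity; the names are made up):

```java
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MaxByDownstreamSketch {

    // Largest even number and largest odd number in one pass.
    static Map<String, Optional<Integer>> maxByParity() {
        return Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.groupingBy(
                        n -> n % 2 == 0 ? "even" : "odd",
                        Collectors.maxBy(Comparator.<Integer>naturalOrder())));
    }

    public static void main(String[] args) {
        Map<String, Optional<Integer>> max = maxByParity();
        System.out.println(max.get("even").get()); // 200
        System.out.println(max.get("odd").get());  // 9
    }
}
```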

Collectors.summarizingInt(), Collectors.summarizingLong() and Collectors.summarizingDouble() summarize int, long and double data respectively. Each returns a SummaryStatistics object that holds the count, sum, minimum, average and maximum. IntStream, DoubleStream and LongStream can all compute a sum, but sum() gives you only the sum, which is far less than the summarizing result. When you want the count, the average and the rest in a single pass, summarizing is very handy.

Example:

             //IntSummaryStatistics{count=10, sum=55, min=1, average=5.500000, max=10}
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.summarizingInt(Integer::valueOf));

        //DoubleSummaryStatistics{count=10, sum=55.000000, min=1.000000, average=5.500000, max=10.000000}
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.summarizingDouble(Double::valueOf));

        //LongSummaryStatistics{count=10, sum=55, min=1, average=5.500000, max=10}
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.summarizingLong(Long::valueOf));


        // 55
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).mapToInt(Integer::valueOf)
                .sum();

        // 55.0
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).mapToDouble(Double::valueOf)
                .sum();

        // 55
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).mapToLong(Long::valueOf)
                .sum();
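
The returned statistics object exposes each figure through a getter. A short sketch using word lengths as hypothetical data:

```java
import java.util.IntSummaryStatistics;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SummarizingSketch {

    // Summarizes the lengths of three sample words.
    static IntSummaryStatistics wordLengths() {
        return Stream.of("fig", "pear", "apple")
                .collect(Collectors.summarizingInt(String::length));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = wordLengths();
        System.out.println(stats.getCount());   // 3
        System.out.println(stats.getSum());     // 12
        System.out.println(stats.getMin());     // 3
        System.out.println(stats.getMax());     // 5
        System.out.println(stats.getAverage()); // 4.0
    }
}
```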

Collectors.averagingInt(), Collectors.averagingDouble() and Collectors.averagingLong() compute the average. They are meant for advanced scenarios, which I'll come back to later.

Example:

             Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.averagingInt(Integer::valueOf));

        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.averagingDouble(Double::valueOf));

        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.averagingLong(Long::valueOf));
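
One such scenario, sketched under my own assumptions (sample words and names are invented), is computing an average per group:

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class AveragingSketch {

    // Average word length, grouped by the word's first letter.
    static Map<String, Double> averageLengthByInitial() {
        return Stream.of("apple", "avocado", "banana", "blueberry")
                .collect(Collectors.groupingBy(
                        w -> w.substring(0, 1),
                        Collectors.averagingInt(String::length)));
    }

    public static void main(String[] args) {
        Map<String, Double> avg = averageLengthByInitial();
        System.out.println(avg.get("a")); // 6.0
        System.out.println(avg.get("b")); // 7.5
    }
}
```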

Collectors.reducing() is much like Stream.reduce(): both perform a reduction. In fact, in JDK 8 Collectors.counting() is itself implemented with reducing(), as this code shows:

public static <T> Collector<T, ?, Long> counting() {
        return reducing(0L, e -> 1L, Long::sum);
    }

In that case, let's write a reduction that sums the lengths of all the students' names.

Example:

        //Optional[6]
        Stream.of(studentA, studentB, studentC)
                .map(student -> student.getName().length())
                .collect(Collectors.reducing(Integer::sum));

        //6
        //or supply an identity value, so that an empty stream still yields a result (0) rather than an empty Optional
        Stream.of(studentA, studentB, studentC)
                .map(student -> student.getName().length())
                .collect(Collectors.reducing(0, (i1, i2) -> i1 + i2));


        //6
        //or skip the map step and do the conversion during the reduction
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors.reducing(0, s -> s.getName().length(), Integer::sum));
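
For a plain stream, Stream.reduce() does the same thing more directly; reducing() becomes useful when the reduction has to happen per group. A rough sketch with invented names and data:

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ReducingSketch {

    // Whole-stream sum: Stream.reduce() is enough.
    static int total() {
        return Stream.of(1, 2, 3, 4, 5, 6).reduce(0, Integer::sum);
    }

    // Per-group sum: reducing() used as a downstream collector.
    static Map<String, Integer> sumByParity() {
        return Stream.of(1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(
                        n -> n % 2 == 0 ? "even" : "odd",
                        Collectors.reducing(0, Integer::sum)));
    }

    public static void main(String[] args) {
        System.out.println(total());                   // 21
        System.out.println(sumByParity().get("even")); // 12
        System.out.println(sumByParity().get("odd"));  // 9
    }
}
```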

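Pre/post processing

Collectors.groupingBy() and Collectors.groupingByConcurrent() differ only in whether they target single-threaded or multi-threaded use. Why do I file groupingBy under pre/post processing? Because groupingBy groups the data before it is collected, then hands each group's data to a downstream collector.

Below is the groupingBy overload with the most parameters: classifier is the classification function, mapFactory is the factory for the result map, and downstream is the downstream collector. It is the downstream collector that lets you pull off all sorts of tricks with the data as it is passed downstream.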

public static <T, K, D, A, M extends Map<K, D>>
    Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier,
                                  Supplier<M> mapFactory,
                                  Collector<? super T, A, D> downstream)

Example: here a set of integers is split into negative numbers, zero and positive numbers. groupingByConcurrent() takes the same parameters, so I won't give a separate example for it.

        //Map<String,List<Integer>>
        Stream.of(-6, -7, -8, -9, 1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(integer -> {
                    if (integer < 0) {
                        return "negative";
                    } else if (integer == 0) {
                        return "zero";
                    } else {
                        return "positive";
                    }
                }));

        //Map<String,Set<Integer>>
        //custom downstream collector
        Stream.of(-6, -7, -8, -9, 1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(integer -> {
                    if (integer < 0) {
                        return "negative";
                    } else if (integer == 0) {
                        return "zero";
                    } else {
                        return "positive";
                    }
                },Collectors.toSet()));

        //Map<String,Set<Integer>>
        //custom map container and downstream collector
        Stream.of(-6, -7, -8, -9, 1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(integer -> {
                    if (integer < 0) {
                        return "negative";
                    } else if (integer == 0) {
                        return "zero";
                    } else {
                        return "positive";
                    }
                },LinkedHashMap::new,Collectors.toSet()));
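
For completeness, here is roughly what the concurrent variant looks like on a parallel stream. It is only a sketch: the class name, the labels and the choice of counting() as the downstream collector are my own assumptions.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupingByConcurrentSketch {

    // Same classifier idea as above, but the result is a ConcurrentMap
    // that all worker threads of the parallel stream write into.
    static ConcurrentMap<String, Long> countBySign() {
        return Stream.of(-6, -7, -8, -9, 0, 1, 2, 3, 4, 5, 6)
                .parallel()
                .collect(Collectors.groupingByConcurrent(
                        n -> n < 0 ? "negative" : (n == 0 ? "zero" : "positive"),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Long> counts = countBySign();
        System.out.println(counts.get("negative")); // 4
        System.out.println(counts.get("zero"));     // 1
        System.out.println(counts.get("positive")); // 6
    }
}
```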

Collectors.partitioningBy(), taken literally, is simply partitioning. It can split the data into at most two parts, because it partitions with a Predicate, and a Predicate only ever yields true or false. Apart from taking a Predicate where groupingBy takes a classifier function (and having no overload with a map factory), its parameters are the same as groupingBy's.

Example:
```
 //Map<Boolean,List<Integer>>
        Stream.of(0,1,0,1)
                .collect(Collectors.partitioningBy(integer -> integer==0));

        //Map<Boolean,Set<Integer>>
        //custom downstream collector
        Stream.of(0,1,0,1)
                .collect(Collectors.partitioningBy(integer -> integer==0,Collectors.toSet()));
```
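
One practical difference from groupingBy is worth knowing: as far as I can tell, the map returned by partitioningBy always has both a true and a false entry, even when one side is empty (the Javadoc states this explicitly from Java 9 on), whereas groupingBy only creates keys it actually encounters. A sketch with made-up names:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitionVsGroupSketch {

    // No even numbers in the input, yet the "true" key still exists.
    static Map<Boolean, List<Integer>> partitioned() {
        return Stream.of(1, 3, 5)
                .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // groupingBy with the same test only produces the "false" key.
    static Map<Boolean, List<Integer>> grouped() {
        return Stream.of(1, 3, 5)
                .collect(Collectors.groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        System.out.println(partitioned().get(true));     // []
        System.out.println(grouped().containsKey(true)); // false
    }
}
```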

Collectors.mapping() lets you choose which field gets collected.

Example:

         //List<String>
        Stream.of(studentA,studentB,studentC)
                .collect(Collectors.mapping(Student::getName,Collectors.toList()));
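
On its own, mapping() does nothing that Stream.map() followed by collect() couldn't; it is mainly useful as a downstream collector. A sketch under stated assumptions: the nested Student is a cut-down stand-in for this article's class, and treating the first four characters of the id as a year is purely my invention for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MappingSketch {

    // Minimal stand-in for the article's Student (id and name only).
    static class Student {
        private final String id;
        private final String name;

        Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String getId() { return id; }
        String getName() { return name; }
    }

    // Groups students by the (assumed) year prefix of the id,
    // keeping only the names in each group.
    static Map<String, List<String>> namesByYear() {
        return Stream.of(
                        new Student("20190001", "Amy"),
                        new Student("20190002", "Bob"),
                        new Student("20200001", "Cid"))
                .collect(Collectors.groupingBy(
                        s -> s.getId().substring(0, 4),
                        Collectors.mapping(Student::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(namesByYear().get("2019")); // [Amy, Bob]
        System.out.println(namesByYear().get("2020")); // [Cid]
    }
}
```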

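Collectors.collectingAndThen() applies one more step after collecting. If you need to do something with the data once it has been collected, this is very useful.

Example: here the result is turned into a listIterator after collecting. It's only a simple example; what you actually do in that step is up to your imagination.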
```
//listIterator 
Stream.of(studentA,studentB,studentC)
                .collect(Collectors.collectingAndThen(Collectors.toList(),List::listIterator));
```
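
A common practical use is to make the collected result read-only. The sketch below wraps the list with Collections.unmodifiableList; the class and helper names are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectingAndThenSketch {

    // Collect to a list, then wrap it so callers cannot modify it.
    static List<Integer> readOnly() {
        return Stream.of(1, 2, 3)
                .collect(Collectors.collectingAndThen(
                        Collectors.toList(),
                        Collections::unmodifiableList));
    }

    // Returns true if the list rejects modification.
    static boolean rejectsAdd(List<Integer> list) {
        try {
            list.add(4);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = readOnly();
        System.out.println(list);             // [1, 2, 3]
        System.out.println(rejectsAdd(list)); // true
    }
}
```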

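Summary

As the core of Stream, Collectors is rich and powerful. In the business code I've written there has been almost nothing Collectors couldn't do; when something looks too hard, think it over and try different combinations of these APIs, and I believe you can still get it done with Collectors.
Some time ago, to produce a sorted id, I spent about 6 hours combining these APIs, but luckily I did get it written. This is one of the complex operations from my business code: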

[Image: screenshot of the business code]

One more thing: where a Stream operation offers the same functionality as a collector in Collectors, use the Stream operation whenever you can; it reduces overhead.
