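1 Which data types do programmers operate on most?
Without question, collection types: List, Map, Set, or Python's dict.
2 List operations with Java streams:
Scenario 1: in Java 8, group a List into a Map by some condition, then deduplicate by a specific field, and finally count how many items remain in each group after deduplication.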
import java.util.*;
import java.util.stream.Collectors;

public class GroupByExample {
    public static void main(String[] args) {
        List<Person> list = new ArrayList<>();
        list.add(new Person("John", "Male", 20));
        list.add(new Person("Alice", "Female", 18));
        list.add(new Person("Bob", "Male", 20));
        list.add(new Person("Carol", "Female", 18));
        list.add(new Person("David", "Male", 20));
        // Group by one field; the result has the shape Map<key, List<Person>>
        // Group by gender
        Map<String, List<Person>> genderGroup = list.stream().collect(
                Collectors.groupingBy(Person::getGender));
        // Group by one field and count; the result has the shape Map<key, Long>,
        // i.e. how many records were aggregated under each key
        // Count the number of persons in each group
        Map<String, Long> countByGender = list.stream().collect(
                Collectors.groupingBy(Person::getGender, Collectors.counting()));
        // Group by several fields; the result is a nested Map, one level per field
        // Group by gender and age
        Map<String, Map<Integer, List<Person>>> ageGroup = list.stream().collect(
                Collectors.groupingBy(Person::getGender,
                        Collectors.groupingBy(Person::getAge)));
        // Count the number of persons in each gender and age group
        Map<String, Map<Integer, Long>> countByGenderAndAge = list.stream().collect(
                Collectors.groupingBy(Person::getGender,
                        Collectors.groupingBy(Person::getAge, Collectors.counting())));
        // Group by gender and remove duplicates based on name
        // (the TreeSet keeps the first Person seen per name and orders each group by name)
        Map<String, List<Person>> distinctNameByGender = list.stream().collect(
                Collectors.groupingBy(Person::getGender,
                        Collectors.collectingAndThen(
                                Collectors.toCollection(() ->
                                        new TreeSet<>(Comparator.comparing(Person::getName))),
                                ArrayList::new)));
    }
}
The POJO:
class Person {
    private String name;
    private String gender;
    private int age;

    public Person(String name, String gender, int age) {
        this.name = name;
        this.gender = gender;
        this.age = age;
    }
    // setters and getters omitted
}
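Scenario 1 ends with counting each group after deduplication, a step the example above stops short of. A minimal sketch of one way to do it follows. The class name DistinctCountSketch and the method countDistinctNamesByGender are made up for illustration, the nested Person is a trimmed stand-in for the POJO, and one duplicate name is added to the sample data so the deduplication has something to remove.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.stream.Collectors;

class DistinctCountSketch {

    // Trimmed stand-in for the Person POJO (getters only).
    static class Person {
        private final String name;
        private final String gender;
        private final int age;

        Person(String name, String gender, int age) {
            this.name = name;
            this.gender = gender;
            this.age = age;
        }

        String getName() { return name; }
        String getGender() { return gender; }
        int getAge() { return age; }
    }

    // Group by gender, deduplicate by name inside each group, then count what is left.
    static Map<String, Integer> countDistinctNamesByGender(List<Person> people) {
        return people.stream().collect(
                Collectors.groupingBy(Person::getGender,
                        Collectors.collectingAndThen(
                                Collectors.toCollection(() ->
                                        new TreeSet<Person>(Comparator.comparing(Person::getName))),
                                TreeSet::size)));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("John", "Male", 20),
                new Person("Alice", "Female", 18),
                new Person("Bob", "Male", 20),
                new Person("John", "Male", 25), // duplicate name within the Male group
                new Person("Carol", "Female", 18));
        // Male has two distinct names (John, Bob); Female has two (Alice, Carol).
        System.out.println(countDistinctNamesByGender(people));
    }
}
```

The only change from the deduplicating collector is the finisher: TreeSet::size replaces ArrayList::new, so each group collapses to its distinct count instead of a list.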
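When grouping a Java List by multiple fields, I usually prefer to wrap the key in a method that joins the fields with a separator. For example, I have an object called
ActualSortingLog (the current sorting log), and I want to group by sortingTime and pipeLine. For that I can create a method called
fetchGroupKey: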
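/**
 * Builds the group key from sortingTime and pipeLine.
 *
 * @param actualSortingLog the sorting log record to build the key for
 * @return the two fields joined by "#"
 */
private String fetchGroupKey(ActualSortingLog actualSortingLog) {
    return actualSortingLog.getSortingTime() + "#" + actualSortingLog.getPipeLine();
}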
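As a rough illustration of how such a key method plugs into groupingBy, here is a self-contained sketch. The nested ActualSortingLog is a hypothetical stand-in (the real class is not shown in this post, and the String field types are an assumption), and GroupKeySketch and groupByTimeAndPipeLine are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupKeySketch {

    // Hypothetical stand-in for ActualSortingLog; field types are assumed.
    static class ActualSortingLog {
        private final String sortingTime;
        private final String pipeLine;

        ActualSortingLog(String sortingTime, String pipeLine) {
            this.sortingTime = sortingTime;
            this.pipeLine = pipeLine;
        }

        String getSortingTime() { return sortingTime; }
        String getPipeLine() { return pipeLine; }
    }

    // Same idea as fetchGroupKey: join the grouping fields with a separator.
    static String fetchGroupKey(ActualSortingLog log) {
        return log.getSortingTime() + "#" + log.getPipeLine();
    }

    // The key method is passed to groupingBy as a method reference.
    static Map<String, List<ActualSortingLog>> groupByTimeAndPipeLine(List<ActualSortingLog> logs) {
        return logs.stream().collect(Collectors.groupingBy(GroupKeySketch::fetchGroupKey));
    }

    public static void main(String[] args) {
        List<ActualSortingLog> logs = Arrays.asList(
                new ActualSortingLog("202302220337", "A"),
                new ActualSortingLog("202302220337", "A"),
                new ActualSortingLog("202302220337", "B"));
        // Two keys: "202302220337#A" (two records) and "202302220337#B" (one record).
        groupByTimeAndPipeLine(logs).forEach((key, group) ->
                System.out.println(key + " -> " + group.size()));
    }
}
```

One thing to watch with a joined key: the separator should be a character that cannot occur in the field values, otherwise two different field combinations can produce the same key.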
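The benefit is decoupling: the key logic is easy to extend and the code stays readable. Going a step further, if you want to transform the fields before grouping, that becomes much simpler too. For example,
here I want to aggregate by minute and by the first 3 characters of the pipeline field, with the time formatted as yyyyMMddHHmm. The code looks like this: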
/**
 * Builds a group key from the sort time truncated to the minute plus the first three characters of the pipeline.
 *
 * @param actualSortingInfoDTO the record to build the key for
 * @return the minute (yyyyMMddHHmm) and the trimmed pipeline, joined by SEPARATOR
 */
private String makeGroupKeyMinuteWithDeviceCode(ActualSortingInfoDTO actualSortingInfoDTO) {
    DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern("yyyyMMddHHmm");
    return dateTimeFormatter.format(LocalDateTimeUtils.longToLocalTime(actualSortingInfoDTO.getSortTime())) + SEPARATOR +
            filterDeviceCode(actualSortingInfoDTO.getPipeline());
}
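Of course, I also wrapped two more helpers here, the time utility class LocalDateTimeUtils and the device-code trimming helper filterDeviceCode, because these small pieces of logic come up again and again in later slicing and grouping. Keeping them separate also makes them easier to test and maintain. I won't go into the details here.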
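Since neither helper is shown, the following is only a guess at what they might look like, inferred from how they are called. The behavior of longToLocalTime (epoch milliseconds to LocalDateTime), the "first three characters" rule in filterDeviceCode, the "#" value of SEPARATOR, and the explicit ZoneId parameter are all assumptions; the real LocalDateTimeUtils may use a fixed or system time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class KeyHelpersSketch {

    // Assumed value; the post does not show the SEPARATOR constant.
    static final String SEPARATOR = "#";

    // Assumed behavior of LocalDateTimeUtils.longToLocalTime: epoch milliseconds to LocalDateTime.
    static LocalDateTime longToLocalTime(long epochMillis, ZoneId zone) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), zone);
    }

    // Assumed behavior of filterDeviceCode: keep the first three characters of the pipeline.
    static String filterDeviceCode(String pipeline) {
        if (pipeline == null || pipeline.length() <= 3) {
            return pipeline;
        }
        return pipeline.substring(0, 3);
    }

    // Minute-level key plus trimmed pipeline, mirroring makeGroupKeyMinuteWithDeviceCode.
    static String makeGroupKey(long sortTimeMillis, String pipeline, ZoneId zone) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyyMMddHHmm");
        return formatter.format(longToLocalTime(sortTimeMillis, zone)) + SEPARATOR + filterDeviceCode(pipeline);
    }

    public static void main(String[] args) {
        // 1677037037000L is 2023-02-22 03:37:17 UTC.
        System.out.println(makeGroupKey(1677037037000L, "001223", ZoneOffset.UTC));
        // prints 202302220337#001
    }
}
```

Passing the zone explicitly keeps the key reproducible in tests; with the system default zone, the same timestamp can land in a different minute bucket label on another machine.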
Next, List deduplication logic, starting with simple deduplication:
@Test
@DisplayName("List deduplication test")
void testDuplicate() {
    ActualSortingInfoDTO mockData = mockData("001223", "1", 1677037037000L);
    ActualSortingInfoDTO mockData2 = mockData("002331", "2", 1677037037000L);
    // Simulate a duplicate sortingId; deduplication should remove it
    ActualSortingInfoDTO duplicateId = mockData("002331", "2", 1677037037001L);
    List<ActualSortingInfoDTO> list = new ArrayList<>();
    list.add(mockData);
    list.add(mockData2);
    list.add(duplicateId);
    List<ActualSortingInfoDTO> distinctList = list.stream().collect(Collectors.collectingAndThen(
            Collectors.toCollection(simpleTreeSetSupplier()),
            ArrayList::new));
    log.info("Data after simple deduplication: " + distinctList);
    // Deduplication with more complex logic, e.g. based on whether the first three characters of pipeLine are 001
    List<ActualSortingInfoDTO> complexDistinctList = list.stream().collect(Collectors.collectingAndThen(
            Collectors.toCollection(distinctPipeLine()), ArrayList::new));
    log.info("Data after complex deduplication: " + complexDistinctList);
    // List to Map after the complex deduplication (first three characters of pipeLine are 001
    // and sortingId is not 2); the first record, mockData, should be kept
    Map<String, List<ActualSortingInfoDTO>> distinctMap = ListStreamUtil.group(complexDistinctList, this::makeGroupKeyMinuteWithDeviceCode);
    Assertions.assertEquals(
            distinctMap.get(makeGroupKeyMinuteWithDeviceCode(mockData)).get(0).getPipeline(), "001223");
    Assertions.assertEquals(
            distinctMap.get(makeGroupKeyMinuteWithDeviceCode(mockData)).get(0).getSortingId(), "1");
    distinctMap.forEach((k, v) -> log.info("After grouping: " + k + " " + v));
}
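The test calls two helpers that this post does not show, simpleTreeSetSupplier() and ListStreamUtil.group. The sketch below is one plausible shape for them, assuming the simple supplier deduplicates by sortingId (as the comment in the test suggests) and that group is a thin wrapper around Collectors.groupingBy. The nested Dto is a hypothetical, trimmed-down ActualSortingInfoDTO, and DedupHelpersSketch and distinctBySortingId are names invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class DedupHelpersSketch {

    // Hypothetical, trimmed-down version of ActualSortingInfoDTO.
    static class Dto {
        private final String pipeline;
        private final String sortingId;
        private final long sortTime;

        Dto(String pipeline, String sortingId, long sortTime) {
            this.pipeline = pipeline;
            this.sortingId = sortingId;
            this.sortTime = sortTime;
        }

        String getPipeline() { return pipeline; }
        String getSortingId() { return sortingId; }
        long getSortTime() { return sortTime; }
    }

    // Assumed shape of simpleTreeSetSupplier(): a TreeSet ordered by sortingId,
    // so add() rejects a second element whose sortingId was already seen.
    static Supplier<TreeSet<Dto>> simpleTreeSetSupplier() {
        return () -> new TreeSet<Dto>(Comparator.comparing(Dto::getSortingId));
    }

    // Assumed shape of ListStreamUtil.group: group a list by a key function.
    static <K, T> Map<K, List<T>> group(List<T> list, Function<T, K> keyFn) {
        return list.stream().collect(Collectors.groupingBy(keyFn));
    }

    // Same collector chain as in the test: collect into the TreeSet, then copy to a List.
    static List<Dto> distinctBySortingId(List<Dto> list) {
        return list.stream().collect(Collectors.collectingAndThen(
                Collectors.toCollection(simpleTreeSetSupplier()),
                ArrayList::new));
    }

    public static void main(String[] args) {
        List<Dto> list = Arrays.asList(
                new Dto("001223", "1", 1677037037000L),
                new Dto("002331", "2", 1677037037000L),
                new Dto("002331", "2", 1677037037001L)); // duplicate sortingId
        List<Dto> distinct = distinctBySortingId(list);
        System.out.println(distinct.size()); // prints 2
        Map<String, List<Dto>> byPrefix = group(distinct, d -> d.getPipeline().substring(0, 3));
        System.out.println(byPrefix.keySet());
    }
}
```

Note that this style of deduplication returns the elements sorted by the comparator key rather than in their original list order, and for each key it keeps the first element encountered.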
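I also recommend wrapping complex deduplication logic in its own method. Here, for example, I wrapped it in a method called distinctPipeLine, which makes it easy to define all kinds of custom deduplication logic later.
The custom deduplication method looks like this: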
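private Supplier<TreeSet<ActualSortingInfoDTO>> distinctPipeLine() {
    // The comparator key is a Boolean, so the TreeSet keeps at most one element
    // per key value (one match and one non-match).
    return () -> new TreeSet<>(
            Comparator.comparing(actualSortingInfoDTO ->
                    actualSortingInfoDTO.getPipeline().startsWith("001")
                            && !Objects.equals(actualSortingInfoDTO.getSortingId(), "2")));
}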
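A comparator keyed on a boolean condition, like the one in distinctPipeLine, has a consequence that is easy to miss: the TreeSet can hold at most two elements, the first one that satisfies the condition and the first one that does not. The sketch below shows that behavior on plain strings and contrasts it with keying on the prefix itself, which keeps one element per distinct prefix. BooleanKeyDedupSketch and both method names are made up for the example, and every input string is assumed to have at least three characters.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class BooleanKeyDedupSketch {

    // Key is a boolean: everything that matches the prefix counts as "equal",
    // and so does everything that does not match.
    static TreeSet<String> dedupByMatch(List<String> pipelines, String prefix) {
        Comparator<String> byMatch = Comparator.comparing((String p) -> p.startsWith(prefix));
        return pipelines.stream().collect(Collectors.toCollection(() -> new TreeSet<String>(byMatch)));
    }

    // Key is the three-character prefix: one element survives per distinct prefix.
    static TreeSet<String> dedupByPrefix(List<String> pipelines) {
        Comparator<String> byPrefix = Comparator.comparing((String p) -> p.substring(0, 3));
        return pipelines.stream().collect(Collectors.toCollection(() -> new TreeSet<String>(byPrefix)));
    }

    public static void main(String[] args) {
        List<String> pipelines = Arrays.asList("001223", "002331", "001999", "003000");
        System.out.println(dedupByMatch(pipelines, "001"));
        // prints [002331, 001223]  (false sorts before true)
        System.out.println(dedupByPrefix(pipelines));
        // prints [001223, 002331, 003000]
    }
}
```

Which of the two is right depends on the intent: the boolean key fits "keep one representative of the matching records", while the prefix key fits "keep one record per device-code prefix".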