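Why serialization is needed
- Data persistence (for example, storing session information in Redis) or transmission over the network (for example, RPC remote calls)
Factors to consider when choosing a serialization method
- Performance: the faster, the better
- Serialized size: the fewer bytes, the better, which saves bandwidth and storage space
- Compatibility: when a class definition changes, can old serialized data still be deserialized with the new class, and vice versa? If the serialized content lives only in memory and is cleared on every release (a stop-the-service release), compatibility can be ignored; otherwise it has to be considered. Gray (canary) releases also require compatibility. Compatibility here means adding or removing fields; it does not cover changing a field's type.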
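Common serialization methods
- JDK built-in ObjectInputStream and ObjectOutputStream: the class must implement Serializable, and serialVersionUID must be hard-coded if compatibility is needed. Slow, and the output is large.
- JSON libraries (jackson, gson, fastjson): somewhat faster than the JDK and somewhat smaller, human-readable, supported by practically every mainstream language, so cross-language support is excellent. Compatibility is good. However, class and field type information is not serialized, so the target class has to be specified when deserializing.
- hessian: better speed and size than the JDK, but poor compatibility. If a subclass and its superclass have a property with the same name, data is lost after deserialization. The reason: hessian writes the subclass Fields first and then the superclass Fields, and the values are written in the same order. Because the superclass Field is usually null, deserialization ends up overwriting the subclass value with the trailing null from the superclass. For details see: https://www.cnblogs.com/yfyzy/p/7197679.html. Some hessian classes are not public and cannot be extended, so changing this behavior means patching the source.
- hessian-lite: the default serialization protocol in Alibaba's dubbo project, derived from hessian. It fixes the data-loss problem by reversing the Field list after collecting all Fields. In testing, however, hessian-lite turned out to be too slow; see the issue: https://github.com/dubbo/hessian-lite/issues/10
- kryo: very good speed and overall performance. It is not compatible by default, but configuring CompatibleFieldSerializer adds compatibility. It still does not allow a superclass and a subclass to have properties with the same name; the duplicate property can be filtered out by extending the serializer. The official documentation is very detailed: https://github.com/EsotericSoftware/kryo#compatiblefieldserializer-settings
- fst: the best in both speed and size. Unfortunately, compatibility requires adding @Version to fields, and fields can only be added, never removed, which is too intrusive for business code. If compatibility is not a concern, fst is worth considering. Reference: https://blog.csdn.net/dutlxq2014/article/details/86698268. Wiki: https://github.com/RuedigerMoeller/fast-serialization/wiki
- Formats that need static compilation, such as protobuf and thrift: well suited to RPC between internal systems. This article does not cover them.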
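To make the JDK option above concrete, here is a minimal sketch of a round trip through ObjectOutputStream and ObjectInputStream with a pinned serialVersionUID. The class names JdkSerializationSketch and User are hypothetical and exist only for illustration; they are not from the original test code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class; names are illustrative only.
final class JdkSerializationSketch {

    // Pinning serialVersionUID keeps the class version stable when
    // fields are added or removed later.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String name;

        User(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Read the bytes only after the object stream is closed and flushed.
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        User copy = (User) deserialize(serialize(new User(123L, "demo")));
        System.out.println(copy.id + " " + copy.name);
        // Shows the version id that is written into the stream.
        System.out.println(ObjectStreamClass.lookup(User.class).getSerialVersionUID());
    }
}
```

Without the explicit constant, the JDK derives serialVersionUID from the class structure, so a changed class would typically be rejected with InvalidClassException when old data is read.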
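Current bugs in kryo
- Do not call new Kryo() for every operation, because that is too slow; keep Kryo instances in a ThreadLocal or in a pool. The pool in the current release has a bug, though: https://github.com/EsotericSoftware/kryo/issues/642. Whenever the pool is empty a new instance is created, and when that instance is returned to a pool that is already full, a queue full exception is thrown. There is a second pooling bug as well: https://github.com/EsotericSoftware/kryo/issues/664. For now the only option is to implement pooling yourself.
- If a bean field's type is changed after data has been serialized, deserializing the old data can crash the JVM. Field types should not be changed in the first place, but a JVM crash is still a serious problem. See: https://github.com/EsotericSoftware/kryo/issues/663. Deserialization appears to succeed only because kryo reads the original type from the serialized bytes and then writes the value straight into memory through Unsafe.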
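As a rough illustration of "implement pooling yourself", the sketch below is a small bounded pool that never throws when it is full. SimplePool is a hypothetical class, not the author's MzKryoPool and not Kryo's own pool; it is generic so that it runs without Kryo on the classpath. I am assuming the "queue full" error is the kind that a bounded queue's add() raises, which offer() avoids.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical pool for illustration; with Kryo it would hold Kryo instances.
final class SimplePool<T> {
    private final BlockingQueue<T> idle;
    private final Supplier<T> factory;

    SimplePool(int capacity, Supplier<T> factory) {
        this.idle = new ArrayBlockingQueue<>(capacity);
        this.factory = factory;
    }

    // Reuse an idle instance if one exists, otherwise create a new one.
    T obtain() {
        T instance = idle.poll();
        return instance != null ? instance : factory.get();
    }

    // offer() returns false when the queue is full, whereas add() would throw
    // IllegalStateException. A surplus instance is simply dropped.
    boolean free(T instance) {
        return idle.offer(instance);
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(1, StringBuilder::new);
        StringBuilder a = pool.obtain();
        StringBuilder b = pool.obtain();   // pool was empty, so this is a second instance
        System.out.println(pool.free(a));  // accepted
        System.out.println(pool.free(b));  // pool already full, dropped without an exception
    }
}
```

With Kryo, the factory would create and configure a Kryo instance, and obtain()/free() would wrap each serialize or deserialize call in a try/finally.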
Performance and size comparison
The SerializeBenchmarkTest3 test class benchmarks several of these serialization methods.
Test data:
// Builds the benchmark payload: one Person with a book, scores and 1000 friends.
// The Chinese string literals are kept as-is because the reported byte sizes depend on them.
private Person getPerson() {
    Person person = new Person();
    person.setId(123L);
    person.setName("你好啊");
    person.setMarried(true);
    person.setAge(22);
    person.setDigits(Arrays.asList(1L, 3L, 100L));

    Map<String, Double> scoreMap = new LinkedHashMap<>();
    scoreMap.put("chinese", 90d);
    scoreMap.put("english", 80.5d);
    person.setScores(scoreMap);

    Book book = new Book();
    book.setId(99L);
    book.setName("代碼大全");
    book.setPrice(56.00d);
    person.setBook(book);

    int friendsCount = 1000;
    List<Person> friends = new ArrayList<>(friendsCount);
    for (int i = 0; i < friendsCount; i++) {
        Person friend = new Person();
        friend.setId((long) i);
        friend.setName("我的朋友" + i);
        friend.setMarried(i % 2 == 0);
        friends.add(friend);
    }
    person.setFriends(friends);
    return person;
}
The test1 method checks the serialized size in bytes and the MD5. The results are as follows:
2019-03-23 15:22:04,916 WARN [main] c.y.o.s.SerializeBenchmarkTest3.jdkPerformance(54) - jdk serialized length: 52797, length same before/after: true, md5 same: true, object equals: true
2019-03-23 15:22:05,226 WARN [main] c.y.o.s.SerializeBenchmarkTest3.jsonPerformance(71) - json serialized length: 59457, length same before/after: true, md5 same: true, object equals: true
2019-03-23 15:22:05,309 WARN [main] c.y.o.s.SerializeBenchmarkTest3.hessian2Performance(88) - hessian2 serialized length: 26124, length same before/after: false, md5 same: false, object equals: false
2019-03-23 15:22:05,380 WARN [main] c.y.o.s.SerializeBenchmarkTest3.hessianLitePerformance(105) - hessian-lite serialized length: 26144, length same before/after: false, md5 same: false, object equals: true
2019-03-23 15:22:05,661 WARN [main] c.y.o.s.SerializeBenchmarkTest3.kryoPerformance(122) - kryo serialized length: 28101, length same before/after: true, md5 same: true, object equals: true
2019-03-23 15:22:05,696 WARN [main] c.y.o.s.SerializeBenchmarkTest3.fstPerformance(139) - fst serialized length: 33839, length same before/after: true, md5 same: true, object equals: true
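The test code behind these log lines is not shown, so the following is only a sketch of how such a check could be written: serialize, deserialize, serialize again, then compare the two byte arrays by length and MD5 and compare the objects with equals. RoundTripCheck and its methods are hypothetical names, and JDK serialization is used purely as a stand-in serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper; the original test class is not shown in the post.
final class RoundTripCheck {

    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns {same length, same MD5, equals} for serialize -> deserialize -> serialize.
    static boolean[] check(Object original,
                           Function<Object, byte[]> serializer,
                           Function<byte[], Object> deserializer) {
        byte[] first = serializer.apply(original);
        Object copy = deserializer.apply(first);
        byte[] second = serializer.apply(copy);
        return new boolean[] {
            first.length == second.length,
            MessageDigest.isEqual(md5(first), md5(second)),
            original.equals(copy)
        };
    }

    // Stand-in serializer: plain JDK serialization.
    static byte[] jdkSerialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static Object jdkDeserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Long> digits = new ArrayList<>(Arrays.asList(1L, 3L, 100L));
        boolean[] result = check(digits, RoundTripCheck::jdkSerialize, RoundTripCheck::jdkDeserialize);
        System.out.println(Arrays.toString(result));
    }
}
```

Any of the other serializers could be plugged in through the two function arguments, which is what makes the three flags comparable across libraries.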
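Person extends Human, and both classes declare a property named id, so the hessian2 bug causes data loss during serialization. Oddly, although hessian-lite fixes that bug, its serialized byte length before and after the round trip still differs.
The results show that, for a small payload, hessian2 and hessian-lite have a slight size advantage, kryo and fst come next, and jdk and json are the largest.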
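The same-name property problem can be modelled without hessian at all. The sketch below is a simplified simulation based only on the explanation given in this article, not on hessian's actual code: fields are collected subclass first, values are written in that order, and on reading each name resolves to a single field, so the last value written for "id" wins. FieldOrderSketch, Human and Person here are hypothetical stand-ins, and the reverse flag imitates the fix attributed to hessian-lite.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the behaviour described in the article; it does not use hessian.
final class FieldOrderSketch {
    static class Human { Long id; }
    static class Person extends Human { Long id; String name; }

    // Collects instance fields starting at the subclass and walking up the hierarchy.
    static List<Field> collectFields(Class<?> type, boolean reverse) {
        List<Field> fields = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                fields.add(f);
            }
        }
        if (reverse) Collections.reverse(fields); // superclass fields now come first
        return fields;
    }

    static Person roundTrip(Person source, boolean reverse) {
        try {
            // "Serialize": record (field name, value) pairs in field order.
            List<Map.Entry<String, Object>> wire = new ArrayList<>();
            for (Field f : collectFields(Person.class, reverse)) {
                wire.add(new AbstractMap.SimpleEntry<>(f.getName(), f.get(source)));
            }
            // "Deserialize": each name maps to one field; the subclass field wins.
            Map<String, Field> byName = new HashMap<>();
            for (Field f : collectFields(Person.class, false)) {
                byName.putIfAbsent(f.getName(), f);
            }
            Person target = new Person();
            for (Map.Entry<String, Object> e : wire) {
                byName.get(e.getKey()).set(target, e.getValue());
            }
            return target;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.id = 123L;       // sets Person.id; Human.id stays null
        p.name = "demo";
        System.out.println(roundTrip(p, false).id); // null: the superclass null overwrote 123
        System.out.println(roundTrip(p, true).id);  // 123: the subclass value is written last
    }
}
```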
Next, the same data was serialized and deserialized 10000 times. The results are as follows (elapsed times are in milliseconds, judging by the log timestamps):
14 young GCs (YGC)
13.658: [GC (Allocation Failure) [PSYoungGen: 682646K->233K(691200K)] 686763K->4350K(2089472K), 0.0018965 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2019-03-23 15:31:22,073 WARN [main] c.y.o.s.SerializeBenchmarkTest3.jdkPerformance(61) - jdk serialize and deserialize 10000 times, elapsed: 13699
9 young GCs (YGC)
19.258: [GC (Allocation Failure) [PSYoungGen: 687902K->318K(693248K)] 692530K->4946K(2091520K), 0.0006095 secs] [Times: user=0.06 sys=0.00, real=0.00 secs]
2019-03-23 15:31:27,321 WARN [main] c.y.o.s.SerializeBenchmarkTest3.jsonPerformance(78) - json serialize and deserialize 10000 times, elapsed: 5245
5 young GCs (YGC)
24.824: [GC (Allocation Failure) [PSYoungGen: 689776K->121K(694784K)] 694539K->4909K(2093056K), 0.0007943 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2019-03-23 15:31:32,671 WARN [main] c.y.o.s.SerializeBenchmarkTest3.hessian2Performance(95) - hessian2 serialize and deserialize 10000 times, elapsed: 5349
11 young GCs (YGC)
35.886: [GC (Allocation Failure) [PSYoungGen: 694368K->64K(696320K)] 699315K->5011K(2094592K), 0.0073285 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
2019-03-23 15:31:43,688 WARN [main] c.y.o.s.SerializeBenchmarkTest3.hessianLitePerformance(112) - hessian-lite serialize and deserialize 10000 times, elapsed: 11017
3 young GCs (YGC)
39.634: [GC (Allocation Failure) [PSYoungGen: 694880K->64K(696832K)] 700003K->5203K(2095104K), 0.0007231 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2019-03-23 15:31:48,061 WARN [main] c.y.o.s.SerializeBenchmarkTest3.kryoPerformance(129) - kryo serialize and deserialize 10000 times, elapsed: 4373
3 young GCs (YGC)
43.816: [GC (Allocation Failure) [PSYoungGen: 694945K->193K(696832K)] 700308K->5556K(2095104K), 0.0007227 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2019-03-23 15:31:51,626 WARN [main] c.y.o.s.SerializeBenchmarkTest3.fstPerformance(146) - fst serialize and deserialize 10000 times, elapsed: 3564
JVM options: -Xms2g -Xmx2g -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
fst is the fastest, followed by kryo; json and hessian2 are reasonably fast, while hessian-lite and jdk are about equally slow.
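Best practices
- Classes that are serialized should preferably implement the Serializable interface and hard-code serialVersionUID
- Fields may be added to or removed from serialized classes, but field types should preferably not be changed
- For a public API, prefer JSON: it is readable, web-friendly, compatible across versions, and not tied to any language. Its only drawback is the wasted bandwidth
- For internal RPC, fst and kryo are good choices, as are protobuf and thrift. If several versions have to stay compatible, fst is not a good fit
- If the data is persisted, compatibility has to be taken into account; kryo or json can be used
Serialization utility class: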
// Imports these members need. JSON is assumed to be fastjson's com.alibaba.fastjson.JSON,
// based on the toJSONBytes/parseObject calls. MzKryoPool and OperationException are the
// author's own classes and are not shown in the article.
import com.alibaba.fastjson.JSON;
import com.caucho.hessian.io.Hessian2Input;
import com.caucho.hessian.io.Hessian2Output;
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.nustaq.serialization.FSTConfiguration;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Shared, reusable instances: creating a Kryo or FSTConfiguration per call is expensive.
static MzKryoPool<Kryo> kryoPool = new MzKryoPool<Kryo>(100);
static FSTConfiguration fst = FSTConfiguration.createDefaultConfiguration();

public static <T> byte[] serializeWithJdk(T object) {
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(4096);
    try (ObjectOutputStream objectOutputStream = new ObjectOutputStream(byteArrayOutputStream)) {
        objectOutputStream.writeObject(object);
    } catch (IOException e) {
        throw new OperationException("serialize with jdk fail: " + e.getMessage(), e);
    }
    // Read the bytes only after the object stream is closed, so buffered data is flushed.
    return byteArrayOutputStream.toByteArray();
}

public static Object deserializeWithJdk(byte[] bytes) {
    try (ObjectInputStream objectInputStream = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
        return objectInputStream.readObject();
    } catch (ClassNotFoundException | IOException e) {
        throw new OperationException("deserialize with jdk fail: " + e.getMessage(), e);
    }
}

public static byte[] serializeWithJson(Object object) {
    return JSON.toJSONBytes(object);
}

// JSON carries no class information, so the target class must be passed in.
public static <T> T deserializeWithJson(byte[] bytes, Class<T> cls) {
    return JSON.parseObject(bytes, cls);
}

public static byte[] serializeWithHessian2(Object object) {
    try {
        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(4096);
        Hessian2Output hessianOutput = new Hessian2Output(byteArrayOutputStream);
        hessianOutput.startMessage();
        hessianOutput.writeObject(object);
        hessianOutput.completeMessage();
        hessianOutput.close();
        return byteArrayOutputStream.toByteArray();
    } catch (IOException e) {
        throw new OperationException("serialize with hessian2 fail: " + e.getMessage(), e);
    }
}

public static Object deserializeWithHessian2(byte[] bytes) {
    try {
        Hessian2Input hessian2Input = new Hessian2Input(new ByteArrayInputStream(bytes));
        hessian2Input.startMessage();
        Object o = hessian2Input.readObject();
        hessian2Input.completeMessage();
        hessian2Input.close();
        return o;
    } catch (IOException e) {
        throw new OperationException("deserialize with hessian2 fail: " + e.getMessage(), e);
    }
}

// hessian-lite uses the same class names in a different package, hence the fully qualified names.
public static byte[] serializeWithHessianLite(Object object) {
    try {
        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(4096);
        com.alibaba.com.caucho.hessian.io.Hessian2Output hessian2Output = new com.alibaba.com.caucho.hessian.io.Hessian2Output(byteArrayOutputStream);
        hessian2Output.startMessage();
        hessian2Output.writeObject(object);
        hessian2Output.completeMessage();
        hessian2Output.close();
        return byteArrayOutputStream.toByteArray();
    } catch (IOException e) {
        throw new OperationException("serialize with hessian-lite fail: " + e.getMessage(), e);
    }
}

public static Object deserializeWithHessianLite(byte[] bytes) {
    try {
        com.alibaba.com.caucho.hessian.io.Hessian2Input hessian2Input = new com.alibaba.com.caucho.hessian.io.Hessian2Input(new ByteArrayInputStream(bytes));
        hessian2Input.startMessage();
        Object o = hessian2Input.readObject();
        hessian2Input.completeMessage();
        hessian2Input.close();
        return o;
    } catch (IOException e) {
        throw new OperationException("deserialize with hessian-lite fail: " + e.getMessage(), e);
    }
}

public static byte[] serializeWithKryo(Object obj) {
    Kryo kryo = kryoPool.obtain();
    // Buffer starts at 4 KB and may grow up to 10 MB.
    try (Output output = new Output(4096, 10 * 1024 * 1024)) {
        kryo.writeClassAndObject(output, obj);
        return output.toBytes();
    } catch (Exception e) {
        throw new OperationException("serialize with kryo fail: " + e.getMessage(), e);
    } finally {
        // Always return the instance to the pool.
        kryoPool.free(kryo);
    }
}

public static Object deserializeWithKryo(byte[] bytes) {
    Kryo kryo = kryoPool.obtain();
    try (Input input = new Input(bytes)) {
        return kryo.readClassAndObject(input);
    } catch (Exception e) {
        throw new OperationException("deserialize with kryo fail: " + e.getMessage(), e);
    } finally {
        kryoPool.free(kryo);
    }
}

public static byte[] serializeWithFst(Object obj) {
    return fst.asByteArray(obj);
}

public static Object deserializeWithFst(byte[] bytes) {
    return fst.asObject(bytes);
}