https://www.zhihu.com/question/362455124?sort=created
Intel's model compression tool: distiller (key focus)
https://github.com/NervanaSystems/distiller (supports PyTorch; officially tested on PyTorch 1.3; the top-starred result when searching "PyTorch Pruning" on GitHub)
Microsoft's AutoML tool NNI also has a model-compression module: https://github.com/microsoft/nni/blob/master/examples/model_compress/QAT_torch_quantizer.py
https://nni.readthedocs.io/zh/latest/Compressor/Quantizer.html
https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py
PyTorch's built-in quantization tools (PyTorch ≥ 1.3)
https://zhuanlan.zhihu.com/p/81026071
https://github.com/pytorch/glow/blob/master/docs/Quantization.md
https://github.com/pytorch/QNNPACK
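For orientation, the core arithmetic underlying these uint8 tools is the same affine (asymmetric) scheme: q = clamp(round(x / scale) + zero_point, 0, 255), with x ≈ (q - zero_point) * scale. A minimal pure-Python sketch, not any library's actual API (the function names here are my own):

```python
def choose_qparams(x_min, x_max, qmin=0, qmax=255):
    """Pick scale and zero_point so [x_min, x_max] maps onto [qmin, qmax]."""
    x_min = min(x_min, 0.0)  # the representable range must include 0.0
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return max(qmin, min(qmax, zero_point)), scale

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Float -> uint8: divide by scale, shift by zero_point, clamp."""
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    """uint8 -> float: the approximate inverse; error is at most scale/2."""
    return [(q - zero_point) * scale for q in qs]

xs = [-1.0, 0.0, 0.5, 2.0]
zp, scale = choose_qparams(min(xs), max(xs))
recovered = dequantize(quantize(xs, scale, zp), scale, zp)
```

With the range [-1, 2] above, scale is 3/255 and zero_point is 85, so -1.0 maps to 0, 0.0 to 85, and 2.0 to 255.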
NNI's quantization looks quite usable judging from the examples (PyTorch's official quantization docs left me dizzy; not friendly to quantization beginners):
https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py
NNI offers four quantizers:
1. Naive Quantizer, 2. QAT Quantizer, 3. DoReFa Quantizer, 4. BNN Quantizer
Of these, 1 seems the most basic: it simply converts 32-bit weights to 8-bit at inference time, so I'd rather not use it. As for 4, what is a binary neural network anyway? Bitwise arithmetic sounds nice, but I'm not sure whether there are pitfalls at deployment.
2. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018). NNI's docs say batch-norm folding is not supported; I don't know how much that matters.
3. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (Megvii/Face++, 2016). The example looks simple. https://arxiv.org/abs/1606.06160, https://blog.csdn.net/langzi453/article/details/88172080
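The DoReFa weight quantizer from the paper is only a few lines. A pure-Python sketch of its forward pass (function names are mine, not from NNI or the paper's code):

```python
import math

def quantize_k(x, k):
    """DoReFa's k-bit uniform quantizer for x in [0, 1]:
    snap x to the nearest of 2^k - 1 evenly spaced levels."""
    n = float(2 ** k - 1)
    return round(x * n) / n

def dorefa_quantize_weight(w, k, max_abs_tanh):
    """DoReFa weight quantization (forward pass only, no STE gradient):
    squash w into [0, 1] via tanh, quantize to k bits, rescale to [-1, 1].
    max_abs_tanh is max(|tanh(w_i)|) over the whole weight tensor."""
    x = math.tanh(w) / (2.0 * max_abs_tanh) + 0.5   # now in [0, 1]
    return 2.0 * quantize_k(x, k) - 1.0             # back to [-1, 1]
```

With k = 1 this collapses to sign-style binarization; during training the rounding is paired with a straight-through estimator for gradients, which this sketch omits.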
Is NNI's quantization only simulation rather than actual acceleration? https://github.com/microsoft/nni/issues/2332
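For context, "simulation" here means fake quantization: values are rounded to the quantized grid but kept as floats, so the network still runs in float arithmetic and there is no integer speedup; acceleration requires an integer-kernel backend at deployment. A minimal sketch (hypothetical helper, not NNI's API):

```python
def fake_quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Simulated ('fake') quantization as used in QAT-style training:
    each float is snapped to its nearest representable quantized value
    but returned as a float, so only the quantization error is modeled,
    not the integer arithmetic that would make inference faster."""
    out = []
    for x in xs:
        q = max(qmin, min(qmax, round(x / scale) + zero_point))  # int grid
        out.append((q - zero_point) * scale)                     # back to float
    return out

simulated = fake_quantize([0.123, 0.456, 0.789], scale=1 / 255, zero_point=0)
```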
https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py
Papers with Code GitHub ranking for quantization:
https://paperswithcode.com/search?q_meta=&q=Quantizer
#1: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018)
https://paperswithcode.com/paper/quantization-and-training-of-neural-networks (TensorFlow only, which puts me off)
#2: Training with Quantization Noise for Extreme Model Compression (2020)
https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme (PyTorch implementation)
Model compression benchmark: https://paperswithcode.com/task/model-compression
Quantization benchmark: https://paperswithcode.com/task/quantization
Another DoReFa implementation: https://github.com/666DZY666/model-compression
For QAT there is also Xilinx's brevitas: https://github.com/Xilinx/brevitas