https://www.zhihu.com/question/362455124?sort=created
Intel's model compression tool: distiller (key focus)
https://github.com/NervanaSystems/distiller (supports PyTorch; officially tested on PyTorch 1.3; the top-starred result when searching "PyTorch Pruning" on GitHub)
Microsoft's AutoML tool NNI also has a model-compression module: https://github.com/microsoft/nni/blob/master/examples/model_compress/QAT_torch_quantizer.py
https://nni.readthedocs.io/zh/latest/Compressor/Quantizer.html
https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py
PyTorch's built-in quantization tools (PyTorch ≥ 1.3)
https://zhuanlan.zhihu.com/p/81026071
https://github.com/pytorch/glow/blob/master/docs/Quantization.md
https://github.com/pytorch/QNNPACK
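For orientation, the core arithmetic underlying these uint8 tools is the same affine (asymmetric) scheme: q = clamp(round(x / scale) + zero_point, 0, 255), with x ≈ (q - zero_point) * scale. A minimal pure-Python sketch, not any library's actual API (the function names here are my own):

```python
def choose_qparams(x_min, x_max, qmin=0, qmax=255):
    """Pick scale and zero_point so [x_min, x_max] maps onto [qmin, qmax]."""
    x_min = min(x_min, 0.0)  # the representable range must include 0.0
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return max(qmin, min(qmax, zero_point)), scale

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Float -> uint8: divide by scale, shift by zero_point, clamp."""
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    """uint8 -> float: the approximate inverse; error is at most scale/2."""
    return [(q - zero_point) * scale for q in qs]

xs = [-1.0, 0.0, 0.5, 2.0]
zp, scale = choose_qparams(min(xs), max(xs))
recovered = dequantize(quantize(xs, scale, zp), scale, zp)
```

With the range [-1, 2] above, scale is 3/255 and zero_point is 85, so -1.0 maps to 0, 0.0 to 85, and 2.0 to 255.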
NNI's quantization looks quite usable judging from the examples (PyTorch's official quantization docs left me dizzy; not friendly to quantization beginners):
https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py
NNI offers four quantizers:
1. Naive Quantizer, 2. QAT Quantizer, 3. DoReFa Quantizer, 4. BNN Quantizer
Of these, 1 seems the most basic: it simply converts 32-bit weights to 8-bit at inference time, so I'd rather not use it. As for 4, what is a binary neural network anyway? Bitwise arithmetic sounds nice, but I'm not sure whether there are pitfalls at deployment.
2. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018). NNI's docs say batch-norm folding is not supported; I don't know how much that matters.
3. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (Megvii/Face++, 2016). The example looks simple. https://arxiv.org/abs/1606.06160, https://blog.csdn.net/langzi453/article/details/88172080
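The DoReFa weight quantizer from the paper is only a few lines. A pure-Python sketch of its forward pass (function names are mine, not from NNI or the paper's code):

```python
import math

def quantize_k(x, k):
    """DoReFa's k-bit uniform quantizer for x in [0, 1]:
    snap x to the nearest of 2^k - 1 evenly spaced levels."""
    n = float(2 ** k - 1)
    return round(x * n) / n

def dorefa_quantize_weight(w, k, max_abs_tanh):
    """DoReFa weight quantization (forward pass only, no STE gradient):
    squash w into [0, 1] via tanh, quantize to k bits, rescale to [-1, 1].
    max_abs_tanh is max(|tanh(w_i)|) over the whole weight tensor."""
    x = math.tanh(w) / (2.0 * max_abs_tanh) + 0.5   # now in [0, 1]
    return 2.0 * quantize_k(x, k) - 1.0             # back to [-1, 1]
```

With k = 1 this collapses to sign-style binarization; during training the rounding is paired with a straight-through estimator for gradients, which this sketch omits.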
Is NNI's quantization only simulation rather than actual acceleration? https://github.com/microsoft/nni/issues/2332
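For context, "simulation" here means fake quantization: values are rounded to the quantized grid but kept as floats, so the network still runs in float arithmetic and there is no integer speedup; acceleration requires an integer-kernel backend at deployment. A minimal sketch (hypothetical helper, not NNI's API):

```python
def fake_quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Simulated ('fake') quantization as used in QAT-style training:
    each float is snapped to its nearest representable quantized value
    but returned as a float, so only the quantization error is modeled,
    not the integer arithmetic that would make inference faster."""
    out = []
    for x in xs:
        q = max(qmin, min(qmax, round(x / scale) + zero_point))  # int grid
        out.append((q - zero_point) * scale)                     # back to float
    return out

simulated = fake_quantize([0.123, 0.456, 0.789], scale=1 / 255, zero_point=0)
```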
https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py
Papers with Code GitHub ranking for quantization:
https://paperswithcode.com/search?q_meta=&q=Quantizer
#1: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018)
https://paperswithcode.com/paper/quantization-and-training-of-neural-networks (TensorFlow only, which puts me off)
#2: Training with Quantization Noise for Extreme Model Compression (2020)
https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme (PyTorch implementation)
Model compression benchmark: https://paperswithcode.com/task/model-compression
Quantization benchmark: https://paperswithcode.com/task/quantization
Another DoReFa implementation: https://github.com/666DZY666/model-compression
For QAT there is also Xilinx's brevitas: https://github.com/Xilinx/brevitas