Exploring a Rockchip Board [RK3588]
1 System Information
1.1 Checking the RK series
root@firefly:~# lspci
0002:20:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd Device 3588 (rev 01)
0002:21:00.0 Network controller: Broadcom Inc. and subsidiaries Device 449d (rev 02)
RK3588 hardware information:
- The embedded NPU supports INT4/INT8/INT16/FP16 mixed-precision computation, with up to 6 TOPS of compute;
- Supports 8K@60fps H.265 and VP9 decoding, 8K@30fps H.264 decoding, and 4K@60fps AV1 decoding;
- Supports 8K@30fps H.264 and H.265 encoding, a high-quality JPEG encoder/decoder, and dedicated image pre- and post-processors.
NPU acceleration:
Using the NPU requires downloading the RKNN SDK, which provides the programming interface for the NPU-equipped RK3588S/RK3588 platforms; it lets users deploy RKNN models exported with RKNN-Toolkit2 and speeds up bringing AI applications to production.
1.2 Checking NPU utilization
cat /sys/kernel/debug/rknpu/load
watch -n 1 cat /sys/kernel/debug/rknpu/load
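For scripted monitoring, here is a minimal Python sketch (a hypothetical helper; run it on the board, usually as root since the node lives under debugfs) that polls the same load node once per second:

import time

# Poll the NPU load node exposed by the rknpu driver (same file as above).
while True:
    with open("/sys/kernel/debug/rknpu/load") as f:
        print(f.read().strip())
    time.sleep(1)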
2 Rockchip RKNN Development Workflow
Bilibili video tutorial: https://www.bilibili.com/video/BV1Kj411D78q?p=3&vd_source=9138e2a910cf9bbb083cd42a6750ed10
The projects involved in RKNN model development are:
① rknn-toolkit2: https://github.com/rockchip-linux/rknn-toolkit2 — provides Python programming interfaces for the Rockchip NPU platforms (RK3566, RK3568, RK3588, RK3588S) to help users deploy RKNN models and accelerate the implementation of AI applications.
Provides the Python API (the RKNN class; runs under x86)
Model conversion: converts ONNX, PyTorch, TensorFlow and other models to RKNN
Model inference: supports two modes, simulator and connected-board inference (see the sketch after this list). If a target is specified, i.e. rknn.init_runtime(target='rk3588'), inference runs connected-board and needs a USB connection to the box; if no target is given, i.e. rknn.init_runtime(target=None), inference runs in the rknn-toolkit2 simulator by default
Performance evaluation: offers analysis functions such as accuracy_analysis, eval_perf and eval_memory; see the documents under rknn-toolkit2/doc and rknn-toolkit2/rknn_toolkit_lite2/docs in the repo
Nearly all the documentation needed for development is in this repo; read the relevant docs carefully before going deeper.
The directory rknn-toolkit2/examples/onnx/yolov5 contains yolov5 conversion and inference examples, but the yolov5 used by Coovally needs a sigmoid in post-processing, which that example lacks.
② rknn-toolkit-lite2. Not a standalone tool; it ships inside rknn-toolkit2 under rknn-toolkit2/rknn_toolkit_lite2. It provides Python-side model inference only (no conversion), through the RKNNLite class.
③ rknpu2: project at https://github.com/rockchip-linux/rknpu2.git. Provides the C SDK for RKNN development; it runs on the edge device.
④ rknn_server: the service process on the box that receives connected-board inference requests. It starts by default when the box boots. If a version mismatch shows up during connected-board inference, copy the matching shared library from rknpu2, e.g. rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so.
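To make the two inference modes of ① concrete, a minimal sketch (paths are placeholders; note that simulator inference requires loading and building the original model, while a prebuilt .rknn can only run connected-board):

from rknn.api import RKNN

rknn = RKNN()

# Simulator mode: load the original model, build it, then init with target=None.
rknn.load_onnx(model="models/yolov5s-det/13978/best.onnx")
rknn.build(do_quantization=False)
rknn.init_runtime(target=None)  # runs inside the rknn-toolkit2 simulator

# Connected-board mode: with target set, inference runs on the box over USB
# via rknn_server.
# rknn.init_runtime(target='rk3588')

# outputs = rknn.inference(inputs=[img])  # img: 640x640 RGB HWC uint8 array
rknn.release()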
3 模型轉(zhuǎn)換
3.1 pth轉(zhuǎn)ONNX
3.1.1 yolov5-det轉(zhuǎn)onnx
yolov5官網(wǎng)提供了export.py腳本,環(huán)境配置好后蒲稳,執(zhí)行該腳本即可。
python3 export.py --weights ../models/yolov5s-det/13978/best.pt --include onnx --opset 12 --simplify
Note: RKNN requires opset=12.
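To double-check the exported file, a small sketch using the onnx package (assuming it is installed) to print the opset:

import onnx

# Confirm the exported model really uses opset 12 as RKNN requires.
model = onnx.load("../models/yolov5s-det/13978/best.onnx")
print([(op.domain, op.version) for op in model.opset_import])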
3.1.2 yolov8-det to ONNX
References:
- https://gitcode.net/magic_ll/yolov8-seg-to-rk3566
- https://blog.csdn.net/weixin_43999691/article/details/133307741
Set up a yolov8 development environment. Since some yolov8 code will be modified, it is best to clone the yolov8 source and build a dedicated environment for it.
yolov8's official YOLO class wraps everything, so a direct call performs the conversion [version used here: Ultralytics YOLOv8.0.200]
from ultralytics import YOLO

if __name__ == "__main__":
    src_pt_model = "../models/yolov8-det/13981/best.pt"
    dst_onnx_model = "../models/yolov8-det/13981/best.onnx"  # export writes this next to the weights
    # for the seg model, use these paths instead:
    # src_pt_model = "../models/yolov8-seg/13931/best.pt"
    # dst_onnx_model = "../models/yolov8-seg/13931/best.onnx"

    # Load a custom trained model
    model = YOLO(src_pt_model)
    # Export the model
    model.export(format='onnx')
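As a quick sanity check of the export (optional; assumes onnxruntime is installed), print the model's input and output names and shapes — the default det graph exposes a single output0:

import onnxruntime as ort

# Inspect the exported graph's I/O.
sess = ort.InferenceSession("../models/yolov8-det/13981/best.onnx",
                            providers=["CPUExecutionProvider"])
print([(i.name, i.shape) for i in sess.get_inputs()])
print([(o.name, o.shape) for o in sess.get_outputs()])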
3.2 ONNX to RKNN
3.2.1 yolov5 ONNX to RKNN [run on the server]
The RKNN toolkit is rknn-toolkit2 (https://github.com/rockchip-linux/rknn-toolkit2), introduced in section 2.
Since the board is an RK3588, install rknn-toolkit2:
pip3 install rknn_toolkit2==1.5.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
Installing directly with pip3 fails with "ERROR: No matching distribution found for rknn_toolkit2==1.5.2", so download the release from GitHub instead (about 356 MB) and install it: the whl installer is obtained after extracting the archive. The package only ships wheels for Python 3.6/3.8/3.10.
tar zxvf rknn-toolkit2-1.5.2.tar.gz
cd rknn-toolkit2-1.5.2/packages
pip3 install --no-dependencies rknn_toolkit2-1.5.2+b642f30c-cp38-cp38-linux_x86_64.whl -U -i https://pypi.tuna.tsinghua.edu.cn/simple # skip dependencies, otherwise many packages get installed and the install errors out; add them back later as needed
pip3 install onnx onnxruntime onnxoptimizer onnxsim ruamel_yaml -i https://pypi.tuna.tsinghua.edu.cn/simple
To install directly on the board instead, use rknn-toolkit-lite2: locate rknn_toolkit_lite2-1.5.2-cp38-cp38-linux_aarch64.whl under rknn-toolkit2-1.5.2/rknn_toolkit_lite2/packages and install it.
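A quick import check that the wheel matches your interpreter (the distribution name is assumed to be rknn-toolkit2; adjust if pip lists it differently):

from importlib.metadata import version
from rknn.api import RKNN  # fails if the whl does not match your Python/arch

print(version("rknn-toolkit2"))  # assumed distribution name
rknn = RKNN()
rknn.release()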
Reference:
- Python scripts performing object detection using the YOLOv8 model in ONNX:https://github.com/ibaiGorordo/ONNX-YOLOv8-Object-Detection
環(huán)境搭建好之后,調(diào)用RKNN的API轉(zhuǎn)換即可挂捻。下面的代碼轉(zhuǎn)換為yolov5 onnx轉(zhuǎn)rknn示例碉纺。RKNN模型默認(rèn)精度是fp16。
import os
from rknn.api import RKNN

if __name__ == '__main__':
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    src_onnx_model = "models/yolov5s-det/13978/best.onnx"
    dst_rknn_model = "models/yolov5s-det/13978/best_origin_3_output.rknn"

    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(src_onnx_model):
        print('model not exist')
        exit(-1)

    # pre-process config
    print('--> Config model')
    rknn.config(
        mean_values=[[0, 0, 0]],
        std_values=[[255, 255, 255]],
        target_platform='rk3588',
        quantized_method='channel',  # or 'layer'
        optimization_level=1  # 0 1 2 3
    )
    print('done')

    # Load ONNX model
    print('--> Loading model')
    ret = rknn.load_onnx(
        model=src_onnx_model,
        outputs=["/model.24/m.0/Conv_output_0", "/model.24/m.1/Conv_output_0", "/model.24/m.2/Conv_output_0"])
    if ret != 0:
        print('Load yolov5 failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(
        do_quantization=False,
        dataset=dataset,
        rknn_batch_size=1
    )
    if ret != 0:
        print('Build yolov5 failed!')
        exit(ret)
    print('done')

    # Export RKNN model
    print('--> Export RKNN model')
    ret = rknn.export_rknn(dst_rknn_model)
    if ret != 0:
        print('Export yolov5 rknn failed!')
        exit(ret)
    print('done')

    ret = rknn.accuracy_analysis(inputs=[img_path])
    if ret != 0:
        print('Accuracy analysis failed!')
    print(ret)
    print('done')
    rknn.release()
- The dataset.txt file simply lists the paths of the images used for quantization, e.g.:
/opt/data/code/yolov5-research/data/quantify_data/0/fa4f0905d4f1f68c07f2e0f3fc26ab39.jpg
/opt/data/code/yolov5-research/data/quantify_data/0/9dc8c98f67484b5fa921ee392fc68bc3.jpg
/opt/data/code/yolov5-research/data/quantify_data/0/8a047ffcc0c61a15632a54c73fe7f70e.jpg
/opt/data/code/yolov5-research/data/quantify_data/0/02f66416341f3224cb0123508ab98f1a.jpg
/opt/data/code/yolov5-research/data/quantify_data/0/eb6ac6aca0b14e9f766c7122d0d81fd1.jpg
/opt/data/code/yolov5-research/data/quantify_data/0/e4d231eb6464cd0f103bf72a06806264.jpg
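A hypothetical helper for producing dataset.txt from a folder of calibration images (the directory path is an example taken from the list above):

import glob
import os

img_dir = "/opt/data/code/yolov5-research/data/quantify_data/0"
with open("data/dataset.txt", "w") as f:
    for p in sorted(glob.glob(os.path.join(img_dir, "*.jpg"))):
        f.write(os.path.abspath(p) + "\n")  # one absolute image path per line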
3.2.2 yolov8 ONNX to RKNN [run on the server]
The yolov8-det to RKNN flow is the same as for yolov5.
4 Model Quantization
4.1 yolov5-det quantization
4.1.1 Plain quantization
Example code for plain int8 quantization of yolov5:
import os
from rknn.api import RKNN

if __name__ == '__main__':
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    src_onnx_model = "models/yolov5s-det/13978/best.onnx"
    dst_rknn_model = "models/yolov5s-det/13978/best_int8_3_output.rknn"

    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(src_onnx_model):
        print('model not exist')
        exit(-1)

    # pre-process config
    print('--> Config model')
    rknn.config(
        mean_values=[[0, 0, 0]],
        std_values=[[255, 255, 255]],
        target_platform='rk3588',
        quantized_dtype="asymmetric_quantized-8",
        # quantized_algorithm="mmse",  # or "normal"
        quantized_method='channel',  # or 'layer'
        optimization_level=1  # 0 1 2 3
    )
    print('done')

    # Load ONNX model
    print('--> Loading model')
    ret = rknn.load_onnx(
        model=src_onnx_model,
        outputs=["/model.24/m.0/Conv_output_0", "/model.24/m.1/Conv_output_0", "/model.24/m.2/Conv_output_0"])
    if ret != 0:
        print('Load yolov5 failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(do_quantization=True, dataset=dataset, rknn_batch_size=1)
    if ret != 0:
        print('Build yolov5 failed!')
        exit(ret)
    print('done')

    # Export RKNN model
    print(f'--> Export RKNN model to {dst_rknn_model}')
    # ret = rknn.export_rknn(dst_rknn_model, cpp_gen_cfg=True, target='rk3588')
    ret = rknn.export_rknn(dst_rknn_model)
    if ret != 0:
        print('Export yolov5 rknn failed!')
        exit(ret)
    print('done')

    ret = rknn.accuracy_analysis(inputs=[img_path], target='rk3588')
    if ret != 0:
        print('Accuracy analysis failed!')
    print(ret)
    print('done')
    rknn.release()
關(guān)鍵點(diǎn):
- rknn.config中設(shè)置量化的相關(guān)參數(shù)疫赎,可以參考《Rockchip_User_Guide_RKNN_Toolkit2_CN-1.5.2.pdf》
- rknn.load_onnx盛撑,這里要指定outputs,用于截掉網(wǎng)絡(luò)中無(wú)法使用sigmoid的層捧搞。如下圖所示抵卫,紅色框去除,解碼綠色框部分胎撇。
- rknn.build介粘,編譯模型,啟用量化晚树,設(shè)置batch_size姻采。
- rknn.export_rknn,導(dǎo)出rknn模型爵憎。
- rknn.accuracy_analysis慨亲,對(duì)模型進(jìn)行性能分析。
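To locate the three Conv output names without clicking through Netron, a small sketch that scans the ONNX graph (the node-name pattern follows the yolov5 export above; verify it for your own model):

import onnx

# Print the detection-head Conv outputs to pass to rknn.load_onnx(outputs=...).
model = onnx.load("models/yolov5s-det/13978/best.onnx")
for node in model.graph.node:
    if node.op_type == "Conv" and "/model.24/m." in node.name:
        print(node.name, "->", node.output[0])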
4.1.2 Hybrid quantization
Hybrid quantization reverts the parts of the model that lose the most accuracy under int8 back to fp16 instead of quantizing them. It takes two steps; see 《Rockchip_User_Guide_RKNN_Toolkit2_CN-1.5.2.pdf》 for details.
Step 1: hybrid_quantization_step1; yolov5-det example code:
import os
from rknn.api import RKNN

if __name__ == "__main__":
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    src_onnx_model = "models/yolov5s-det/13978/best.onnx"

    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(src_onnx_model):
        print('model not exist')
        exit(-1)

    # pre-process config
    print('--> Config model')
    rknn.config(
        mean_values=[[0, 0, 0]],
        std_values=[[255, 255, 255]],
        target_platform='rk3588'
    )
    print('done')

    # Load ONNX model
    print('--> Loading model')
    ret = rknn.load_onnx(
        model=src_onnx_model,
        outputs=["/model.24/m.0/Conv_output_0", "/model.24/m.1/Conv_output_0", "/model.24/m.2/Conv_output_0"])
    if ret != 0:
        print('Load yolov5 failed!')
        exit(ret)
    print('done')

    # Build model (step 1 emits best.model / best.data / best.quantization.cfg)
    rknn.hybrid_quantization_step1(
        dataset=dataset,
        rknn_batch_size=1,
        proposal=True,
        proposal_dataset_size=1
    )
    rknn.release()
Step 2: hybrid_quantization_step2 takes step 1's output files as its input. The best.quantization.cfg generated by step 1 declares which nodes to revert to fp16. If step 1 used proposal=False, you must edit it by hand; with proposal=True the nodes to change are proposed automatically. yolov5-det example code:
from rknn.api import RKNN

if __name__ == "__main__":
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    dst_rknn_model = "models/yolov5s-det/13978/best_hybrid8_3_output.rknn"

    # Create RKNN object
    rknn = RKNN()

    # hybrid_quantization_step2: all inputs are files produced by step 1
    rknn.hybrid_quantization_step2(
        model_input="best.model",
        data_input="best.data",
        model_quantization_cfg="best.quantization.cfg"
    )

    # Export RKNN model
    print('--> Export RKNN model')
    ret = rknn.export_rknn(dst_rknn_model, target='rk3588')
    if ret != 0:
        print('Export yolov5 rknn failed!')
        exit(ret)
    print('done')

    rknn.accuracy_analysis(
        inputs=[img_path],
        target='rk3588',
    )
    rknn.release()
4.1.3 Comparison before and after quantization
Note: rknn models can also be inspected with Netron, but a fairly recent version is required. Below is a comparison of the yolov5 unquantized, int8, and hybrid-precision models.
4.1.4 Problems encountered during quantization
Problem 1: why did yolov5 inference get slower after int8 quantization?
The output dimensions differ before and after quantization: the batch size became 32, because I had set the batch size to 32 when quantizing on the server. Setting the batch size back to 1 fixes it.
ret = rknn.build(do_quantization=True, dataset=dataset, rknn_batch_size=1)
問(wèn)題2:按照yolov5的前處理進(jìn)行時(shí),推理結(jié)果異常
由于rknn默認(rèn)使用的是data_format="nhwc"伶氢,而正常yolov5使用的順序是data_format="nchw"趟径,因此可以在前處理部分修改,也可以在inference調(diào)用是顯示指定data_format="nchw"癣防。但是在rk3588上使用data_format="nchw"時(shí)蜗巧,提示不支持nchw,所以建議還是使用默認(rèn)的“nhwc”順序蕾盯。
rk_result = self._rk_session.inference(inputs=[img], data_format="nhwc")
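For reference, a short sketch of the two layouts (standard OpenCV/numpy calls only):

import cv2
import numpy as np

img = cv2.cvtColor(cv2.imread("data/car.jpg"), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))        # HWC uint8 -> pair with data_format="nhwc"
img_nchw = np.transpose(img, (2, 0, 1))  # CHW -> would pair with data_format="nchw" (unsupported here)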
問(wèn)題3:模型量化之后幕屹,后處理該如何做?
直接按照未量化前的數(shù)據(jù)格式輸入模式级遭,int8量化模型推理的置信度會(huì)全為0望拖;
參考:
- https://blog.csdn.net/wave789/article/details/132446886
- https://github.com/airockchip/rknn_model_zoo
- rknn的后處理代碼分析: https://zhuanlan.zhihu.com/p/648638273
- RKNPU2 從入門(mén)到實(shí)踐(基于RK3588和RK3568): https://www.bilibili.com/video/BV1Kj411D78q/?p=5&vd_source=9138e2a910cf9bbb083cd42a6750ed10
- https://github.com/wangqiqi/yolov5/tree/rknn_dev
- https://github.com/airockchip/yolov5/tree/master
- https://www.51openlab.com/article/257/
解決方法:yolov5 3個(gè)卷積之后有sigmoid操作,如果這個(gè)時(shí)候使用int8量化挫鸽,則輸出的結(jié)果都是0说敏。解決方法加載onnx模型時(shí)提取卷積之后的結(jié)果,然后手動(dòng)寫(xiě)后處理后處理操作掠兄。
# Load ONNX model, stopping at the three conv outputs
ret = rknn.load_onnx(
    model=src_onnx_model,
    outputs=["/model.24/m.0/Conv_output_0", "/model.24/m.1/Conv_output_0", "/model.24/m.2/Conv_output_0"])
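The first step of the hand-written post-processing is then simply applying the sigmoid to the three raw feature maps, before the usual grid/anchor decode and NMS; a minimal sketch:

import numpy as np

def sigmoid(x):
    # plain elementwise sigmoid; the three maps are small enough for this
    return 1.0 / (1.0 + np.exp(-x))

# outputs = rknn.inference(inputs=[img])   # three maps, e.g. (1, 255, 80, 80), ...
# feats = [sigmoid(o) for o in outputs]    # then decode boxes + NMS as in yolov5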
問(wèn)題4:yolov5 seg直接取output0和output1的輸出進(jìn)行后處理之后,可以獲得正確的結(jié)果锌雀,但是取/model.22/Mul_2_output_0蚂夕、/model.22/Split_output_1、/model.22/Concat_output_0腋逆、/model.22/proto/cv3/conv/Conv_output_0推理的結(jié)果存在截?cái)嗟那闆r婿牍?
<img src="https://upload-images.jianshu.io/upload_images/1700062-cc11052edfe858e1.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="輪廓被截?cái)? style="zoom:50%;" />
原因分析,繪制出bbox惩歉,發(fā)現(xiàn)在后處理過(guò)程中等脂,bbox縮放錯(cuò)誤俏蛮,導(dǎo)致了分割區(qū)域出現(xiàn)被切分的現(xiàn)象。
問(wèn)題5:yolov5 seg直接轉(zhuǎn)rknn上遥,使用默認(rèn)的2個(gè)輸出搏屑,不進(jìn)行量化,但是推理仍然無(wú)結(jié)果
原因分析:yolov5 seg訓(xùn)練時(shí)使用的參數(shù)是768x768粉楚,而使用export.py轉(zhuǎn)onnx時(shí)辣恋,默認(rèn)是640x640,雖然可以直接轉(zhuǎn)rknn模型模软,但是推理無(wú)結(jié)果伟骨。解決方法是在導(dǎo)出腳本上添加模型的輸入尺寸。
python3 export.py --weights ../models/yolov5s-seg/13933/best.pt --include onnx --opset 12 --simplify --img-size 768 768
4.2 yolov8-det quantization
4.2.1 Plain quantization
If the model is not modified and the default output is exported, the sigmoid leaves no usable results after quantization; for the quantized model to work, you must export either 2 or 6 output nodes.
Exporting the 2-node RKNN model (no model-code changes needed, just export):
import os
from rknn.api import RKNN

if __name__ == '__main__':
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    src_onnx_model = "models/yolov8s-det/13980/best_normal.onnx"
    dst_rknn_model = "models/yolov8s-det/13980/best_2_output_int8.rknn"

    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(src_onnx_model):
        print('model not exist')
        exit(-1)

    # pre-process config
    print('--> Config model')
    rknn.config(
        mean_values=[[0, 0, 0]],
        std_values=[[255, 255, 255]],
        target_platform='rk3588',
        quantized_dtype="asymmetric_quantized-8",
        quantized_method='channel',  # or 'layer'
        optimization_level=1  # 0 1 2 3
    )
    print('done')

    # Load ONNX model
    print('--> Loading model')
    ret = rknn.load_onnx(
        model=src_onnx_model,
        outputs=['/model.22/Mul_2_output_0', '/model.22/Split_output_1']
    )
    if ret != 0:
        print('Load yolov8 failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(do_quantization=True, dataset=dataset, rknn_batch_size=1)
    if ret != 0:
        print('Build yolov8 failed!')
        exit(ret)
    print('done')

    # Export RKNN model
    print(f'--> Export RKNN model {dst_rknn_model}')
    ret = rknn.export_rknn(dst_rknn_model)
    if ret != 0:
        print('Export yolov8 rknn failed!')
        exit(ret)
    print('done')

    ret = rknn.accuracy_analysis(inputs=[img_path])
    if ret != 0:
        print('Accuracy analysis failed!')
    print(ret)
    print('done')
導(dǎo)出6個(gè)輸出的模型鳄逾,需要先修改模型代碼,再導(dǎo)出onnx灵莲,再轉(zhuǎn)換雕凹。
修改1:ultralytics/nn/modules/head.py,在class Detect新增“導(dǎo)出onnx增加”下的代碼政冻。同時(shí)把forward中的函數(shù)替換成下面的forward枚抵。
class Detect(nn.Module):
    """YOLOv8 Detect head for detection models."""
    dynamic = False  # force grid reconstruction
    export = False  # export mode
    shape = None
    anchors = torch.empty(0)  # init
    strides = torch.empty(0)  # init

    # added for ONNX export: fixed 1x1 conv implementing the DFL expectation
    conv1x1 = nn.Conv2d(16, 1, 1, bias=False).requires_grad_(False)
    x = torch.arange(16, dtype=torch.float)
    conv1x1.weight.data[:] = nn.Parameter(x.view(1, 16, 1, 1))

    def forward(self, x):
        y = []
        for i in range(self.nl):
            t1 = self.cv2[i](x[i])  # box (DFL) branch
            t2 = self.cv3[i](x[i])  # class branch
            y.append(self.conv1x1(t1.view(t1.shape[0], 4, 16, -1).transpose(2, 1).softmax(1)))
            # y.append(t2.sigmoid())
            y.append(t2)  # raw logits; sigmoid is applied in post-processing
        return y
Modification 2: in ultralytics/engine/exporter.py, replace output_names in the export_onnx() function as follows:
# output_names = ['output0', 'output1'] if isinstance(self.model, SegmentationModel) else ['output0']
output_names = ['reg1', 'cls1', 'reg2', 'cls2', 'reg3', 'cls3']
導(dǎo)出onnx
import os
from ultralytics import YOLO

if __name__ == "__main__":
    input_height, input_width = 640, 640
    src_pt_model = "models/yolov8s-det/13980/best.pt"

    # Load a custom trained model
    model = YOLO(src_pt_model)

    # Export the model
    model.export(
        format='onnx',
        imgsz=[input_height, input_width],
        opset=12,
        verbose=True,
        simplify=True
    )

    src_onnx_model = "models/yolov8s-det/13980/best.onnx"
    dst_onnx_model = "models/yolov8s-det/13980/best_6_output.onnx"
    os.rename(src_onnx_model, dst_onnx_model)
    print(f"rename {src_onnx_model} to {dst_onnx_model}")
轉(zhuǎn)換為6個(gè)輸出的RKNN模型
import os
from rknn.api import RKNN

if __name__ == '__main__':
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    src_onnx_model = "models/yolov8s-det/13980/best_6_output.onnx"
    dst_rknn_model = "models/yolov8s-det/13980/best_6_output_int8.rknn"

    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(src_onnx_model):
        print('model not exist')
        exit(-1)

    # pre-process config
    print('--> Config model')
    rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform='rk3588')
    print('done')

    # Load ONNX model
    print('--> Loading model')
    ret = rknn.load_onnx(
        model=src_onnx_model,
        outputs=['reg1', 'cls1', 'reg2', 'cls2', 'reg3', 'cls3']
    )
    if ret != 0:
        print('Load yolov8 failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(
        do_quantization=True,
        dataset=dataset,
        rknn_batch_size=1
    )
    if ret != 0:
        print('Build yolov8 failed!')
        exit(ret)
    print('done')

    # Export RKNN model
    print(f'--> Export RKNN model to {dst_rknn_model}')
    ret = rknn.export_rknn(dst_rknn_model)
    if ret != 0:
        print('Export yolov8 rknn failed!')
        exit(ret)
    print('done')

    ret = rknn.accuracy_analysis(inputs=[img_path], target="rk3588")
    if ret != 0:
        print('Accuracy analysis failed!')
    print(ret)
    print('done')
4.2.2 Hybrid-precision quantization
Hybrid quantization, step 1:
import os
import shutil
from rknn.api import RKNN

if __name__ == "__main__":
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    output_node_count = 2
    if output_node_count == 2:
        src_onnx_model = "models/yolov8s-det/13980/best_normal.onnx"
        tmp_model_dir = "models/yolov8s-det/13980/best_2_output_hybrid"
        os.makedirs(tmp_model_dir, exist_ok=True)
    elif output_node_count == 6:
        # requires the source changes above to export best_6_output.onnx
        src_onnx_model = "models/yolov8s-det/13980/best_6_output.onnx"
        tmp_model_dir = "models/yolov8s-det/13980/best_6_output_hybrid"
        os.makedirs(tmp_model_dir, exist_ok=True)

    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(src_onnx_model):
        print('model not exist')
        exit(-1)

    # pre-process config
    print('--> Config model')
    rknn.config(
        mean_values=[[0, 0, 0]],
        std_values=[[255, 255, 255]],
        target_platform='rk3588'
    )
    print('done')

    # Load ONNX model
    print('--> Loading model')
    if output_node_count == 2:
        ret = rknn.load_onnx(
            model=src_onnx_model,
            outputs=['/model.22/Mul_2_output_0', '/model.22/Split_output_1']
        )
        if ret != 0:
            print('Load yolov8 failed!')
            exit(ret)
    elif output_node_count == 6:
        ret = rknn.load_onnx(
            model=src_onnx_model,
            outputs=['reg1', 'cls1', 'reg2', 'cls2', 'reg3', 'cls3']
        )
        if ret != 0:
            print('Load yolov8 failed!')
            exit(ret)
        print('done')
    else:
        raise Exception(f"invalid output_node_count = {output_node_count}")
    print(f'output_node_count = {output_node_count}, done')

    # Build model (step 1 emits .model / .data / .quantization.cfg next to the script)
    rknn.hybrid_quantization_step1(
        dataset=dataset,
        rknn_batch_size=1,
        proposal=True,
        proposal_dataset_size=1
    )
    rknn.release()

    # move the step-1 artifacts into the model directory
    file_prefix = os.path.basename(src_onnx_model)[0:-5]
    shutil.move(file_prefix + ".model", os.path.join(tmp_model_dir, file_prefix + ".model"))
    shutil.move(file_prefix + ".data", os.path.join(tmp_model_dir, file_prefix + ".data"))
    shutil.move(file_prefix + ".quantization.cfg", os.path.join(tmp_model_dir, file_prefix + ".quantization.cfg"))
    print(f"copy files to {tmp_model_dir}")
Hybrid quantization, step 2:
import os
from rknn.api import RKNN

if __name__ == "__main__":
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    output_node_count = 2
    if output_node_count == 2:
        src_onnx_model = "models/yolov8s-det/13980/best_normal.onnx"
        tmp_model_dir = "models/yolov8s-det/13980/best_2_output_hybrid"
        dst_rknn_model = "models/yolov8s-det/13980/best_2_output_hybrid_custom.rknn"
        os.makedirs(tmp_model_dir, exist_ok=True)
    elif output_node_count == 6:
        # requires the source changes above to export best_6_output.onnx
        src_onnx_model = "models/yolov8s-det/13980/best_6_output.onnx"
        tmp_model_dir = "models/yolov8s-det/13980/best_6_output_hybrid"
        dst_rknn_model = "models/yolov8s-det/13980/best_6_output_hybrid.rknn"
        os.makedirs(tmp_model_dir, exist_ok=True)

    # Create RKNN object
    rknn = RKNN()

    # hybrid_quantization_step2: all inputs are files produced by step 1
    file_prefix = os.path.basename(src_onnx_model)[0:-5]
    rknn.hybrid_quantization_step2(
        model_input=os.path.join(tmp_model_dir, file_prefix + ".model"),
        data_input=os.path.join(tmp_model_dir, file_prefix + ".data"),
        model_quantization_cfg=os.path.join(tmp_model_dir, file_prefix + ".quantization_custom.cfg")
    )

    # Export RKNN model
    print('--> Export RKNN model')
    ret = rknn.export_rknn(dst_rknn_model, target='rk3588')
    if ret != 0:
        print('Export yolov8 rknn failed!')
        exit(ret)
    print('done')

    rknn.accuracy_analysis(
        inputs=[img_path],
        target='rk3588',
    )
    rknn.release()
4.2.3 Problems encountered during quantization
Problem 1: after exporting yolov8 to ONNX and quantizing to int8, why are all confidences 0?
Because sigmoid's output range is (0,1), the values collapse to 0 after int8 quantization, so drop the sigmoid: export the two pre-sigmoid layers instead, '/model.22/Mul_2_output_0' and '/model.22/Split_output_1'.
References:
- yolov8-det conversion: https://blog.csdn.net/wave789/article/details/132446886
- yolov8n_official_onnx_tensorRT_rknn_horizon: https://github.com/cqu20160901/yolov8n_official_onnx_tensorRT_rknn_horizon
- Deploying yolov8 on Rockchip RK3566 in practice: https://zhuanlan.zhihu.com/p/648031088
- yolov8 simulation testing and deployment on Rockchip RKNN and Horizon chips: https://blog.csdn.net/zhangqian_1/article/details/128918268
問(wèn)題2:yolov8混合量化時(shí)報(bào)錯(cuò)?
E hybrid_quantization_step1: Catch exception when building RKNN model!
E hybrid_quantization_step1: Traceback (most recent call last):
E hybrid_quantization_step1: File "rknn/api/rknn_base.py", line 2109, in rknn.api.rknn_base.RKNNBase.hybrid_quantization_step1
E hybrid_quantization_step1: File "rknn/api/quantizer.py", line 265, in rknn.api.quantizer.Quantizer.save_hybrid_cfg
E hybrid_quantization_step1: File "rknn/api/quantizer.py", line 268, in rknn.api.quantizer.Quantizer.save_hybrid_cfg
E hybrid_quantization_step1: File "/opt/data/virtualenvs/yolov8/lib/python3.8/site-packages/ruamel/yaml/main.py", line 1229, in dump
E hybrid_quantization_step1: error_deprecation('dump', 'dump', arg="typ='unsafe', pure=True")
E hybrid_quantization_step1: File "/opt/data/virtualenvs/yolov8/lib/python3.8/site-packages/ruamel/yaml/main.py", line 1017, in error_deprecation
E hybrid_quantization_step1: sys.exit(1)
E hybrid_quantization_step1: SystemExit: 1
Reference: https://github.com/laitathei/YOLOv8-ONNX-RKNN-HORIZON-TensorRT-Segmentation/tree/master
This is a ruamel_yaml version problem: the version installed by default is too new, which makes the run fail; reinstall ruamel_yaml:
pip3 install ruamel_yaml==0.17.40 -i https://pypi.tuna.tsinghua.edu.cn/simple
問(wèn)題3:yolov8 int8或者混合精度量化之后慷蠕,無(wú)論使用1個(gè)輸出珊拼、2個(gè)輸出、6個(gè)輸出砌们,都無(wú)法檢測(cè)結(jié)果杆麸,而fp16正常推理?
原因分析:出現(xiàn)該問(wèn)題主要在預(yù)處理上浪感,yolov8的預(yù)處理默認(rèn)進(jìn)行了歸一化即除掉了255昔头,導(dǎo)出時(shí)使用mean_values=[[0, 0, 0]], std_values=[[1, 1, 1]]若使用fp16時(shí)是沒(méi)問(wèn)題的,但int8時(shí)影兽,先除255歸一化之后揭斧,輸入的值可能已經(jīng)是空了,所以很難推理出結(jié)果峻堰。
解決方法讹开,導(dǎo)出時(shí)使用mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],預(yù)處理中把除255去掉捐名。說(shuō)明旦万,量化之后在前處理時(shí),就不能再Normalize镶蹋。
# image_data = np.array(img) / 255.0
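In other words, preprocessing for the quantized model reduces to resize plus color conversion on raw uint8 data (a sketch; the 1/255 scaling now happens inside the RKNN graph via the mean/std values above):

import cv2

img = cv2.cvtColor(cv2.imread("data/car.jpg"), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))  # keep uint8, no division by 255
# outputs = rknn.inference(inputs=[img], data_format="nhwc")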
問(wèn)題4:yolov8 int8量化之后成艘,2個(gè)輸出時(shí),檢測(cè)框會(huì)發(fā)生偏移贺归,6個(gè)輸出時(shí)淆两,檢測(cè)框偏大,但不偏移拂酣。
原因分析:yolov8 2輸出時(shí)秋冰, int8、fp16婶熬、hybrid轉(zhuǎn)換之后剑勾,使用同一套前后處理代碼,int8赵颅、hybrid之后的模型均存在檢測(cè)框偏移的情況虽另,從而可以推斷是模型推理的問(wèn)題。6個(gè)輸出之所以不存在檢測(cè)結(jié)果偏移的問(wèn)題性含,是因?yàn)?個(gè)輸出后都是基于fp32推理的洲赵≡Ч撸基于此商蕴,如果使用混合精度量化叠萍,模仿6輸出,將6輸出后的層全部設(shè)置為float16绪商,應(yīng)該就可以解決檢測(cè)框偏移的問(wèn)題苛谷。在best_6_output.quantization.cfg文件中添加一下內(nèi)容,在生成混合精度的量化時(shí)格郁,該問(wèn)題解決腹殿。
custom_quantize_layers:
/model.22/cv2.2/cv2.2.0/conv/Conv_output_0: float16
/model.22/cv2.2/cv2.2.0/act/Mul_output_0: float16
/model.22/cv2.2/cv2.2.1/conv/Conv_output_0: float16
/model.22/cv2.2/cv2.2.1/act/Mul_output_0: float16
/model.22/cv2.2/cv2.2.2/Conv_output_0: float16
/model.22/cv3.2/cv3.2.0/conv/Conv_output_0: float16
/model.22/cv3.2/cv3.2.0/act/Mul_output_0: float16
/model.22/cv3.2/cv3.2.1/conv/Conv_output_0: float16
/model.22/cv3.2/cv3.2.1/act/Mul_output_0: float16
/model.22/cv3.2/cv3.2.2/Conv_output_0: float16
/model.22/Concat_2_output_0: float16
/model.22/Reshape_2_output_0_shape4_/model.22/Concat_3: float16
/model.22/cv2.1/cv2.1.0/conv/Conv_output_0: float16
/model.22/cv2.1/cv2.1.0/act/Mul_output_0: float16
/model.22/cv2.1/cv2.1.1/conv/Conv_output_0: float16
/model.22/cv2.1/cv2.1.1/act/Mul_output_0: float16
/model.22/cv2.1/cv2.1.2/Conv_output_0: float16
/model.22/cv3.1/cv3.1.0/conv/Conv_output_0: float16
/model.22/cv3.1/cv3.1.0/act/Mul_output_0: float16
/model.22/cv3.1/cv3.1.1/conv/Conv_output_0: float16
/model.22/cv3.1/cv3.1.1/act/Mul_output_0: float16
/model.22/cv3.1/cv3.1.2/Conv_output_0: float16
/model.22/Concat_1_output_0: float16
/model.22/Reshape_1_output_0_shape4_/model.22/Concat_3: float16
/model.22/cv2.0/cv2.0.0/conv/Conv_output_0: float16
/model.22/cv2.0/cv2.0.0/act/Mul_output_0: float16
/model.22/cv2.0/cv2.0.1/conv/Conv_output_0: float16
/model.22/cv2.0/cv2.0.1/act/Mul_output_0: float16
/model.22/cv2.0/cv2.0.2/Conv_output_0: float16
/model.22/cv3.0/cv3.0.0/conv/Conv_output_0: float16
/model.22/cv3.0/cv3.0.0/act/Mul_output_0: float16
/model.22/cv3.0/cv3.0.1/conv/Conv_output_0: float16
/model.22/cv3.0/cv3.0.1/act/Mul_output_0: float16
/model.22/cv3.0/cv3.0.2/Conv_output_0: float16
/model.22/Concat_output_0: float16
/model.22/Reshape_output_0_shape4_/model.22/Concat_3: float16
/model.22/Concat_3_output_0: float16
/model.22/Split_output_0_shape4: float16
/model.22/Split_output_1_shape4: float16
/model.22/dfl/Reshape_output_0: float16
/model.22/dfl/Softmax_pre_tp: float16
/model.22/dfl/Softmax_new: float16
/model.22/dfl/conv/Conv_output_0: float16
/model.22/dfl/Reshape_1_output_0_shape4_/model.22/Slice_2split: float16
/model.22/dfl/Reshape_1_output_0_shape4_/model.22/Slice_2split_conv_/model.22/Slice_2split: float16
/model.22/Slice_output_0_shape4_before_conv: float16
/model.22/Slice_1_output_0: float16
/model.22/Slice_output_0: float16
/model.22/Sub_output_0: float16
/model.22/Add_1_output_0: float16
/model.22/Sub_1_output_0: float16
/model.22/Concat_4_swap_concat_reshape_i1_out: float16
/model.22/Add_2_output_0: float16
/model.22/Div_1_output_0: float16
/model.22/Concat_4_swap_concat_reshape_i0_out: float16
/model.22/Concat_4_output_0_shape4: float16
/model.22/Mul_2_output_0_shape4_before: float16
/model.22/Mul_2_output_0: float16
/model.22/Split_output_1: float16
說(shuō)明量化之后模型的檢測(cè)性能存在不穩(wěn)定性,需要仔細(xì)分析例书,慎重量化锣尉。
4.2.4 yolov8-det quantization summary
Main conclusions:
- Default single output (output0): no inference results after int8 quantization; no changes to the official source needed;
- 2 outputs ('/model.22/Mul_2_output_0', '/model.22/Split_output_1'): inference results after int8 quantization; no changes to the official source needed;
- 6 outputs ('reg1', 'cls1', 'reg2', 'cls2', 'reg3', 'cls3'): inference results after int8 quantization; the official source must be modified;
- Export the rknn with mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]] and skip normalization in preprocessing, otherwise the quantized model cannot infer;
- Exporting the 6-output ONNX requires modifying the yolov8 source; int8 infers correctly only with the 2- or 6-output exports;
- After quantization, the 2-output model's detection boxes drift toward the upper left;
- After quantization, the 6-output model's detection boxes are oversized but do not drift;
- Before quantization, detection boxes are correct with either 2 or 6 outputs;
- Recommendation: prefer the 6-output quantized model, or customize the quantization config of the 2-output model and produce a hybrid-precision build.
4.3 yolov8-seg quantization
yolov8-seg outputs two nodes by default, output0 and output1: output0 carries the detection results plus part of the segmentation output, output1 the remaining segmentation output. Extract /model.22/Mul_2_output_0, /model.22/Split_output_1 and /model.22/Concat_output_0 from the output0 branch, and /model.22/proto/cv3/conv/Conv_output_0 from the layer above output1, forming a 4-node output that avoids the no-detections problem the sigmoid causes after quantization. A conversion sketch follows the list below.
- 2-node output: 'output0', 'output1' — no inference results after quantization.
- 4-node output: '/model.22/Mul_2_output_0', '/model.22/Split_output_1', '/model.22/Concat_output_0', '/model.22/proto/cv3/conv/Conv_output_0' — inference works after quantization.
5 Model Deployment
5.1 Deploying on the board
Reference: yolov5 — train a pt model, convert it to rknn, and deploy on an RK3588 board, the whole pipeline from training to deployment: https://blog.csdn.net/m0_46825740/article/details/128818516
Download rknpu2: https://github.com/rockchip-linux/rknpu2
The commands below build the official demo code on the edge device and then run inference.
git clone https://github.com/rockchip-linux/rknpu2.git
- examples/rknn_yolov5_demo => #define OBJ_CLASS_NUM 3 # change to your number of classes
- coco_80_labels_list.txt # replace with your own labels
./build-linux_RK3588.sh
./rknn_yolov5_demo ./model/RK3588/yolov5s-640-640.rknn ./model/bus.jpg
./rknn_yolov5_video_demo ./model/RK3588/yolov5s-640-640.rknn 28ab8ba8a51a8e46eabc58575b6c208e.mp4 264
./rknn_yolov5_video_demo ./model/RK3588/yolov5s-640-640.rknn vlc-record-2023-09-22-11h14m49s.mp4 264
5.2 問(wèn)題
問(wèn)題1:固件中rknn_server版本較低
解決方法:更新rknn_server【開(kāi)發(fā)模式】
cp /root/rknpu2/runtime/RK3588/Linux/rknn_server/aarch64/usr/bin/restart_rknn.sh /usr/bin
cp /root/rknpu2/runtime/RK3588/Linux/rknn_server/aarch64/usr/bin/start_rknn.sh /usr/bin
cp /root/rknpu2/runtime/RK3588/Linux/rknn_server/aarch64/usr/bin/rknn_server /usr/bin
問(wèn)題2:復(fù)制更新rknn_server后擒贸,執(zhí)行提示庫(kù)的版本不對(duì)
root@firefly:/usr/bin# 1090443 RKNN SERVER loadRuntime(110): dlsym rknn_set_input_shapes failed: /lib/librknnrt.so: undefined symbol: rknn_set_input_shapes, reuqired librknnrt.so >= 1.5.2!
E RKNN: [07:43:34.070] 6, 4
E RKNN: [07:43:34.070] Invalid RKNN model version 6
E RKNN: [07:43:34.070] rknn_init, load model failed!
1090444 RKNN SERVER init(183): rknn_init fail! ret=-6
1090444 RKNN SERVER process_msg_init(381): Client 1 init model fail!
解決方法:復(fù)制對(duì)應(yīng)的庫(kù),然后重啟rknn_server即可
cp /root/rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so /lib/
restart_rknn.sh
問(wèn)題3:E RKNN: [12:54:27.331] Mismatch driver version, librknnrt version: 1.5.2 (c6b7b351a@2023-08-23T15:28:22) requires driver version >= 0.7.0, but you have driver version: 0.6.4 which is incompatible!
原因及解決方法:出現(xiàn)這個(gè)錯(cuò)誤觉渴,說(shuō)明板子的固件版本較低介劫,在官網(wǎng)下載對(duì)應(yīng)的固件,并重新燒寫(xiě)即可案淋。
6 Model Inference
6.1 Inference with the Python rknn API [server side]
Load the converted rknn model and run inference. The key point is rknn.init_runtime(target='rk3588', core_mask=RKNN.NPU_CORE_AUTO): with target set to the target device model, inference runs connected-board, i.e. on the edge device through rknn_server; with target=None, inference runs in the simulator.
import os
import multiprocessing

import cv2
import tqdm
from rknn.api import RKNN


def process(rknn_model, img_path):
    # Create RKNN object
    rknn = RKNN()
    if not os.path.exists(rknn_model):
        print('model not exist')
        exit(-1)
    rknn.load_rknn(rknn_model)
    rknn.init_runtime(target='rk3588', core_mask=RKNN.NPU_CORE_AUTO)

    img = cv2.imread(filename=img_path)
    img = cv2.cvtColor(src=img, code=cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (640, 640))
    for item in tqdm.trange(10):
        result = rknn.inference(inputs=[img], data_format="nhwc")
        # print(result[0].shape)
    rknn.release()


def main():
    img_path = "data/car.jpg"
    dataset = "data/dataset.txt"
    rknn_model = "models/yolov5s-det/13978/best_hybrid8_3_output.rknn"
    p_count = 1
    p_list = list()
    for _ in range(p_count):
        p = multiprocessing.Process(target=process, args=(rknn_model, img_path))
        p_list.append(p)
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()
    print("process over")


if __name__ == '__main__':
    main()
6.2 Inference with Python rknnlite [on the edge device]
Test example written following rknn_toolkit_lite2/examples/inference_with_lite/test.py.
import os
import time
import platform

import cv2
import tqdm
import numpy as np
from rknnlite.api import RKNNLite


def get_host():
    device_compatible_node = '/proc/device-tree/compatible'
    # get platform and device type
    system = platform.system()
    machine = platform.machine()
    os_machine = system + '-' + machine
    if os_machine == 'Linux-aarch64':
        try:
            with open(device_compatible_node) as f:
                device_compatible_str = f.read()
                if 'rk3588' in device_compatible_str:
                    host = 'RK3588'
                else:
                    host = 'RK356x'
        except IOError:
            print('Read device node {} failed.'.format(device_compatible_node))
            exit(-1)
    else:
        host = os_machine
    return host


if __name__ == "__main__":
    host_name = get_host()
    print(host_name)
    # rknn_file = "model/yolov5s-det/13978/best.rknn"  # fp16 model
    rknn_file = "model/yolov5s-det/13978/best_int8.rknn"
    if not os.path.exists(rknn_file):
        print(f"{rknn_file} not exist")
        exit(0)

    rknn = RKNNLite(verbose=False, verbose_file="verbose.log")
    rknn.load_rknn(rknn_file)
    rknn.init_runtime(target=host_name, core_mask=RKNNLite.NPU_CORE_AUTO)

    ori_img = cv2.imread('./data/car.jpg')
    img = cv2.cvtColor(ori_img, cv2.COLOR_BGR2RGB)
    resized_image = cv2.resize(img, (640, 640))

    start_time = time.time()
    for i in tqdm.tqdm(range(100)):
        outputs = rknn.inference(inputs=[resized_image])
        print(len(outputs), outputs[0].shape)
    print(f"cost: {time.time() - start_time}, fps: {100 / (time.time() - start_time):.4f}")
    rknn.release()
6.3 問(wèn)題
問(wèn)題1:AttributeError: rknnlite/api/lib/hardware/DOLPHIN/linux-aarch64/librknn_api.so: undefined symbol: rknn_set_core_mask
解決方法:出現(xiàn)這個(gè)問(wèn)題的原因是因?yàn)閘ibrknn_api.so的庫(kù)是有問(wèn)題的蹈丸,直接將rknpu2庫(kù)中的librknnrt.so 拷貝過(guò)去即可成黄,執(zhí)行命令如下:
# back up the original
cp /opt/virtualenvs/rknn/lib/python3.8/site-packages/rknnlite/api/lib/hardware/DOLPHIN/linux-aarch64/librknn_api.so /opt/virtualenvs/rknn/lib/python3.8/site-packages/rknnlite/api/lib/hardware/DOLPHIN/linux-aarch64/librknn_api.so.bak
# overwrite with the rknpu2 library
cp /root/rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so /opt/virtualenvs/rknn/lib/python3.8/site-packages/rknnlite/api/lib/hardware/DOLPHIN/linux-aarch64/librknn_api.so
問(wèn)題2:W RKNN: [01:59:35.391] Output(output0):size_with_stride larger than model origin size, if need run OutputOperator in NPU, please call rknn_create_memory using size_with_stride.
這個(gè)告警暫時(shí)可以忽略呐芥,可以在代碼中屏蔽告警。參考yolov5的前后處理:https://github.com/Applied-Deep-Learning-Lab/Yolov5_RK3588/blob/main/base/rk3588.py
from hide_warnings import hide_warnings

@hide_warnings
def rk_infer(self, img):
    rk_result = self._rk_session.inference(inputs=[img], data_format="nhwc")
    return rk_result
問(wèn)題3:如何選中使用的NPU奋岁,即如何設(shè)置core_mask贩耐?
self._rk_session.init_runtime(target=self.host_name, core_mask=RKNNLite.NPU_CORE_AUTO)
self._rk_session.init_runtime(target=self.host_name, core_mask=RKNNLite.NPU_CORE_0_1_2)
Solution: RKNNLite.NPU_CORE_AUTO / RKNNLite.NPU_CORE_0_1_2 / RKNNLite.NPU_CORE_0 all select which NPU core(s) to use; in testing, NPU_CORE_AUTO gave the highest NPU utilization.
7 Miscellaneous
A1 Image flashing steps
References
- [ROC-RK3568-PC board hands-on] Flashing the Ubuntu 20.04 system: https://dev.t-firefly.com/thread-124315-1-1.html
- Complete firmware-flashing tutorial for the Rockchip RK3588 board: https://www.163.com/dy/article/HIE3O7VC0538CDS6.html
Step 1: install the driver
- Download DriverAssitant_v5.0.zip, extract it, and run the DriverInstall.exe inside.
Note: when connecting to the board from a Linux host, install adb instead
apt install adb
Step 2: connect the device
- Connect the USB_OTG port to the PC with a Type-C cable, then check Device Manager (the driver installed successfully if the device appears)
Step 3: full firmware [download the matching firmware from https://www.t-firefly.com/doc/download/183.html]
- Download the firmware provided on the network drive, or use one you compiled yourself; updates come either as a single full image or as partition images;
1) Full-image route: extract the official flashing tool RKDevTool_Release_v2.92.zip and open RKDevTool.exe, which auto-detects the ADB device;
2) Click [Switch] to enter loader flashing mode;
3) Click [Firmware], select the firmware to flash, wait for it to load, then click [Upgrade] and wait for the write to complete
Step 4: after flashing, install the necessary software
apt install git cmake g++ -y
apt install lrzsz tree vim htop -y
A2 Running Docker on RK3588
Check the board's environment for running Docker
git clone https://github.com/moby/moby.git
cd moby/contrib
./check-config.sh .config
Reference: https://blog.csdn.net/xiaoning132/article/details/130541520
Install Docker on the board
## remove old versions
sudo apt-get remove docker docker-engine docker.io containerd runc
## 設(shè)置存儲(chǔ)庫(kù)
sudo apt-get update -y
sudo apt-get install -y ca-certificates curl gnupg lsb-release
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
## install Docker
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
## 驗(yàn)證是否安裝成功
sudo docker run hello-world
參考:https://github.com/DHDAXCW/Rk3588-Docker
A3 Pulling streams with GStreamer
References:
- Using GStreamer for push/pull streaming and inference on RK3588: https://blog.csdn.net/xiaoxuanxuan12/article/details/130200378
- A summary of common GStreamer knowledge: https://blog.csdn.net/qq_32188669/article/details/103903032
- All you want, to get started with GStreamer in Python: https://sahilchachra.medium.com/all-you-want-to-get-started-with-gstreamer-in-python-2276d9ed548e
- Jetson TX2/Xavier: reading and displaying a camera with GStreamer+OpenCV: https://blog.csdn.net/yiyayi1/article/details/108793611
Install gstreamer
apt-get update
apt-get install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-doc gstreamer1.0-tools gstreamer1.0-x gstreamer1.0-alsa gstreamer1.0-gl gstreamer1.0-gtk3 gstreamer1.0-qt5 gstreamer1.0-pulseaudio
apt-get install libunwind8-dev
apt-get install libgtk2.0-dev pkg-config
# verify the installation
dpkg -l | grep gstreamer
Test gstreamer from the command line
# hello world
gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink
# Adding a capability to the pipeline
gst-launch-1.0 videotestsrc ! video/x-raw, format=BGR ! autovideoconvert ! ximagesink
# Setting width, height and framerate
gst-launch-1.0 videotestsrc ! video/x-raw, format=BGR ! autovideoconvert ! videoconvert ! video/x-raw, width=640, height=480, framerate=1/2 ! ximagesink
# rtsp
gst-launch-1.0 rtspsrc location=rtsp://172.18.18.202:5554/T_Perimeter_ball001 latency=10 ! queue ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! videoscale ! video/x-raw,width=640,height=480 ! ximagesink
Test gstreamer with OpenCV
Step 1: rebuild OpenCV. OpenCV does not enable GStreamer support by default, so rebuild it first
git clone https://github.com/opencv/opencv.git
cd opencv
mkdir build && cd build
cmake -D WITH_GSTREAMER=ON \
-D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D BUILD_opencv_python2=OFF \
-D BUILD_opencv_python3=ON \
-D PYTHON3_PACKAGES_PATH=/usr/local/lib/python3.8/dist-packages/ \
-D PYTHON3_LIBRARY=/usr/lib/python3.8/config-3.8-aarch64-linux-gnu/libpython3.8.so \
-D OPENCV_GENERATE_PKGCONFIG=YES ..
make -j6 && make install
cd /etc/ld.so.conf.d/ # change directory
touch opencv.conf # create the opencv config file
echo /usr/local/lib/ > opencv.conf # record the path where the rebuilt opencv libraries live
sudo ldconfig
Step 2: test gstreamer with the script below

import cv2

# OpenCV must be rebuilt with WITH_GSTREAMER=ON for this to work.
# Pass only the pipeline description to VideoCapture (no leading "sudo gst-launch-1.0").
gstreamer_str = "rtspsrc location=rtsp://172.18.18.202:5554/T_Perimeter_ball001 latency=1000 ! queue ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! videoscale ! video/x-raw,width=640,height=480,format=BGR ! appsink drop=1"
cap = cv2.VideoCapture(gstreamer_str, cv2.CAP_GSTREAMER)
print(cap.isOpened())
while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        cv2.imshow("Input via Gstreamer", frame)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
遇到的問(wèn)題
問(wèn)題1:未安裝libgtk2.0-dev導(dǎo)致
Traceback (most recent call last):
File "opencv_demo.py", line 14, in <module>
cv2.destroyAllWindows()
cv2.error: OpenCV(4.8.0-dev) /root/code/opencv/modules/highgui/src/window.cpp:1266: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvDestroyAllWindows'
問(wèn)題2:無(wú)法運(yùn)行g(shù)stream,camke之后GStreamer顯示為No
Video I/O:
DC1394: YES (2.2.5)
FFMPEG: YES
avcodec: YES (58.54.100)
avformat: YES (58.29.100)
avutil: YES (56.31.100)
swscale: YES (5.5.100)
avresample: YES (4.0.0)
GStreamer: NO
v4l/v4l2: YES (linux/videodev2.h)
Cause: libunwind8-dev was not installed (apt-get install libunwind8-dev). The snippet below shows whether the OpenCV build supports gstreamer:
import cv2
print(cv2.getBuildInformation())
問(wèn)題3:fail to load module gail
解決方法:出現(xiàn)這個(gè)問(wèn)題的原因是因?yàn)閘ibrknn_api
sudo apt-get install libgail-common
GStreamer Python: https://github.com/GStreamer/gst-python
https://gist.github.com/liviaerxin/9934a5780f5d3fe5402d5986fc32d070
git clone https://github.com/GStreamer/gst-python.git
cd gst-python
GSTREAMER_VERSION=$(gst-launch-1.0 --version | grep version | tr -s ' ' '\n' | tail -1)
git checkout $GSTREAMER_VERSION
PYTHON=/usr/bin/python3.8
LIBPYTHON=$($PYTHON -c 'from distutils import sysconfig; print(sysconfig.get_config_var("LDLIBRARY"))')
LIBPYTHONPATH=$(dirname $(ldconfig -p | grep -w $LIBPYTHON | head -1 | tr ' ' '\n' | grep /))
PREFIX=$(dirname $(dirname $(which python3))) # in jetson nano, `PREFIX=~/.local` to use local site-packages,
LIBPYTHON=libpython3.8.so
LIBPYTHONPATH=/lib/aarch64-linux-gnu
PREFIX=/usr
./autogen.sh --disable-gtk-doc --noconfigure
./configure --with-libpython-dir=$LIBPYTHONPATH --prefix $PREFIX
make
make install
錯(cuò)誤:No package 'pygobject-3.0' found
apt install -y python-gi-dev
錯(cuò)誤:checking for headers required to compile python extensions... not found
configure: error: could not find Python headers
sudo apt-get install python3-dev libpython3-dev
checking for PYGOBJECT... yes
checking for libraries required to embed python... no
configure: error: Python libs not found. Windows requires Python modules to be explicitly linked to libpython