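What is CTPN
CTPN combines a CNN with an LSTM deep network. It is derived from Faster R-CNN and can effectively detect horizontally laid-out text in complex scenes (see Figure 1). It is currently one of the better text detection algorithms. For a detailed explanation, see the linked write-up.
In plain words: this is the groundwork for text recognition. You first locate the text regions in an image, then crop them as needed and pass them on to the actual text recognition step.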
The project for this post
The address is below (don't rush to clone it!):
What the project is supposed to produce
Step 1 -- Clone the project
I cloned the project below, since it ships with trained weight files:
Step 2 -- Run the demo
I ran straight into the following error:
ImportError: cannot import name 'bbox'
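I went back to the issues and, sure enough, someone else had done the same thing as me: started running after reading only half the instructions.
Issue link: https://github.com/eragonruan/text-detection-ctpn/issues/59
Someone further down gave the answer:
I followed it and ran again, but... still no luck.
Another error: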
Invalid argument: ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'
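So I went looking for a solution again and found this one:
Issue link: https://github.com/eragonruan/text-detection-ctpn/issues/59
The solution is quoted below (I applied only part of it; the original author runs on CPU only):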
thanks to the author and [#43](https://github.com/eragonruan/text-detection-ctpn/issues/43) zhao181
my environment is:
windows10 ,
python3.6 ,
tensorflow1.3 ,
vs2015(ps:vs2013 not support python3.6 when compile)
step 1: make some changes
change "np.int_t" to "np.intp_t" in line 25 of the file lib\utils\cython_nms.pyx
otherwise "ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'" appears in step 6.
step 2: update the C files
execute: cd your_dir\text-detection-ctpn-master\lib\utils
execute: cython bbox.pyx
execute: cython cython_nms.pyx
step 3: build a setup file named setup_new.py
# setup_new.py: compile bbox.pyx and cython_nms.pyx into extension modules
import numpy as np
from distutils.core import setup
from Cython.Build import cythonize

# Pass the NumPy header directory so the generated C code can compile against NumPy
numpy_include = np.get_include()
setup(
    ext_modules=cythonize(["bbox.pyx", "cython_nms.pyx"]),
    include_dirs=[numpy_include],
)
step 4:build .pyd file
execute:python setup_new.py install
copy bbox.cp36-win_amd64.pyd and cython_nms.cp36-win_amd64.pyd to your_dir\text-detection-ctpn-master\lib\utils
step 5: make some changes
(1) Set "USE_GPU_NMS" in the file \ctpn\text.yml to "False";
(2) Set "__C.USE_GPU_NMS" in the file \lib\fast_rcnn\config.py to "False";
(3) Comment out the line "from lib.utils.gpu_nms import gpu_nms" in the file \lib\fast_rcnn\nms_wrapper.py;
(4) Comment out the line "from . import gpu_nms" in the file \lib\utils\__init__.py;
(5) change "base_name = image_name.split('/')[-1]" to "base_name = image_name.split('\\')[-1]" in line 24 of the file ctpn\demo.py
step 6:run demo
execute:cd your_dir\text-detection-ctpn-master
execute:python ./ctpn/demo.py
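Before running the demo in step 6, it can help to confirm that the compiled modules from step 4 really ended up in lib\utils. The sketch below is a hypothetical helper, not part of the repository: the function name `find_built_extensions`, the default module names, and the `lib/utils` path are assumptions based on the file names mentioned in the steps above.

```python
import glob
import os


def find_built_extensions(utils_dir, modules=("bbox", "cython_nms")):
    """Return {module_name: [compiled file names]} for one directory.

    Hypothetical helper: it looks for files such as
    bbox.cp36-win_amd64.pyd (Windows) or a matching .so file (Linux/macOS).
    """
    found = {}
    for name in modules:
        matches = []
        for ext in ("pyd", "so"):
            # e.g. "bbox.*pyd" matches both bbox.pyd and bbox.cp36-win_amd64.pyd
            pattern = os.path.join(utils_dir, "{}.*{}".format(name, ext))
            matches.extend(glob.glob(pattern))
        found[name] = sorted(os.path.basename(m) for m in matches)
    return found


if __name__ == "__main__":
    # The path is an assumption; point it at your own checkout.
    report = find_built_extensions(os.path.join("lib", "utils"))
    for name, files in report.items():
        status = ", ".join(files) if files else "MISSING (rebuild with setup_new.py)"
        print("{}: {}".format(name, status))
```

If either module shows up as missing, the "cannot import name 'bbox'" style of ImportError is the likely result, so it is worth repeating steps 2 to 4 first.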
The key change is this part:
step 1: make some changes
change "np.int_t" to "np.intp_t" in line 25 of the file lib\utils\cython_nms.pyx
otherwise "ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'" appears in step 6.
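Why this one-word change matters: in NumPy's Cython declarations, np.int_t is a C long, while np.intp_t is a pointer-sized integer. On 64-bit Windows a C long is 4 bytes but a pointer is 8, so an index array (which NumPy stores as pointer-sized integers) presumably no longer fits a buffer declared as np.int_t, hence the "long long" in the message. The sketch below only uses the standard library to show the two sizes on your machine; the function name `index_type_sizes` is mine.

```python
import ctypes


def index_type_sizes():
    """Return the byte sizes of a C long and a pointer-sized signed integer.

    np.int_t maps to C long and np.intp_t to a pointer-sized integer,
    so comparing these two sizes shows whether the types can differ here.
    """
    return {
        "long (np.int_t)": ctypes.sizeof(ctypes.c_long),
        "ssize_t (np.intp_t)": ctypes.sizeof(ctypes.c_ssize_t),
    }


if __name__ == "__main__":
    sizes = index_type_sizes()
    for name, size in sizes.items():
        print("{}: {} bytes".format(name, size))
    if len(set(sizes.values())) > 1:
        print("Sizes differ: a buffer typed as np.int_t will reject intp indices.")
```

On 64-bit Linux both sizes are normally the same, which would explain why the original code runs there without this change.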
I ran the test once more!
And the result, well...
The error message was:
Loading network VGGnet_test... Restoring from checkpoints/VGGnet_fast_rcnn_iter_50000.ckpt... done
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Demo for E:\business\recognition\text_detection\text-detection-ctpn\data\demo\010.png
Traceback (most recent call last):
File "./ctpn/demo.py", line 101, in <module>
ctpn(sess, net, im_name)
File "./ctpn/demo.py", line 61, in ctpn
draw_boxes(img, image_name, boxes, scale)
File "./ctpn/demo.py", line 26, in draw_boxes
with open('data/results/' + 'res_{}.txt'.format(base_name.split('.')[0]), 'w') as f:
OSError: [Errno 22] Invalid argument: 'data/results/res_E:\\business\\recognition\\text_detection\\text-detection-ctpn\\data\\demo\\010.txt'
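Tracking this one down took me a while and a lot of guessing. In the end I went back and read the error message carefully, and the problem was right there in the last line: an invalid argument! The file name being opened still contains the full Windows path of the image.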
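To see where that strange file name comes from, the small reproduction below rebuilds it from the path in the traceback. It is a sketch that mirrors the two expressions shown in the traceback, not the actual demo.py; the function name `naive_result_path` is made up for illustration.

```python
def naive_result_path(image_name):
    """Mimic the path logic from the traceback: split on '/' only."""
    # A Windows path contains no '/', so the "base name" is the whole path.
    base_name = image_name.split("/")[-1]
    return "data/results/" + "res_{}.txt".format(base_name.split(".")[0])


if __name__ == "__main__":
    # Path copied from the error output.
    image = ("E:\\business\\recognition\\text_detection"
             "\\text-detection-ctpn\\data\\demo\\010.png")
    # Prints a path that still embeds the drive letter and all backslashes.
    print(naive_result_path(image))
```

The drive letter and colon end up in the middle of the file name, which Windows refuses to open.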
So I went back and changed the source:
base_name = image_name.split('\\')[-1]
with open('data\\results\\' + 'res_{}.txt'.format(base_name.split('.')[0]), 'w') as f:
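Hard-coding '\\' fixes Windows but breaks the script everywhere else. A more portable alternative, sketched below, lets the standard library do the path work. This is my own variant rather than what the repository does: the function name `result_txt_path` and the default results directory are assumptions. `ntpath.basename` accepts both '\\' and '/' as separators on any platform.

```python
import ntpath
import os


def result_txt_path(image_name, results_dir=os.path.join("data", "results")):
    """Build the results file path for an image, whatever separator it uses."""
    # ntpath applies Windows rules, so both '\\' and '/' count as separators.
    base_name = ntpath.basename(image_name)
    # splitext drops only the last extension ("a.b.png" -> "a.b"), unlike
    # split('.')[0], which would keep just "a".
    stem = os.path.splitext(base_name)[0]
    return os.path.join(results_dir, "res_{}.txt".format(stem))


if __name__ == "__main__":
    print(result_txt_path("E:\\some\\folder\\010.png"))
    print(result_txt_path("data/demo/010.png"))
```

Both calls yield a file named res_010.txt inside the results directory, joined with whatever separator the current platform uses.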
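All I can say is, the Windows "\" really is...
I ran it again! Persistence pays off, and the output looked like this:
Checking the output directory:
Next up is studying the underlying principles and the source code. That will have to wait for now, since the business work comes first...
One more thing: if this interests you, you can also look into the project below, which provides datasets and more.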