2022, No. 1, Vol. 32, pp. 26-32
一种改进注意力机制与关键区域的文本检测方法 (A Text Detection Method Based on an Improved Attention Mechanism and Key Regions)
基金项目 (Foundation): Natural Science Foundation of Tianjin (18JCYBJC84900)
邮箱(Email): jesuisyyn@126.com;
DOI: 10.19573/j.issn2095-0926.202201005
Abstract:

On the basis of comparing the optical character recognition performance of traditional text detection with that of deep-learning-based text detection, an improved attention connectionist text proposal network (ACTPN) algorithm is proposed. The algorithm uses the information-processing capability of the attention mechanism to enhance the network's extraction of key features; it uses the position features of the container number as a screening criterion to remove redundant candidate boxes, improving the accuracy of container-number region screening; and it adds transfer learning to the training strategy to enhance detection robustness and system reliability. Experimental results show that the improved method greatly increases detection accuracy: even in complex environments, the accuracy of container-number detection reaches 88.83%, and the detection time per image is reduced from 0.60 s to 0.38 s.
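The abstract does not spell out the attention design used in ACTPN. As a rough illustration only, the sketch below shows a squeeze-and-excitation style channel-attention block (a common way to re-weight key feature channels) that could sit after a convolutional stage of a CTPN-style backbone; the module name, reduction ratio, and tensor sizes are assumptions, not the authors' implementation.

```python
# Hypothetical channel-attention block (squeeze-and-excitation style), sketched
# as one possible way to emphasize key feature channels in a CTPN-like backbone.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global spatial average
        self.fc = nn.Sequential(                       # excitation: per-channel weights
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                   # rescale channels by learned weights

# Usage: re-weight a hypothetical 512-channel feature map.
feat = torch.randn(2, 512, 38, 50)
out = ChannelAttention(512)(feat)   # same shape, channels emphasized or suppressed
```

Re-weighting channels this way lets a detector amplify feature maps that respond to text strokes while damping background clutter, which matches the abstract's stated goal of strengthening key-feature extraction.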
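Likewise, the screening rule that uses container-number position features to discard redundant candidate boxes is only described at a high level. The sketch below is a hypothetical geometric filter; the thresholds (min_aspect, top_fraction) and the "elongated box in the upper image region" prior are illustrative, not the authors' actual criteria.

```python
# Hypothetical position/geometry filter for container-number candidate boxes.
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

def filter_number_candidates(boxes: List[Box],
                             img_h: int,
                             min_aspect: float = 3.0,
                             top_fraction: float = 0.6) -> List[Box]:
    """Keep boxes that look like horizontal number strips in the upper part of
    the image (illustrative prior, not the paper's rule)."""
    kept = []
    for x1, y1, x2, y2 in boxes:
        w, h = x2 - x1, y2 - y1
        if w <= 0 or h <= 0:
            continue
        aspect_ok = (w / h) >= min_aspect            # long, thin text line
        position_ok = (y1 / img_h) <= top_fraction   # roughly upper region
        if aspect_ok and position_ok:
            kept.append((x1, y1, x2, y2))
    return kept

# Example: only the elongated upper box survives.
print(filter_number_candidates([(50, 40, 400, 80), (10, 500, 60, 560)], img_h=640))
```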
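Finally, for the transfer-learning part of the training strategy, a typical setup (assumed here, not confirmed by the abstract) is to initialize a CTPN-style VGG16 backbone from ImageNet-pretrained weights and fine-tune it on the container-number images; the frozen-layer split below is illustrative.

```python
# Minimal transfer-learning sketch (assumed setup): reuse an ImageNet-pretrained
# VGG16 backbone, freeze its early generic layers, and fine-tune the rest
# together with the detection head on the container-number dataset.
from torchvision import models

backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
for p in backbone[:10].parameters():      # freeze the first convolutional stages
    p.requires_grad = False
# The remaining layers (and the ACTPN detection head, not shown) would then be
# trained on the container images, typically with a reduced learning rate.
```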

基本信息 (Basic Information):

中图分类号 (CLC number): TP391.41

引用信息 (Citation):

[1] 史敦煌, 于雅楠, 杜薇, 等. 一种改进注意力机制与关键区域的文本检测方法[J]. 天津职业技术师范大学学报, 2022, 32(01): 26-32. DOI: 10.19573/j.issn2095-0926.202201005.

