MA Miao, LIU Lin, CHEN Yu-li. A WMF-CNN model for street view house numbers recognition[J]. Journal of Yunnan University: Natural Sciences Edition, 2018, 40(3): 458-465. DOI: 10.7540/j.ynu.20170581

A WMF-CNN model for street view house numbers recognition

    Abstract: To address the difficulty of recognizing street view house numbers in natural scenes, this paper proposes WMF-CNN, a convolutional neural network with weighted multi-feature fusion. The model fuses weighted feature maps from multiple layers. PCA is first used to reduce the dimensionality of each fused feature map, and each map is then assigned a weight according to its contribution to the recognition result. The weighted feature maps, which carry detailed image information, are fused with the global approximation information provided by the fully connected layer, and the fused features are fed to a SoftMax classifier to obtain the recognition result. Experiments on the public SVHN dataset show that the proposed model can be fully trained within 2.2 hours and reaches a recognition rate of 95.6%, higher than the methods and models it is compared with. The model also takes about 0.38 ms on average to recognize a single image, which makes it suitable for applications with strict real-time requirements.
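The fusion pipeline summarized in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the layer shapes, the number of PCA components, the fixed weights in `layer_weights`, and the random classifier matrix are all hypothetical placeholders, and the abstract does not say how the contribution-based weights are actually computed. The sketch only shows the data flow: reduce each layer's feature maps with PCA, scale them by a weight, concatenate them with the fully connected features, and apply softmax.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-vector samples onto their top principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def fuse_features(layer_maps, layer_weights, fc_features, n_components):
    """Weight PCA-reduced feature maps and concatenate them with FC features."""
    reduced = [
        w * pca_reduce(m.reshape(m.shape[0], -1), n_components)
        for m, w in zip(layer_maps, layer_weights)
    ]
    return np.concatenate(reduced + [fc_features], axis=1)

# Random stand-ins for real CNN activations (assumed shapes, not from the paper).
rng = np.random.default_rng(0)
batch = 32
layer_maps = [
    rng.normal(size=(batch, 8, 16, 16)),   # hypothetical earlier conv layer
    rng.normal(size=(batch, 16, 8, 8)),    # hypothetical later conv layer
]
layer_weights = [0.3, 0.7]                 # hypothetical contribution weights
fc_features = rng.normal(size=(batch, 64)) # hypothetical fully connected output

fused = fuse_features(layer_maps, layer_weights, fc_features, n_components=16)

# Untrained linear classifier over the 10 digit classes, for shape checking only.
classifier = rng.normal(scale=0.01, size=(fused.shape[1], 10))
probs = softmax(fused @ classifier)
print(fused.shape, probs.shape)
```

In a real model the PCA projection would be fitted on training activations and reused at inference time, and the classifier would be trained on the fused features rather than drawn at random.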

     
