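Crack detection based on multi-scale Faster RCNN with attention

Chen Haiyong; Zhao Peng; Yan Haowei

doi:10.12086/oee.2021.200112

- 1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300000, China
- 2. Tianjin Aerospace Zhongwei Data System Technology Co., Ltd, Tianjin 300000, China

Fund Project: National Natural Science Foundation of China (61873315)

About the author: Chen Haiyong (1980-), male, Ph.D., professor, whose research focuses on computer vision. E-mail: haiyong.chen@hebut.edu.cn

**Corresponding author:** Chen Haiyong, E-mail: haiyong.chen@hebut.edu.cn

CLC number: TP391.41

Received date: 2020-04-02

Revised date: 2020-06-15

Published date: 2021-01-15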


Abstract

Electroluminescence (EL) images of photovoltaic cells have a complex, non-uniformly textured background that contains grain pseudo-defects resembling cracks, while the cracks themselves appear at multiple scales and in various shapes. These difficulties make the detection task very challenging. This paper therefore proposes a multi-scale Faster-RCNN model that integrates attention. First, an improved feature pyramid network produces multi-scale, high-level semantic feature maps, which improves the network's ability to represent multi-scale crack defects. Second, an improved attention region proposal network (A-RPN) increases the model's attention to crack defects and suppresses the features of the complex background and grain pseudo-defects. In addition, Focal loss is used when training the RPN to reduce the weight of easy samples, so that the model pays more attention to samples that are hard to distinguish. Experimental results show that the improved algorithm raises the accuracy of crack defect detection in EL images to nearly 95% (an AP of 94.75 in Table 3).

Keywords: multi-scale feature extraction; attention module; focal loss function

Overview

Electroluminescence (EL) images of photovoltaic cells have a complex, non-uniformly textured background that contains grain pseudo-defects closely resembling cracks. The cracks themselves vary widely in size and shape. Existing object detection algorithms based on convolutional neural networks do not handle these problems well. To suppress interference from the complex background and improve the model's adaptability to multi-scale crack detection, this paper proposes a multi-scale Faster RCNN model that integrates attention. In photovoltaic cell EL images, crack scale varies greatly, and many cracks are small targets. To improve the network's ability to represent multi-scale crack defects, a path aggregation feature pyramid network (PA-FPN) is proposed. Building on the combination of the residual network ResNet50 and the feature pyramid network (FPN), PA-FPN adds a bottom-up path to fuse features. PA-FPN effectively retains shallow feature information, which improves the model's adaptability to multi-scale cracks in EL images, especially the detection of small-scale cracks. To increase the model's attention to crack defects and suppress the features of the complex background and grain pseudo-defects, this paper proposes an attention region proposal network (A-RPN) that incorporates the convolutional block attention module (CBAM). CBAM consists of a channel attention module and a spatial attention module. Experiments verify that an RPN fused with CBAM detects better than one using a single attention module alone. K-means clustering is applied to the crack sizes in the data set to guide the RPN toward anchor boxes closer to the actual crack sizes, which improves the speed and accuracy of bounding-box regression during defect detection.
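The anchor-size clustering step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the source only states that K-means is applied to crack sizes, so the Euclidean distance on (width, height) pairs, the area-based initialization, the function name `kmeans_box_sizes`, and the sample boxes are all assumptions made for the example.

```python
import numpy as np

def kmeans_box_sizes(sizes, k, iters=50):
    """Cluster (width, height) pairs with plain Euclidean k-means.

    sizes: array of shape (n, 2). Returns k cluster centers sorted by area.
    """
    sizes = np.asarray(sizes, dtype=float)
    # Deterministic start (an assumption of this sketch): pick boxes at
    # evenly spaced positions in the area-sorted list.
    order = np.argsort(sizes.prod(axis=1))
    centers = sizes[order[np.linspace(0, len(sizes) - 1, k).astype(int)]]
    for _ in range(iters):
        # Assign each box to its nearest center.
        dists = np.linalg.norm(sizes[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its boxes; keep it if the cluster is empty.
        new_centers = np.array([
            sizes[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers.prod(axis=1))]

# Hypothetical crack boxes in pixels (width, height), not taken from the paper's data set.
boxes = np.array([
    [20, 22], [24, 18], [22, 20],        # small cracks
    [200, 30], [210, 28], [190, 32],     # long horizontal cracks
    [30, 220], [28, 200], [32, 210],     # long vertical cracks
])
anchors = kmeans_box_sizes(boxes, k=3)
print(np.round(anchors, 1))
```

The resulting centers would then serve as anchor widths and heights in the RPN in place of hand-picked scales and aspect ratios.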
In addition, during RPN training, Focal loss replaces the original cross-entropy loss function to reduce the weight of easy samples and make the model focus on samples that are hard to distinguish. The entire network can be trained end to end. To verify the effectiveness of the improved algorithm, it is compared with the original Faster RCNN model, RetinaNet, and CenterNet on multi-scale crack detection in EL images. In training and testing on 1024 pixel × 1024 pixel photovoltaic cell EL images, the improved Faster RCNN outperforms these object detection algorithms in accuracy and is robust to strip-shaped multi-scale cracks, so it can adapt to EL images with varying complex backgrounds.
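To illustrate how Focal loss reduces the weight of easy samples relative to cross-entropy, here is a small NumPy sketch of the binary form given in the focal loss paper listed in the references (Lin et al.). The values `gamma=2.0` and `alpha=0.25` are that paper's defaults, assumed here because this page does not state the settings used; the sample probabilities are hypothetical.

```python
import numpy as np

def cross_entropy(p, y):
    """Binary cross-entropy per sample; p is the predicted foreground probability."""
    p = np.clip(np.asarray(p, dtype=float), 1e-7, 1 - 1e-7)
    p_t = np.where(np.asarray(y) == 1, p, 1 - p)
    return -np.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss per sample: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = np.clip(np.asarray(p, dtype=float), 1e-7, 1 - 1e-7)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)  # class-balancing weight
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# Hypothetical anchors: an easy background sample and a hard foreground sample.
p = np.array([0.05, 0.30])   # predicted foreground probability
y = np.array([0, 1])         # 0 = background, 1 = crack
ratio = focal_loss(p, y) / cross_entropy(p, y)
print(ratio)  # fraction of the cross-entropy loss that each sample keeps
```

The easy background sample keeps a far smaller share of its cross-entropy loss than the hard foreground sample, which is the reweighting effect the text describes.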

Figure 1. EL imaging acquisition system

Figure 2. EL image with a random, non-uniformly textured background. The rectangular frame marks a grain, the triangular frame marks grain pseudo-defects that closely resemble cracks, and the ellipse marks a crack

Figure 3. Multi-scale Faster-RCNN model with attention

Figure 4. Path aggregation feature pyramid PA-FPN

Figure 5. Detection model with integrated CBAM

Figure 6. Visual comparison of feature maps

Figure 7. Feature maps before and after combining the RPN with the CBAM attention module

Figure 8. Comparison of detection results of different algorithms on photovoltaic cell EL images

Table 1. Photovoltaic cell EL image data set

Resolution (pixels)	Training set	Test set	Total
1024×1024	476	236	712

Table 2. Parameter configuration of the model

Parameter	Value
Image_resize	1024×1024
Weight_decay	0.0005
Learning_rate	0.0001
Network_batch_size	1
Momentum	0.9
RPN_proposals_train	2000
RPN_proposals_test	1000
RPN batch_size	256
Max_iteration	20000
ROI_foreground threshold	(0.7, 1)
ROI_background threshold	(0, 0.3)
RPN_nms threshold	0.7
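The foreground and background thresholds in Table 2 are presumably intervals of intersection over union (IoU) with the ground-truth boxes, as is usual for Faster RCNN. The sketch below shows one way such thresholds could label a candidate box. The function names, the (x1, y1, x2, y2) box format, the treatment of interval endpoints, and the sample boxes are assumptions for illustration, not details from the source.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with positive area."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def label_box(box, gt_boxes, fg=(0.7, 1.0), bg=(0.0, 0.3)):
    """Return 1 (foreground), 0 (background), or -1 (ignored) for one box."""
    best = max(iou(box, gt) for gt in gt_boxes)
    if fg[0] <= best <= fg[1]:
        return 1
    if bg[0] <= best < bg[1]:
        return 0
    return -1  # between the two intervals: not used for training

# Hypothetical ground-truth crack box and three candidate boxes.
gt = [(100, 100, 200, 200)]
labels = [
    label_box((100, 100, 200, 190), gt),  # IoU 0.9, foreground
    label_box((100, 100, 200, 150), gt),  # IoU 0.5, ignored
    label_box((150, 150, 250, 250), gt),  # IoU about 0.14, background
]
print(labels)
```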

Table 3. EL image detection performance based on Faster-RCNN algorithm

Faster-RCNN backbone	Focal loss	Attention	PA-FPN	AP (%)
ResNet50	-	-	-	87.68
ResNet50	√	-	-	88.93
ResNet50	√	√	-	92.26
ResNet50	√	√	√	94.75
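The contribution of each component can be read off Table 3 by differencing consecutive rows. The short script below does only that arithmetic on the AP values in the table; the row labels are shorthand introduced for this example.

```python
# AP values copied from Table 3; labels are shorthand for the added component.
ablation = [
    ("Faster-RCNN (ResNet50) baseline", 87.68),
    ("+ Focal loss", 88.93),
    ("+ attention", 92.26),
    ("+ PA-FPN", 94.75),
]

# Gain of each row over the previous one, in AP points.
gains = [round(cur[1] - prev[1], 2) for prev, cur in zip(ablation, ablation[1:])]
total = round(ablation[-1][1] - ablation[0][1], 2)

for (name, _), gain in zip(ablation[1:], gains):
    print(f"{name}: +{gain:.2f}")
print(f"total: +{total:.2f}")
```

Under this reading, the attention module gives the largest single gain, followed by PA-FPN and then Focal loss.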

Table 4. Detection performance of different algorithms on photovoltaic cell EL images

Method	Backbone	AP (%)
Original Faster RCNN	ResNet50	87.68
CenterNet [7]	ResNet18	85.07
CenterNet [7]	DLA	87.25
RetinaNet [8]	ResNet50	84.53
Improved Faster RCNN	ResNet50	94.75

References

[1]	Anwar S A, Abdullah M Z. Micro-crack detection of multicrystalline solar cells featuring shape analysis and support vector machines[C]//Proceedings of 2012 IEEE International Conference on Control System, Computing and Engineering, 2012: 143-148.
[2]	Su B Y, Chen H Y, Zhu Y F, et al. Classification of manufacturing defects in multicrystalline solar cells with novel feature descriptor[J]. IEEE Trans Instrum Meas, 2019, 68(12): 4675-4688. doi: 10.1109/TIM.2019.2900961
[3]	Luo Q W, Sun Y C, Li P C, et al. Generalized completed local binary patterns for time-efficient steel surface defect classification[J]. IEEE Trans Instrum Meas, 2019, 68(3): 667-679. doi: 10.1109/TIM.2018.2852918
[4]	Tsai D M, Chang C C, Chao S M. Micro-crack inspection in heterogeneously textured solar wafers using anisotropic diffusion[J]. Image Vis Comput, 2010, 28(3): 491-501. doi: 10.1016/j.imavis.2009.08.001
[5]	Cha Y J, Choi W, Büyüköztürk O. Deep learning‐based crack damage detection using convolutional neural networks[J]. Comput Aided Civ Inf Eng, 2017, 32(5): 361-378. doi: 10.1111/mice.12263
[6]	Lin H, Li B, Wang X G, et al. Automated defect inspection of LED chip using deep convolutional neural network[J]. J Intell Manuf, 2019, 30(6): 2525-2534. doi: 10.1007/s10845-018-1415-x
[7]	Duan K W, Bai S, Xie L X, et al. CenterNet: keypoint triplets for object detection[C]//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision, 2019: 6568-6577.
[8]	Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision, 2017: 2999-3007.
[9]	Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, 2015: 1440-1448.
[10]	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015: 91-99.
[11]	Cha Y J, Choi W, Suh G, et al. Autonomous structural visual inspection using region‐based deep learning for detecting multiple damage types[J]. Comput Aided Civ Inf Eng, 2018, 33(9): 731-747. doi: 10.1111/mice.12334
[12]	Gao L, Chen N N, Fan Y. Vehicle detection based on fusing multi-scale context convolution features[J]. Opto-Electron Eng, 2019, 46(4): 180331. doi: 10.12086/oee.2019.180331
[13]	Liu S, Qi L, Qin H F, et al. Path aggregation network for instance segmentation[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 8759-8768.
[14]	Corbetta M, Shulman G L. Control of goal-directed and stimulus-driven attention in the brain[J]. Nat Rev Neurosci, 2002, 3(3): 201-215. doi: 10.1038/nrn755
[15]	Frazão M, Silva J A, Lobato K, et al. Electroluminescence of silicon solar cells using a consumer grade digital camera[J]. Measurement, 2017, 99: 7-12. doi: 10.1016/j.measurement.2016.12.017
[16]	Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7132-7141.
[17]	Everingham M, Van Gool L, Williams C K I, et al. The PASCAL visual object classes (VOC) challenge[J]. Int J Comput Vis, 2010, 88(2): 303-338. doi: 10.1007/s11263-009-0275-4
[18]	Woo S, Park J, Lee J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2018: 3-19.
