级联网络和金字塔光流的旋转不变人脸检测

孙锐; 阚俊松; 吴柳玮; 王鹏

doi:10.12086/oee.2020.190135

级联网络和金字塔光流的旋转不变人脸检测

- 1.
  合肥工业大学计算机与信息学院，安徽合肥 230009
- 2.
  工业安全与应急技术安徽省重点实验室，安徽
- 3.
  合肥进毅智能技术有限公司，安徽合肥 230088
基金项目:
国家自然科学基金资助项目(61471154)；中央高校基本科研业务费专项资金资助项目(JZ2018YYPY0287)

详细信息

作者简介:
孙锐(1976-)，男，博士，教授，主要从事计算机视觉的研究。E-mail: sunrui@hfut.edu.cn

**^*通讯作者:** 阚俊松(1995-)，男，硕士研究生，主要从事计算机视觉的研究。E-mail: 2931338359@qq.com

中图分类号: TP391.41;TP183

收稿日期: 2019-03-25

修回日期: 2019-05-14

刊出日期: 2020-01-01

Rotating invariant face detection via cascaded networks and pyramidal optical flows

- 1.
  School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China
- 2.
  Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei, Anhui 230009, China
- 3.
  Hefei Jinyi Science and Technology, Hefei, Anhui 230088, China
Fund Project: Supported by National Natural Science Foundation of China (61471154) and Fundamental Research Funds for Central Universities (JZ2018YYPY0287)

More Information

**^*Corresponding author:** Kan Junsong, E-mail: 2931338359@qq.com

Received Date 25 March 2019

Revised Date 14 May 2019

Published Date 01 January 2020

摘要

摘要:
在无约束的开放空间中，由于面部姿态变化、背景环境复杂、运动模糊等，人脸检测仍是一个具有挑战性的任务。本文针对视频流中人脸检测存在的平面内旋转问题，将人脸关键点与金字塔光流相结合，提出了基于级联网络和金字塔光流的旋转不变人脸检测算法。首先利用级联渐进卷积神经网络对视频流中前一帧进行人脸位置和关键点的定位；其次为获取关键点与人脸候选框间光流映射，使用独立的关键点检测网络对当前帧进行再次定位；之后计算前后两帧之间关键点光流位移；最后通过关键点光流位移与人脸候选框的映射关系，对视频中检测到的人脸进行校正，从而完成平面内旋转人脸不变性检测。实验经FDDB公开数据集上测试，证明该方法精确度较高。并且，在Boston面部跟踪数据集上进行动态测试，证明该人脸检测算法能有效解决平面内旋转人脸检测问题。对比其它检测算法，该算法检测速度有较大优势，同时视频中窗口抖动问题得到了很好解决。
- 旋转不变性 /
- 关键点检测 /
- 级联渐进网络 /
- 金字塔光流 /
- 人脸检测
Abstract:
In the unconstrained open-space, face detection is still a challenging task due to the facial posture changes, complex background environment, and motion blur. The rotation-invariant algorithm based on cascaded network and pyramid optical flow is proposed. Firstly, the cascading progressive convolutional neural network is adopted to locate the face position and facial landmark of the previous frame in the video stream. Secondly, the independent facial landmark detection network is used to reposition the current frame, and the optical flow mapping displacement of the facial landmark between the two frames is calculated afterwards. Finally, the detected face is corrected by the mapping relationship between the optical flow displacement of the facial landmark and the bounding box, thereby completing the rotation-invariant face detection. The experiment was tested on the FDDB public datasets, which proved that the method is more accurate. Moreover, the dynamic test on the Boston head tracking database proves that the face detection algorithm can effectively solve the problem of rotation-invariant face detection. Compared with other detection algorithms, the detection speed of the proposed algorithm has a great advantage, and the window jitter problem in the video is well solved.
- rotation-invariant /
- facial landmark /
- calibration networks /
- pyramid optical flow /
- face detection

Overview

Overview: In recent years, with the rapid deployment of video surveillance in urban space, the public security department has collected video in a massive unconstrained open environment. There are complex problems such as scale change, partial occlusion, motion blur and illumination change in the face detection of video stream. In particular, face rotation affects the performance and efficiency of the entire face recognition system. In this paper, the in-plane rotation problem of face detection in video stream is combined with the pyramid optical flow, and a rotating invariant face detection algorithm based on cascaded network and pyramid optical flow is proposed. Firstly, the cascading progressive convolutional neural network is used to locate the face position and facial landmark of the previous frame in the video stream. Secondly, the optical flow mapping between the facial landmark and the bounding box is obtained, and the independent facial landmark network is used to detect the current frame. After that, the optical flow displacement of the key points between the two frames is calculated. Finally, the detected face of the video is corrected by the mapping relationship between the optical flow displacement of the key point and the face candidate frame, thereby completing the rotation-invariant face detection. The experiments were tested on the FDDB public datasets. The ROC curve on the FDDB evaluates the performance of the face detection method. When the number of false positives is less than 160, the performance of our method is better than other methods. When the number of false positives is more than 160, the face detection result is close to Face R-CNN, which proves that the method has higher accuracy. Moreover, the dynamic test on the Boston head tracking database proves that the face detection algorithm can effectively solve the problem of rotation and scale change of the target area in the plane. The speed of this algorithm with other rotationally invariant face detectors on standard.mp4 video is compared. The minimum face size of these images is 100×100. The experimental video has a uniform length of 10 s, a frame rate of 30 frames/s, and a picture size of 640×480. Experiments show that the algorithm detection speed has a great advantage, and the window jitter problem in the video is well solved. The average detection rate of the algorithm in this paper is higher than the general video frame rate, and the model size is small, which is suitable for mobile devices. Time costs are greatly reduced compared to the methods of learning rotational invariant features and segmenting samples by highly complex networks.

HTML全文

图 1 整体流程图

Figure 1. Flow chart of developed algorithm

下载: 全尺寸图片幻灯片

图 2 三阶段级联渐进卷积神经网络。(a)第一阶段；(b)第二阶段；(c)第三阶段

Figure 2. Three-stage cascade progressive convolutional neural network. (a) First stage; (b) Second stage; (c) Third stage

下载: 全尺寸图片幻灯片

图 3 特征点提取网络

Figure 3. Facial landmark extraction network

下载: 全尺寸图片幻灯片

图 4 人脸金字塔光流示意图

Figure 4. Face pyramid optical flow

下载: 全尺寸图片幻灯片

图 5 光流映射。(a)原始人脸图像；(b)关键点及候选框检测图；(c)关键点映射；(d)校正结果

Figure 5. Optical flow mapping. (a) Original face image; (b) Facial landmark and bounding box detection; (c) Facial landmark mapping; (d) Calibration result

下载: 全尺寸图片幻灯片

图 6 人脸检测方法比较

Figure 6. Comparison of face detection methods

下载: 全尺寸图片幻灯片

图 7 FDDB数据集上检测结果

Figure 7. Results on the FDDB

下载: 全尺寸图片幻灯片

图 8 Boston头部跟踪数据集检测结果

Figure 8. Test results on Boston head tracking database

下载: 全尺寸图片幻灯片

图 9 多用户人脸视频检测结果

Figure 9. Test results on Multi-face video

下载: 全尺寸图片幻灯片

表 1 视频流中人脸检测算法效率及相应模型大小

Table 1. Speed and model size between different methods

Method	CPU/fps	GPU/fps	Model size/M
Faster R-CNN(VGGM)	1	21	350
Faster R-CNN(VGG16)	0.5	11	547
Cascade CNN	31	68	4.2
Cascade CNN+STN	15	31	4.7
Divide-and-Conquer	14	21	2.2
SSD500	1	21	95
R-FCN(ResNet-50)	0.8	16	123
Our	34	64	3.7

下载: 导出CSV

参考文献(29)

[1]	Viola P, Jones M. Rapid object detection using a boosted cascade of simple features[C]//Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001.
[2]	Li B, Yang A M, Yang J. Rotated face detection using AdaBoost[C]//Proceedings of 2010 2nd International Conference on Information Engineering and Computer Science, 2010: 1-4.
[3]	Froba B, Ernst A. Face detection with the modified census transform[C]//Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004: 91-96.
[4]	Jin H L, Liu Q S, Lu H Q, et al. Face detection using improved LBP under Bayesian framework[C]//Proceedings of the Third International Conference on Image and Graphics, 2004: 306-309.
[5]	Farfade S S, Saberian M J, Li L J. Multi-view face detection using deep convolutional neural networks[C]//Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015: 643-650.
[6]	Ranjan R, Patel V M, Chellappa R. A deep pyramid deformable part model for face detection[C]//Proceedings of 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems, 2015.
[7]	Yang S, Luo P, Loy C C, et al. From facial parts responses to face detection: a deep learning approach[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, 2015.
[8]	Bas A, Huber P, Smith W A P, et al. 3D morphable models as spatial transformer networks[C]//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops, 2017.
[9]	李小薪, 梁荣华.有遮挡人脸识别综述:从子空间回归到深度学习[J].计算机学报, 2018, 41(1): 177-207. http://d.old.wanfangdata.com.cn/Periodical/jsjxb201801011 Li X X, Liang R H. A review for face recognition with occlusion: from subspace regression to deep learning[J]. Chinese Journal of Computers, 2018, 41(1): 177-207. http://d.old.wanfangdata.com.cn/Periodical/jsjxb201801011
[10]	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015: 91-99.
[11]	Liu W, Anguelov D, Erhan D, et al. Single shot MultiBox detector[C]//Proceedings of the 14th European Conference on Compute Vision (ECCV), 2016: 21-37.
[12]	Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[13]	Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[C]//Proceedings of the 3rd International Conference on Learning Representations, 2015.
[14]	Li H X, Lin Z, Shen X H, et al. A convolutional neural network cascade for face detection[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition, 2015: 5325-5334.
[15]	潘榕, 魏慧琴.基于旋转不变性的人脸定位识别研究[J].计算机工程与设计, 2009, 30(8): 1941-1943, 1997. http://d.old.wanfangdata.com.cn/Periodical/jsjgcysj200908035 Pan R, Wei H Q. Research on human face detection and recognition based on rotation invariance[J]. Computer Engineering and Design, 2009, 30(8): 1941-1943, 1997. http://d.old.wanfangdata.com.cn/Periodical/jsjgcysj200908035
[16]	王炜强, 张晓阳, 曹春芹, 等.尺度不变单样本人脸识别方法[J].中国图象图形学报, 2012, 17(3): 380-386. http://d.old.wanfangdata.com.cn/Periodical/zgtxtxxb-a201203012 Wang W Q, Zhang X Y, Gao C Q, et al. Scale invariant face recognition from single sample[J]. Journal of Image and Graphics, 2012, 17(3): 380-386. http://d.old.wanfangdata.com.cn/Periodical/zgtxtxxb-a201203012
[17]	包晓安, 胡玲玲, 张娜, 等.基于级联网络的快速人脸检测算法[J].浙江理工大学学报, 2019, 41(3): 347-353. doi: 10.3969/j.issn.1673-3851(n).2019.03.011 Bao X A, Hu L L, Zhang N, et al. Fast face detection algorithm based on cascade network[J]. Journal of Zhejiang Sci-Tech University, 2019, 41(3): 347-353. doi: 10.3969/j.issn.1673-3851(n).2019.03.011
[18]	刘伟强.基于级联卷积神经网络的人脸检测算法的研究[D].厦门: 厦门大学, 2017. Liu W Q. Research on face detection algorithm based on cascaded convolutional neural networks[D]. Xiamen: Xiamen University, 2017.http://cdmd.cnki.com.cn/Article/CDMD-10384-1017120170.htm
[19]	孙康, 李千目, 李德强.基于级联卷积神经网络的人脸检测算法[J].南京理工大学学报, 2018, 42(1): 40-47. http://d.old.wanfangdata.com.cn/Periodical/njlgdxxb201801006 Sun K, Li Q M, Li D Q. Face detection algorithm based on cascaded convolutional neural network[J]. Journal of Nanjing University of Science and Technology, 2018, 42(1): 40-47. http://d.old.wanfangdata.com.cn/Periodical/njlgdxxb201801006
[20]	林露樾.融合卷积神经网络以及光流法的目标跟踪方法[D].广州: 广东工业大学, 2018. Lin L Y. A visual object tracking method via CNN and optical flow with online learning[D]. Guangzhou: Guangdong University of Technology, 2018.
[21]	王正来, 黄敏, 朱启兵, 等.基于深度卷积神经网络的运动目标光流检测方法[J].光电工程, 2018, 45(8): 38-47. doi: 10.12086/oee.2018.180027 Wang Z L, Huang M, Zhu Q B, et al. The optical flow detection method of moving target using deep convolution neural network[J]. Opto-Electronic Engineering, 2018, 45(8): 38-47. doi: 10.12086/oee.2018.180027
[22]	Zhang K P, Zhang Z P, Li Z F, et al. Joint face detection and alignment using multitask cascaded convolutional networks[J]. IEEE Signal Processing Letters, 2016, 23(10): 1499-1503. doi: 10.1109/LSP.2016.2603342
[23]	Yang S, Luo P, Loy C C, et al. WIDER FACE: a face detection benchmark[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[24]	Liu Z W, Luo P, Wang X G, et al. Deep learning face attributes in the wild[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, 2015: 3730-3738.
[25]	Sun Y, Wang X G, Tang X O. Deep convolutional network cascade for facial point detection[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013: 3476-3483.
[26]	Köstinger M, Wohlhart P, Roth P M, et al. Annotated facial landmarks in the wild: a large-scale, real-world database for facial landmark localization[C]//Proceedings of 2011 IEEE International Conference on Computer Vision Workshops, 2011: 2144-2151.
[27]	Jain V, Learned-Miller E G. FDDB: A benchmark for face detection in unconstrained settings[R]. UMass Amherst Technical Report, 2010.
[28]	Cascia M L, Sclaroff S. Fast, reliable head tracking under varying illumination[C]// Proceedings of 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999: 604-610.
[29]	Wang H, Li Z F, Ji X, et al. Face R-CNN[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017.

施引文献

资源附件(0)

访问统计

点击扫一扫

图(9)

表(1)

计量

文章访问数:
PDF下载数:
施引文献: 0

级联网络和金字塔光流的旋转不变人脸检测

作者简介:
孙锐(1976-)，男，博士，教授，主要从事计算机视觉的研究。E-mail: sunrui@hfut.edu.cn

**^*通讯作者:** 阚俊松(1995-)，男，硕士研究生，主要从事计算机视觉的研究。E-mail: 2931338359@qq.com

Rotating invariant face detection via cascaded networks and pyramidal optical flows

**^*Corresponding author:** Kan Junsong, E-mail: 2931338359@qq.com

计量

目录

作者须知

其他内容

条款和政策

级联网络和金字塔光流的旋转不变人脸检测

作者简介: 孙锐(1976-)，男，博士，教授，主要从事计算机视觉的研究。E-mail: sunrui@hfut.edu.cn

*通讯作者: 阚俊松(1995-)，男，硕士研究生，主要从事计算机视觉的研究。E-mail: 2931338359@qq.com

Rotating invariant face detection via cascaded networks and pyramidal optical flows

*Corresponding author: Kan Junsong, E-mail: 2931338359@qq.com

计量

出版历程

目录

作者须知

其他内容

条款和政策

作者简介:
孙锐(1976-)，男，博士，教授，主要从事计算机视觉的研究。E-mail: sunrui@hfut.edu.cn

**^*通讯作者:** 阚俊松(1995-)，男，硕士研究生，主要从事计算机视觉的研究。E-mail: 2931338359@qq.com

**^*Corresponding author:** Kan Junsong, E-mail: 2931338359@qq.com