Dual view fusion detection method for event camera detection of unmanned aerial vehicles

Citation: Li M, Chen N, An W, et al. Dual view fusion detection method for event camera detection of unmanned aerial vehicles[J]. Opto-Electron Eng, 2024, 51(11): 240208. doi: 10.12086/oee.2024.240208

  • Fund projects: National Natural Science Foundation of China (62401591, 62401589); Foundation for Innovative Research Groups of the National Natural Science Foundation of China (61921001)
  • *Corresponding author: An Wei, anwei@nudt.edu.cn
  • CLC number: TP391
  • CSTR: 32245.14.oee.2024.240208

  • Abstract: With the widespread use of low-altitude UAVs, real-time detection of such low-altitude, slow, and small targets is critical to public safety. Traditional cameras capture image frames with a fixed exposure time, adapt poorly to illumination changes, and therefore suffer detection blind zones in scenes such as strong light. As a novel neuromorphic sensor, the event camera perceives brightness changes pixel by pixel and can still produce high-frequency, sparse event data under complex lighting conditions. To address the difficulty that image-based detection methods cannot handle the sparse, irregular data of event cameras, this paper models the 2D object detection task as a semantic segmentation task in a 3D spatio-temporal point cloud and proposes a UAV target segmentation model based on dual-view fusion. A real UAV detection dataset was collected with an event camera, and the experimental results show that the proposed method achieves the best detection performance while remaining real-time, enabling stable detection of UAV targets.
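To make the modelling step concrete, the sketch below converts an asynchronous event stream of (x, y, timestamp, polarity) tuples into a 3D spatio-temporal point cloud. This is a minimal illustration only; the field layout and the time-scaling factor are assumptions, not the paper's exact preprocessing.

```python
import numpy as np

def events_to_point_cloud(events, time_scale=1e-3):
    """Convert an (N, 4) array of events (x, y, t_us, polarity) into a
    3D spatio-temporal point cloud (x, y, t). The time axis is rescaled
    so its numeric range is comparable to the pixel axes; the scale
    factor here is an assumption, not the paper's exact value."""
    x = events[:, 0].astype(np.float32)
    y = events[:, 1].astype(np.float32)
    t = events[:, 2].astype(np.float32)
    t = (t - t.min()) * time_scale  # microseconds -> milliseconds, zero-based
    return np.stack([x, y, t], axis=1)

# Example: 1000 synthetic events from a hypothetical 346x260-pixel sensor
rng = np.random.default_rng(0)
events = np.stack([
    rng.integers(0, 346, 1000),               # x
    rng.integers(0, 260, 1000),               # y
    np.sort(rng.integers(0, 100_000, 1000)),  # timestamp in microseconds
    rng.integers(0, 2, 1000),                 # polarity
], axis=1)
points = events_to_point_cloud(events)
print(points.shape)  # (1000, 3)
```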

  • Overview: With the proliferation of UAVs across industries, concerns over airspace management, public safety, and privacy have escalated. Traditional detection approaches, such as acoustic, radio, and radar detection, are often costly and complex, limiting their applicability. In contrast, machine-vision-based techniques, particularly event cameras, offer a cost-effective and adaptable alternative. Event cameras, also known as dynamic vision sensors (DVS), capture visual information as asynchronous events, each containing the pixel location, timestamp, and polarity of an intensity change. This address-event representation (AER) mechanism enables efficient high-speed, high-dynamic-range visual data processing. Traditional cameras capture image frames with a fixed exposure time, which adapts poorly to changing lighting conditions and leaves detection blind spots in strong light and similar scenes; event cameras perform well in exactly these situations. Existing object detection algorithms are suited to synchronous data such as image frames, but cannot handle asynchronous data such as event streams. The proposed method therefore treats UAV detection as a 3D point cloud segmentation task, leveraging the unique properties of event streams. This paper introduces a dual-branch network, PVNet, which integrates voxel and point feature extraction branches. The voxel branch, resembling a U-Net architecture, extracts contextual information through downsampling and upsampling stages; the point branch uses multi-layer perceptrons (MLPs) to extract point-wise features. The two branches interact and fuse at multiple stages, enriching the feature representation. A key innovation lies in the gated fusion mechanism, which selectively aggregates features from both branches and mitigates the impact of non-informative features; ablation studies show it outperforms simple addition or concatenation. The method is evaluated on a custom dataset, Ev-UAV, containing 60 event camera recordings of UAV flights under various conditions. Evaluation metrics include intersection over union (IoU) for segmentation accuracy and mean average precision (mAP) for detection performance. Compared with frame-based methods such as YOLOv7, SSD, and Fast R-CNN, event-based methods such as RVT and SAST, and point-cloud-based methods such as PointNet and PointNet++, the proposed PVNet achieves superior performance, with an mAP of 69.5% and an IoU of 70.3%, while maintaining low latency. By modelling UAV detection as a 3D point cloud segmentation task and leveraging the strengths of event cameras, the PVNet model effectively addresses the challenge of detecting small, feature-scarce UAVs in event data. The results demonstrate the potential of event cameras and the proposed fusion mechanism for real-time, accurate UAV detection in complex environments.
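The dual-branch structure described above can be illustrated with the following minimal PyTorch sketch. The layer sizes, the dense (rather than sparse) voxel convolutions, and the per-point feature look-up are simplifying assumptions made for brevity; only the overall shape (a point-wise MLP branch, a U-Net-style voxel branch, and per-point fusion of the two) follows this page's description of PVNet.

```python
import torch
import torch.nn as nn

class PointBranch(nn.Module):
    """Shared MLP applied to each event point, PointNet-style."""
    def __init__(self, in_dim=3, feat_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, feat_dim), nn.ReLU())
    def forward(self, pts):              # pts: (B, N, 3)
        return self.mlp(pts)             # (B, N, feat_dim)

class VoxelBranch(nn.Module):
    """Tiny U-Net-style 3D CNN over the voxelized event volume."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(16, feat_dim, 4, stride=2, padding=1), nn.ReLU())
    def forward(self, vox):              # vox: (B, 1, D, H, W)
        return self.dec(self.enc(vox))   # (B, feat_dim, D, H, W)

def gather_voxel_features(vox_feat, idx):
    """Look up each point's voxel feature via its (d, h, w) voxel index."""
    B, C, D, H, W = vox_feat.shape
    flat = vox_feat.view(B, C, -1)                           # (B, C, D*H*W)
    lin = (idx[..., 0] * H + idx[..., 1]) * W + idx[..., 2]  # (B, N)
    lin = lin.unsqueeze(1).expand(-1, C, -1)                 # (B, C, N)
    return flat.gather(2, lin).transpose(1, 2)               # (B, N, C)

# Toy forward pass on a 16x16x16 voxel grid with 128 points per sample
B, N, D = 2, 128, 16
pts = torch.rand(B, N, 3)
vox = torch.rand(B, 1, D, D, D)
idx = (pts * (D - 1)).long()
p_feat = PointBranch()(pts)                               # (2, 128, 64)
v_feat = gather_voxel_features(VoxelBranch()(vox), idx)   # (2, 128, 64)
fused = p_feat + v_feat  # placeholder; the paper uses gated fusion instead
print(fused.shape)
```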

  • Figure 1.  Comparison of output between event camera and traditional camera

    Figure 2.  Event camera drone detection dataset. (a) Event point cloud, with the background in blue and the UAV target in red; (b) Visible-light image frame; (c) Event frame, with the target inside the red box

    Figure 3.  Point-voxel fusion segmentation network

    Figure 4.  Event point cloud voxelization

    Figure 5.  Point-voxel feature interaction fusion module

    Figure 6.  Visual comparison of detection results. Red boxes mark correct detections, yellow boxes mark false alarms, and green boxes mark missed detections

    Table 1.  Experimental results of different algorithms on the Ev-UAV dataset

    Method | Framework | $P_{\mathrm{d}}$/% | $P_{\mathrm{f}}$/% | AP | IoU | Inference time/ms
    SSD | Frame | 88.3 (+8.2) | 13.3 (−10.1) | 62.7 (+6.8) | — | 18.7
    Fast R-CNN | Frame | 88.7 (+7.8) | 13.5 (−10.3) | 63.2 (+6.3) | — | 19.5
    YOLOv7 | Frame | 89.8 (+6.7) | 10.3 (−7.1) | 65.3 (+4.2) | — | 16.5
    RVT | Event | 90.3 (+6.2) | 10.1 (−6.9) | 65.7 (+3.8) | — | 10.5
    SAST | Event | 90.1 (+6.4) | 10.5 (−7.3) | 65.5 (+4.0) | — | 13.2
    DNA-Net | Frame | 90.5 (+6.0) | 9.91 (−6.7) | 65.9 (+3.6) | — | 20.3
    OSCAR | Frame | 90.3 (+6.2) | 10.2 (−7.0) | 66.3 (+3.2) | — | 25.3
    PointNet | Point cloud | 91.3 (+5.2) | 8.35 (−5.1) | 66.1 (+3.4) | 66.7 (+3.6) | 19.8
    PointNet++ | Point cloud | 93.5 (+3.0) | 7.33 (−4.1) | 67.3 (+2.2) | 67.5 (+2.8) | 25.3
    Ours | Point cloud | 96.5 | 3.21 | 69.5 | 70.3 | 15.2
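For reference, the IoU column above is a standard binary segmentation IoU over per-event labels (1 = UAV, 0 = background); a minimal sketch, shown only to make the metric concrete:

```python
import numpy as np

def segmentation_iou(pred, gt):
    """Binary intersection-over-union over per-event labels."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

pred = np.array([1, 1, 0, 0, 1])
gt   = np.array([1, 0, 0, 1, 1])
print(f"IoU = {segmentation_iou(pred, gt):.3f}")  # 2 / 4 = 0.500
```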

    Table 2.  The impact of different views. V and V+ represent different voxel resolutions: V represents 4 pixel × 4 pixel × 20 ms, V+ represents 2 pixel × 2 pixel × 10 ms

    Method | $P_{\mathrm{d}}$/% | $P_{\mathrm{f}}$/% | AP | IoU/% | Inference time/ms
    P | 89.6 (+6.9) | 9.86 (−6.65) | 66.3 (+3.2) | 63.2 (+7.1) | 7.2
    V | 90.6 (+5.9) | 8.98 (−5.77) | 65.7 (+3.8) | 63.8 (+6.5) | 10.2
    V+ | 92.5 (+4.0) | 5.33 (−2.12) | 66.8 (+2.7) | 64.7 (+5.6) | 12.3
    PV | 94.3 (+2.2) | 4.32 (−1.11) | 67.2 (+2.3) | 69.2 (+1.1) | 19.2
    PV+ | 96.5 | 3.21 | 69.5 | 70.3 | 20.3

    Table 3.  The effectiveness of different fusion methods

    Method | $P_{\mathrm{d}}$/% | $P_{\mathrm{f}}$/% | AP | IoU
    Add | 90.2 (+6.3) | 8.63 (−5.42) | 67.3 (+2.2) | 66.8 (+3.7)
    Multi | 91.8 (+4.7) | 7.68 (−4.47) | 66.9 (+2.6) | 64.2 (+6.1)
    Concat | 93.8 (+2.7) | 6.38 (−3.17) | 67.6 (+1.9) | 65.3 (+5.0)
    Ours | 96.5 | 3.21 | 69.5 | 70.3
  • References:
    [1] Bouguettaya A, Zarzour H, Kechida A, et al. Vehicle detection from UAV imagery with deep learning: a review[J]. IEEE Trans Neural Netw Learn Syst, 2022, 33(11): 6047−6067. doi: 10.1109/TNNLS.2021.3080276
    [2] Chen H Y, Liu D B, Yan X W. Infrared image UAV target detection algorithm based on IDOU-YOLO[J]. Journal of Applied Optics, 2024, 45(4): 723−731.
    [3] Han J T, Tan K, Zhang W G, et al. Identification of salt marsh vegetation "fairy circles" using random forest method and spatial-spectral data of unmanned aerial vehicle LiDAR[J]. Opto-Electron Eng, 2024, 51(3): 230188. doi: 10.12086/oee.2024.230188
    [4] Park S, Choi Y. Applications of unmanned aerial vehicles in mining from exploration to reclamation: a review[J]. Minerals, 2020, 10(8): 663. doi: 10.3390/min10080663
    [5] Sziroczak D, Rohacs D, Rohacs J. Review of using small UAV based meteorological measurements for road weather management[J]. Prog Aerosp Sci, 2022, 134: 100859. doi: 10.1016/j.paerosci.2022.100859
    [6] Wang Z Y, Gao Q, Xu J B, et al. A review of UAV power line inspection[C]//Proceedings of 2020 International Conference on Guidance, Navigation and Control, Tianjin, 2022: 3147–3159. https://doi.org/10.1007/978-981-15-8155-7_263
    [7] Khan A, Gupta S, Gupta S K. Emerging UAV technology for disaster detection, mitigation, response, and preparedness[J]. J Field Robot, 2022, 39(6): 905−955. doi: 10.1002/rob.22075
    [8] Li Y, Liu M, Jiang D D. Application of unmanned aerial vehicles in logistics: a literature review[J]. Sustainability, 2022, 14(21): 14473. doi: 10.3390/su142114473
    [9] Mademlis I, Mygdalis V, Nikolaidis N, et al. High-level multiple-UAV cinematography tools for covering outdoor events[J]. IEEE Trans Broadcast, 2019, 65(3): 627−635. doi: 10.1109/TBC.2019.2892585
    [10] Xi Y D, Yu Y, Ding Y Y, et al. An optoelectronic system for fast search of low slow small target in the air[J]. Opto-Electron Eng, 2018, 45(4): 170654. doi: 10.12086/oee.2018.170654
    [11] Zhang R M, Xiao Y F, Jia Z N, et al. Improved YOLOv7 algorithm for target detection in complex environments from UAV perspective[J]. Opto-Electron Eng, 2024, 51(5): 240051. doi: 10.12086/oee.2024.240051
    [12] Chen X, Peng D L, Gu Y. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electron Eng, 2022, 49(3): 210372. doi: 10.12086/oee.2022.210372
    [13] Zhang M C, Niu C H, Liu L S, et al. Infrared small target detection algorithm for UAV detection system[J]. Laser Technol, 2024, 48(1): 114−120. doi: 10.7510/jgjs.issn.1001-3806.2024.01.018
    [14] Sedunov A, Haddad D, Salloum H, et al. Stevens drone detection acoustic system and experiments in acoustics UAV tracking[C]//Proceedings of 2019 IEEE International Symposium on Technologies for Homeland Security, Woburn, 2019: 1–7. https://doi.org/10.1109/HST47167.2019.9032916
    [15] Chiper F L, Martian A, Vladeanu C, et al. Drone detection and defense systems: survey and a software-defined radio-based solution[J]. Sensors, 2022, 22(4): 1453. doi: 10.3390/s22041453
    [16] de Quevedo Á D, Urzaiz F I, Menoyo J G, et al. Drone detection and radar-cross-section measurements by RAD-DAR[J]. IET Radar Sonar Navig, 2019, 13(9): 1437−1447. doi: 10.1049/iet-rsn.2018.5646
    [17] Gallego G, Delbrück T, Orchard G, et al. Event-based vision: a survey[J]. IEEE Trans Pattern Anal Mach Intell, 2020, 44(1): 154−180. doi: 10.1109/TPAMI.2020.3008413
    [18] Shariff W, Dilmaghani M S, Kielty P, et al. Event cameras in automotive sensing: a review[J]. IEEE Access, 2024, 12: 51275−51306. doi: 10.1109/ACCESS.2024.3386032
    [19] Paredes-Vallés F, Scheper K Y W, De Croon G C H E. Unsupervised learning of a hierarchical spiking neural network for optical flow estimation: from events to global motion perception[J]. IEEE Trans Pattern Anal Mach Intell, 2020, 42(8): 2051−2064. doi: 10.1109/TPAMI.2019.2903179
    [20] Cordone L, Miramond B, Thierion P. Object detection with spiking neural networks on automotive event data[C]//Proceedings of 2022 International Joint Conference on Neural Networks, Padua, 2022: 1–8. https://doi.org/10.1109/IJCNN55064.2022.9892618
    [21] Li Y J, Zhou H, Yang B B, et al. Graph-based asynchronous event processing for rapid object recognition[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision, Montreal, 2021: 914–923. https://doi.org/10.1109/ICCV48922.2021.00097
    [22] Schaefer S, Gehrig D, Scaramuzza D. AEGNN: asynchronous event-based graph neural networks[C]//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, 2022: 12361–12371. https://doi.org/10.1109/CVPR52688.2022.01205
    [23] Jiang Z Y, Xia P F, Huang K, et al. Mixed frame-/event-driven fast pedestrian detection[C]//Proceedings of 2019 International Conference on Robotics and Automation, Montreal, 2019: 8332–8338. https://doi.org/10.1109/ICRA.2019.8793924
    [24] Lagorce X, Orchard G, Galluppi F, et al. HOTS: a hierarchy of event-based time-surfaces for pattern recognition[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39(7): 1346−1359. doi: 10.1109/TPAMI.2016.2574707
    [25] Zhu A, Yuan L Z, Chaney K, et al. Unsupervised event-based learning of optical flow, depth, and egomotion[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, 2019: 989–997. https://doi.org/10.1109/CVPR.2019.00108
    [26] Wang D S, Jia X, Zhang Y, et al. Dual memory aggregation network for event-based object detection with learnable representation[C]//Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, 2023: 2492–2500. https://doi.org/10.1609/aaai.v37i2.25346
    [27] Li J N, Li J, Zhu L, et al. Asynchronous spatio-temporal memory network for continuous event-based object detection[J]. IEEE Trans Image Process, 2022, 31: 2975−2987. doi: 10.1109/TIP.2022.3162962
    [28] Peng Y S, Zhang Y Y, Xiong Z W, et al. GET: group event transformer for event-based vision[C]//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision, Paris, 2023: 6015–6025. https://doi.org/10.1109/ICCV51070.2023.00555
    [29] Chen N F Y. Pseudo-labels for supervised learning on dynamic vision sensor data, applied to object detection under ego-motion[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, 2018: 644–653. https://doi.org/10.1109/CVPRW.2018.00107
    [30] Afshar S, Nicholson A P, van Schaik A, et al. Event-based object detection and tracking for space situational awareness[J]. IEEE Sens J, 2020, 20(24): 15117−15132. doi: 10.1109/JSEN.2020.3009687
    [31] Huang H M, Lin L F, Tong R F, et al. UNet 3+: a full-scale connected UNet for medical image segmentation[C]//Proceedings of ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, 2020: 1055–1059. https://doi.org/10.1109/ICASSP40776.2020.9053405
    [32] Wang C Y, Bochkovskiy A, Liao H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, 2023: 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721
    [33] Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision, Amsterdam, 2016: 21–37. https://doi.org/10.1007/978-3-319-46448-0_2
    [34] Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, 2015: 1440–1448. https://doi.org/10.1109/ICCV.2015.169
    [35] Charles R Q, Su H, Kaichun M, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 2017: 77–85. https://doi.org/10.1109/CVPR.2017.16
    [36] Charles R Q, Yi L, Su H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, 2017: 5105–5114. https://dl.acm.org/doi/abs/10.5555/3295222.3295263
    [37] Gehrig M, Scaramuzza D. Recurrent vision transformers for object detection with event cameras[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, 2023: 13884–13893. https://doi.org/10.1109/CVPR52729.2023.01334
    [38] Peng Y S, Li H B, Zhang Y Y, et al. Scene adaptive sparse transformer for event-based object detection[C]//Proceedings of 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2024: 16794–16804. https://doi.org/10.1109/CVPR52733.2024.01589
    [39] Li B Y, Xiao C, Wang L G, et al. Dense nested attention network for infrared small target detection[J]. IEEE Trans Image Process, 2023, 32: 1745−1758. doi: 10.1109/TIP.2022.3199107
    [40] Dai Y M, Li X, Zhou F, et al. One-stage cascade refinement networks for infrared small target detection[J]. IEEE Trans Geosci Remote Sens, 2023, 61: 5000917. doi: 10.1109/TGRS.2023.3243062

Publication history
Received: 2024-09-02
Revised: 2024-10-12
Accepted: 2024-10-12
Published: 2024-11-25
