Abstract: As a new generation of imaging device, the light-field camera can simultaneously capture the spatial position and incident angle of light rays. However, the recorded light-field involves a trade-off between spatial resolution and angular resolution; in particular, the limited spatial resolution of the sub-aperture images restricts the application range of light-field cameras. Therefore, this paper proposes a light-field super-resolution neural network that fuses multi-scale features to obtain a super-resolved light-field. The deep-learning-based network framework contains three major modules: multi-scale feature extraction, global feature fusion, and up-sampling. First, the multi-scale feature extraction module learns the inherent structural features of the 4D light-field; the fusion module then fuses and enhances these features. Finally, the up-sampling module performs the light-field super-resolution. Experimental results on a synthetic light-field dataset and a real-world light-field dataset show that the method outperforms other state-of-the-art methods in both visual and numerical evaluations. In addition, the super-resolved light-field images were applied to depth estimation, and the results show that light-field spatial super-resolution improves the disparity map.
Key words:
- super-resolution
- light-field
- deep learning
- multi-scale feature extraction
- feature fusion
Overview: As a new generation of imaging equipment, a light-field camera can simultaneously capture the spatial position and incident angle of light rays. However, the recorded light-field involves a trade-off between spatial resolution and angular resolution. In particular, the limited spatial resolution of the sub-aperture images restricts the application scenarios of light-field cameras. Therefore, this paper proposes a light-field super-resolution network that fuses multi-scale features to obtain a super-resolved light-field. The deep-learning-based network framework contains three major modules: a multi-scale feature extraction module, a global feature fusion module, and an up-sampling module. The design ideas of the modules are as follows.
a) Multi-scale feature extraction module: To explore the complex texture information in the 4D light-field space, the feature extraction module uses ResASPP blocks to enlarge the receptive field and extract multi-scale features. The low-resolution light-field sub-aperture images are first sent to a Conv block and a Res block for low-level feature extraction; a ResASPP block and a Res block are then alternated twice to learn multi-scale features that accumulate high-frequency information in the 4D light-field.
b) Global feature fusion module: Light-field images contain not only spatial information but also angular information, which reflects the inherent structure of the 4D light-field. The global feature fusion module is proposed to geometrically reconstruct the super-resolved light-field by exploiting these angular cues. Note that the feature maps of all the sub-aperture images from the upstream module are first stacked along the channel dimension of the network and then sent to this module for high-level feature extraction.
c) Up-sampling module: After the global features of the 4D light-field structure have been learned, the high-level feature maps are sent to the up-sampling module for light-field super-resolution. This module uses sub-pixel convolution (a pixel shuffle operation) to obtain 2× spatial super-resolution, after which the feature maps are sent to a conventional convolution layer that performs feature fusion and finally outputs an array of super-resolved light-field sub-aperture images.
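The two tensor rearrangements named in b) and c) can be sketched in a few lines of NumPy. This is only an illustration of the shapes involved, not the network itself: the function names `stack_views` and `pixel_shuffle`, the channel-first layout, and the channel ordering (which follows the common sub-pixel convolution convention) are assumptions, and the learned convolution layers between the two steps are omitted.

```python
import numpy as np


def stack_views(view_features):
    """Stack per-view feature maps, each of shape (C, H, W), along the channel axis.

    Hypothetical helper: mirrors the channel-wise stacking described in b).
    """
    return np.concatenate(view_features, axis=0)


def pixel_shuffle(features, scale=2):
    """Rearrange a (C * scale**2, H, W) array into (C, H * scale, W * scale).

    Each group of scale**2 channels supplies the scale x scale block of
    output pixels that replaces one low-resolution pixel.
    """
    channels, height, width = features.shape
    if channels % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale**2")
    out_channels = channels // (scale * scale)
    # Split the channel axis into (out_channels, row offset, column offset).
    x = features.reshape(out_channels, scale, scale, height, width)
    # Move each offset next to its spatial axis: (C, H, row offset, W, column offset).
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge offsets into the spatial axes to obtain the up-sampled map.
    return x.reshape(out_channels, height * scale, width * scale)
```

For example, a hypothetical 3 × 3 array of views with 8 feature channels each stacks into a 72-channel array, and shuffling a 4-channel map of size H × W with `scale=2` yields a single map of size 2H × 2W.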
The proposed network was applied to a synthetic light-field dataset and a real-world light-field dataset for light-field image super-resolution. The experimental results showed that this method outperforms other state-of-the-art methods in both visual and numerical evaluations. In addition, the super-resolved light-field images were applied to depth estimation, and the results showed that light-field spatial super-resolution improves the parallax calculation, especially in occlusion and edge regions.
Table 1. Performance comparison of different image super-resolution algorithms on synthetic data

| Method | Buddha PSNR/dB | Buddha SSIM | Mona PSNR/dB | Mona SSIM | Papillon PSNR/dB | Papillon SSIM |
|---|---|---|---|---|---|---|
| Bicubic | 33.0865 | 0.9208 | 32.6579 | 0.9301 | 33.4031 | 0.9365 |
| GBSR | 35.7463 | 0.9568 | 38.1479 | 0.9769 | 38.7855 | 0.9802 |
| FALSR | 34.9493 | 0.9373 | 34.8104 | 0.9412 | 34.7569 | 0.9504 |
| ResLF | 35.4988 | 0.9689 | 34.3314 | 0.9614 | 35.1983 | 0.9754 |
| Proposed | 39.8095 | 0.9807 | 41.5483 | 0.9865 | 41.0616 | 0.9852 |

Table 2. Performance comparison of different image super-resolution algorithms on real-world data

| Method | Fence PSNR/dB | Fence SSIM | Cars PSNR/dB | Cars SSIM | Flowers PSNR/dB | Flowers SSIM |
|---|---|---|---|---|---|---|
| Bicubic | 30.8720 | 0.9541 | 31.3657 | 0.9401 | 30.2619 | 0.9194 |
| FALSR | 35.1476 | 0.9639 | 31.5821 | 0.9422 | 31.5795 | 0.9192 |
| ResLF | 34.9172 | 0.9844 | 31.6191 | 0.9722 | 31.3748 | 0.9538 |
| Proposed | 31.5522 | 0.9816 | 35.2929 | 0.9800 | 40.6967 | 0.9874 |
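For readers unfamiliar with the PSNR metric reported in Tables 1 and 2, a minimal NumPy sketch of its standard definition follows. The evaluation protocol used for the tables (color channel, peak value, how views are averaged) is not stated here, so the `peak=1.0` default and the hypothetical helper `mean_view_psnr`, which averages over sub-aperture views, are assumptions for illustration only.

```python
import numpy as np


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    `peak` is the maximum possible pixel value (assumed 1.0 for images
    normalized to [0, 1]; use 255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if reference.shape != estimate.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))


def mean_view_psnr(reference_views, estimated_views, peak=1.0):
    """Average PSNR over sub-aperture views stacked along the first axis.

    Hypothetical helper: one plausible way to summarize a light-field
    with a single number; not necessarily the protocol behind the tables.
    """
    scores = [psnr(r, e, peak) for r, e in zip(reference_views, estimated_views)]
    return float(np.mean(scores))
```

Because PSNR is logarithmic in the mean squared error, a gain of about 3 dB corresponds to roughly halving the mean squared error.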