Visual perception based rate distortion optimization method for high dynamic range video coding
-
Abstract
High dynamic range (HDR) video requires far more storage and transmission bandwidth than traditional low dynamic range (LDR) video. To address this, we propose a dynamic rate-distortion optimization algorithm based on visual perception for HDR video coding, which improves the performance of High Efficiency Video Coding (HEVC) Main 10 on HDR content. Using visual selective attention information, we assign non-uniform distortion weights to different regions and thereby refine the conventional distortion calculation. To further remove perceptual redundancy, the texture characteristics of the video content are used to adjust the Lagrange multiplier adaptively; the adjusted multiplier is then applied in the encoder to adjust the quantization parameter dynamically, balancing coded bits against perceived distortion. Experimental results show that, compared with HEVC Main 10 at the same HDR-visible difference predictor-2.2 (HDR-VDP-2.2) and PSNR_DE quality, the proposed algorithm saves 7.46% and 6.53% of the bitrate on average, with maximum savings of 18.52% and 11.49%, respectively. The proposed algorithm thus reduces the bitrate effectively while maintaining the visual quality of the reconstructed HDR video.
-
Overview
High dynamic range (HDR) video requires far more storage and transmission bandwidth than traditional low dynamic range (LDR) video. To address this, we propose a new dynamic rate-distortion optimization algorithm based on visual perception for HDR video coding. It improves the performance of High Efficiency Video Coding (HEVC) Main 10 by incorporating the visual attention and texture masking properties of HDR video content into the coding process. First, a visual saliency map is computed for the current input HDR frame. Using this visual selective attention information, we assign non-uniform distortion weights to different regions of interest and refine the conventional distortion calculation, so that the measured distortion agrees more closely with the human visual system. We also account for two further characteristics of the human visual system: it remains sensitive to distortion in flat areas even when those areas do not attract the observer's attention, and it can tolerate more distortion in areas of complex texture even within salient regions. To further remove perceptual redundancy, a bilateral filter separates the texture component of the input frame, and the texture characteristics extracted from it are used to adjust the Lagrange multiplier adaptively. The rate-distortion cost function that incorporates visual perception then replaces the original rate-distortion cost formula and is applied in the encoder to adjust the quantization parameter dynamically, balancing coded bits against distortion. Finally, the resulting perception-based rate-distortion optimization is applied throughout the coding process, including mode decision, motion estimation, and rate-distortion optimized quantization.
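The steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the overview does not give the actual weighting formulas, so the function names, the parameters `alpha`, `lo`, and `hi`, and the two weighting rules (`distortion_weight` and `lambda_scale`) are hypothetical. The sketch also assumes that larger texture strength raises the Lagrange multiplier, that saliency lies in [0, 1], and that the encoder follows the relation lambda ∝ 2^((QP − 12) / 3) commonly used in the HM reference software.

```python
import numpy as np


def bilateral_filter(img, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Brute-force bilateral filter for a 2-D float image (edge-padded)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    acc = np.zeros_like(img)
    norm = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # Spatial closeness times intensity similarity.
            weight = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2)
                            - (shifted - img) ** 2 / (2.0 * sigma_r ** 2))
            acc += weight * shifted
            norm += weight
    return acc / norm


def texture_strength(luma, block=8):
    """Mean absolute texture component (input minus base layer) per block."""
    luma = np.asarray(luma, dtype=np.float64)
    texture = np.abs(luma - bilateral_filter(luma))
    rows, cols = luma.shape[0] // block, luma.shape[1] // block
    cropped = texture[:rows * block, :cols * block]
    return cropped.reshape(rows, block, cols, block).mean(axis=(1, 3))


def lambda_scale(texture, lo=0.5, hi=2.0):
    """Hypothetical rule: scale lambda by texture relative to the frame mean."""
    return np.clip(texture / (texture.mean() + 1e-12), lo, hi)


def distortion_weight(saliency, alpha=0.5):
    """Hypothetical rule: raise the distortion weight where saliency is high."""
    return 1.0 + alpha * (saliency - saliency.mean())


def delta_qp(scale):
    """QP offset for a lambda scale, assuming lambda ~ 2 ** ((QP - 12) / 3)."""
    return 3.0 * np.log2(scale)


def perceptual_rd_cost(sse, bits, weight, scale, lam):
    """Per-block cost J = weight * D + (scale * lambda) * R."""
    return weight * sse + scale * lam * bits
```

Under these assumptions, a flat block receives a lambda scale below 1 and hence a negative QP offset (finer quantization), while a heavily textured block receives a positive offset. A real encoder would round the offset to an integer QP and clamp it to the valid range.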
The proposed algorithm keeps HDR video quality consistent with human visual perception while reducing the bitrate. Experimental results show that, compared with HEVC Main 10 at the same HDR-visible difference predictor-2.2 (HDR-VDP-2.2) and PSNR_DE quality, the proposed algorithm saves 7.46% and 6.53% of the bitrate on average, with maximum savings of 18.52% and 11.49%, respectively. The experimental results and the enlarged image regions show that the proposed algorithm preserves image detail and structural information well and codes scenes with high visual saliency and complex texture effectively. The proposed algorithm allocates coding bits more reasonably, which reduces the overall bitrate while maintaining the visual quality of the reconstructed HDR video.
-
-
Figure 8. Frame 27 of the BalloonFestival sequence. (a) Original frame with an enlarged region; (b) frame reconstructed by HM 16.9 with an enlarged region, Q = 53.7123, 5280 bits; (c) frame reconstructed by the proposed algorithm with an enlarged region, Q = 53.864, 4800 bits
Table 1. BD-rate results (%) of the proposed algorithm and the comparison algorithm

| Sequences | Adaptive PQ [18], PSNR_DE | Adaptive PQ [18], HDR-VDP-2.2 | Proposed, PSNR_DE | Proposed, HDR-VDP-2.2 |
|---|---|---|---|---|
| BalloonFestival | -3.43 | - | -3.88 | -18.52 |
| FireEater2 | -6.12 | - | -11.49 | -6.84 |
| Market3 | -7.44 | - | -9.00 | -3.16 |
| Tibul2 | -5.04 | - | -1.74 | -1.30 |
| Average | -5.51 | - | -6.53 | -7.46 |
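The averages in Table 1 can be reproduced from the per-sequence values. The short check below is a sketch that assumes the Average row is the arithmetic mean of the four sequences, rounded half up to two decimals; the names `BD_RATE` and `column_average` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-sequence BD-rate (%) copied from Table 1:
# (Adaptive PQ PSNR_DE, Proposed PSNR_DE, Proposed HDR-VDP-2.2)
BD_RATE = {
    "BalloonFestival": ("-3.43", "-3.88", "-18.52"),
    "FireEater2": ("-6.12", "-11.49", "-6.84"),
    "Market3": ("-7.44", "-9.00", "-3.16"),
    "Tibul2": ("-5.04", "-1.74", "-1.30"),
}


def column_average(col):
    """Arithmetic mean of one column, rounded half up to two decimals."""
    values = [Decimal(row[col]) for row in BD_RATE.values()]
    mean = sum(values) / len(values)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def largest_saving(col):
    """Most negative BD-rate in one column, i.e. the largest bitrate saving."""
    return min(Decimal(row[col]) for row in BD_RATE.values())
```

Calling `column_average` on the three columns gives -5.51, -6.53, and -7.46, matching the Average row, and `largest_saving` on the two Proposed columns gives -11.49 and -18.52, the maximum savings quoted in the abstract.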
References
[1] Chalmers A, Debattista K. HDR video past, present and future: a perspective[J]. Signal Processing: Image Communication, 2017, 54: 49–55. doi: 10.1016/j.image.2017.02.003
[2] Hulusic V, Debattista K, Valenzise G, et al. A model of perceived dynamic range for HDR images[J]. Signal Processing: Image Communication, 2017, 51: 26–39. doi: 10.1016/j.image.2016.11.005
[3] Lin Y T, Wang C M, Chen W S, et al. A novel data hiding algorithm for high dynamic range images[J]. IEEE Transactions on Multimedia, 2017, 19(1): 196–211. doi: 10.1109/TMM.2016.2605499
[4] Yang Y, Wang X, Liu Q, et al. A bundled-optimization model of multiview dense depth map synthesis for dynamic scene reconstruction[J]. Information Sciences, 2015, 320: 306–319. doi: 10.1016/j.ins.2014.11.014
[5] Yang Y, Liu Q, Liu H, et al. Dense depth image synthesis via energy minimization for three-dimensional video[J]. Signal Processing, 2015, 112: 199–208. doi: 10.1016/j.sigpro.2014.07.020
[6] Yang Y, Deng H P, Wu J, et al. Depth map reconstruction and rectification through coding parameters for mobile 3D video system[J]. Neurocomputing, 2015, 151: 663–673. doi: 10.1016/j.neucom.2014.04.088
[7] Liu Q, Yang Y, Ji R R, et al. Cross-view down/up-sampling method for multiview depth video coding[J]. IEEE Signal Processing Letters, 2012, 19(5): 295–298. doi: 10.1109/LSP.2012.2190060
[8] Francois E, Fogg C, He Y W, et al. High dynamic range and wide color gamut video coding in HEVC: status and potential future enhancements[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(1): 63–75. doi: 10.1109/TCSVT.2015.2461911
[9] Kerofsky L, Ye Y, He Y W. Recent developments from MPEG in HDR video compression[C]//Proceedings of 2016 IEEE International Conference on Image Processing (ICIP), 2016: 879–883.
[10] Luthra A, Francois E, Husak W. Call for evidence (CfE) for HDR and WCG video coding[R]. ISO/IEC JTC1/SC29/WG11 MPEG2015/N15083. Geneva, Switzerland: ISO, 2015.
[11] Koz A, Dufaux F. Methods for improving the tone mapping for backward compatible high dynamic range image and video coding[J]. Signal Processing: Image Communication, 2014, 29(2): 274–292. doi: 10.1016/j.image.2013.08.017
[12] Mai Z C, Mansour H, Mantiuk R, et al. Optimizing a tone curve for backward-compatible high dynamic range image and video compression[J]. IEEE Transactions on Image Processing, 2011, 20(6): 1558–1571. doi: 10.1109/TIP.2010.2095866
[13] Zhang Y, Reinhard E, Bull D. Perception-based high dynamic range video compression with optimal bit-depth transformation[C]//Proceedings of the 2011 18th IEEE International Conference on Image Processing (ICIP), 2011: 1321–1324.
[14] Motra A, Thoma H. An adaptive Logluv transform for high dynamic range video compression[C]//Proceedings of the 17th IEEE International Conference on Image Processing (ICIP), 2010: 2061–2064.
[15] Zhang Y, Naccari M, Agrafiotis D, et al. High dynamic range video compression exploiting luminance masking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(5): 950–964. doi: 10.1109/TCSVT.2015.2426552
[16] Miller S, Nezamabadi M, Daly S. Perceptual signal coding for more efficient usage of bit codes[J]. SMPTE Motion Imaging Journal, 2013, 122(4): 52–59. doi: 10.5594/j18290
[17] Barten P G J. Formula for the contrast sensitivity of the human eye[J]. Proceedings of SPIE, 2004, 5294: 231–238. https://www.deepdyve.com/lp/spie/formula-for-the-contrast-sensitivity-of-the-human-eye-XEewYIfnX7
[18] Yu S T, Jung C, Ke P. Adaptive PQ: adaptive perceptual quantizer for HEVC main 10 profile-based HDR video coding[C]//Proceedings of 2016 Visual Communications and Image Processing (VCIP), 2016: 1–4.
[19] Zhang Y, Agrafiotis D, Naccari M, et al. Visual masking phenomena with high dynamic range content[C]//Proceedings of the 20th IEEE International Conference on Image Processing (ICIP), 2013: 2284–2288.
[20] Jung C, Lin Q Z, Yu S T. HEVC encoder optimization for HDR video coding based on perceptual block merging[C]//Proceedings of 2016 Visual Communications and Image Processing (VCIP), 2016: 1–4.
[21] Banitalebi-Dehkordi A, Dong Y Y, Pourazad M T, et al. A learning-based visual saliency fusion model for high dynamic range video (LBVS-HDR)[C]//Proceedings of the 2015 23rd European Signal Processing Conference, 2015: 1541–1545.
[22] Sullivan G J, Ohm J R, Han W J, et al. Overview of the high efficiency video coding (HEVC) standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1649–1668. doi: 10.1109/TCSVT.2012.2221191
[23] Zhang H X, Lin S W, Xue P. Improved estimation for just-noticeable visual distortion[J]. Signal Processing, 2005, 85(4): 795–808. doi: 10.1016/j.sigpro.2004.12.002
[24] Durand F, Dorsey J. Fast bilateral filtering for the display of high-dynamic-range images[J]. ACM Transactions on Graphics (TOG), 2002, 21(3): 257–266. http://people.csail.mit.edu/fredo/PUBLI/Siggraph2002/DurandBilateral.pdf
[25] Narwaria M, Mantiuk R K, Da Silva M P, et al. HDR-VDP-2.2: a calibrated method for objective quality prediction of high-dynamic range and standard images[J]. Journal of Electronic Imaging, 2015, 24(1): 010501. doi: 10.1117/1.JEI.24.1.010501
[26] Hanhart P, Bernardo M V, Pereira M, et al. Benchmarking of objective quality metrics for HDR image quality assessment[J]. EURASIP Journal on Image and Video Processing, 2015, 2015: 39. doi: 10.1186/s13640-015-0091-4
[27] Azimi M, Banitalebi A, Dong Y, et al. A survey on the performance of the existing full reference HDR video quality metrics: a new HDR video dataset for quality evaluation purposes[C]//Int'l Conf. on Multimedia Signal Processing, 2014.