NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101186
Full metadata record
dc.contributor.advisor: 莊永裕 [zh_TW]
dc.contributor.advisor: Yung-Yu Chuang [en]
dc.contributor.author: 黃郁夫 [zh_TW]
dc.contributor.author: Yu-Fu Huang [en]
dc.date.accessioned: 2025-12-31T16:15:02Z
dc.date.available: 2026-01-01
dc.date.copyright: 2025-12-31
dc.date.issued: 2025
dc.date.submitted: 2025-10-20
dc.identifier.citation:
[1] J. Cai, S. Gu, and L. Zhang. Learning a deep single image contrast enhancer from multi-exposure images. IEEE Transactions on Image Processing, 27(4):2049–2062, 2018.
[2] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587, 2017.
[3] D. Han, L. Li, X. Guo, and J. Ma. Multi-exposure image fusion via deep perceptual enhancement. Information Fusion, 79:248–262, 2022.
[4] Y. Han, Y. Cai, Y. Cao, and X. Xu. A new image fusion performance metric based on visual information fidelity. Information Fusion, 14(2):127–135, 2013.
[5] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer, 2016.
[6] S. Li and X. Kang. Image fusion with guided filtering. IEEE Transactions on Image Processing, 22(7):2864–2875, 2013.
[7] H. Liu, J. Ma, H. Xu, J. Liu, and X. Zhang. HoLoCo: Holistic and local contrastive learning network for multi-exposure image fusion. Information Fusion, 95:237–249, 2023.
[8] Z. Liu, J. Liu, G. Wu, Z. Chen, X. Fan, and R. Liu. Searching a compact architecture for robust multi-exposure image fusion. IEEE Transactions on Circuits and Systems for Video Technology, 34(7):6224–6237, 2024.
[9] K. Ma, H. Li, H. Yong, Z. Wang, D. Meng, and L. Zhang. Robust multi-exposure image fusion: A structural patch decomposition approach. IEEE Transactions on Image Processing, 26(5):2519–2532, 2017.
[10] T. Mertens, J. Kautz, and F. Van Reeth. Exposure fusion. In Pacific Conference on Computer Graphics and Applications, pages 382–390. IEEE, 2007.
[11] K. R. Prabhakar, V. S. Srikar, and R. V. Babu. DeepFuse: A deep unsupervised approach for exposure fusion with extreme exposure image pairs. In IEEE International Conference on Computer Vision (ICCV), pages 4714–4722, 2017.
[12] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[13] X. Tan, H. Chen, R. Zhang, Q. Wang, Y. Kan, J. Zheng, Y. Jin, and E. Chen. Deep multi-exposure image fusion for dynamic scenes. IEEE Transactions on Image Processing, 32:5310–5325, 2023.
[14] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[15] Z. Wang, E. P. Simoncelli, and A. C. Bovik. Multiscale structural similarity for image quality assessment. In Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, volume 2, pages 1398–1402, 2003.
[16] G. Wu, H. Fu, J. Liu, L. Ma, X. Fan, and R. Liu. Hybrid-supervised dual-search: Leveraging automatic learning for loss-free multi-exposure image fusion. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 5985–5993, 2024.
[17] Q. Yan, D. Gong, Q. Shi, A. van den Hengel, C. Shen, I. Reid, and Y. Zhang. Attention-guided network for ghost-free high dynamic range imaging. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1751–1760, 2019.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101186
dc.description.abstract:
多重曝光影像融合(MEF)面臨一項根本性的挑戰:現有使用 UNet 或注意力機制的方法必須直接處理亮度差距巨大的極端曝光影像對,這限制了融合品質,特別是在局部區域。本文提出一項新的觀察:當比較的曝光影像之間的相對差異較小時,注意力機制能計算出更可靠的權重。

我們的雙階段框架採用「由全域到局部」的漸進式策略。第一階段(UNet)在網路瓶頸處進行全域特徵融合,將極端曝光的深層特徵合併,以生成亮度均衡的中間曝光影像。此中間影像具有雙重功能:(1) 減少曝光差距以改善注意力權重計算,(2) 作為距離兩個極端曝光等距的參考錨點。第二階段(AMNet)利用該中間影像進行基於注意力的局部細化,在較小的曝光差異下,能更精確地進行特徵比對與選擇性細節恢復。

此架構的分工源於對 MEF 誤差的分析:大多數偽影來自局部曝光失敗,而非全域亮度不平衡。因此,UNet 負責全域協調,而注意力模組則專注於局部增強。我們的解耦式訓練確保每個階段都能針對其特定目標進行最佳化。實驗驗證了我們的理論:中間曝光顯著提升了注意力權重的準確性,在維持全域一致性的同時,於困難區域中達成更優秀的細節保留。
[zh_TW]
dc.description.abstract:
Multi-exposure image fusion (MEF) faces a fundamental challenge: existing methods built on UNet or attention mechanisms must directly process extreme exposure pairs with substantial luminance gaps, which limits fusion quality, particularly in local regions. We present a novel insight: attention mechanisms compute more reliable weights when the exposures being compared differ less.

Our two-phase framework implements a global-to-local progressive strategy. Phase one (UNet) performs global feature fusion at the network bottleneck, merging deep representations from the extreme exposures to generate an intermediate exposure with balanced luminance. This intermediate image serves two purposes: (1) it reduces the exposure gap, improving attention computation, and (2) it provides a reference anchor equidistant from both extremes. Phase two (AMNet) exploits this intermediate for attention-based local refinement, where the smaller exposure differences enable precise feature comparison and selective detail recovery.

The architectural division follows from an analysis of MEF errors: most artifacts appear as local exposure failures rather than global imbalance. The UNet therefore handles global harmonization while attention targets local enhancement, and our decoupled training ensures that each phase optimizes its own objective. Experiments validate this analysis: the intermediate exposure significantly improves attention weight accuracy, yielding superior detail preservation in challenging regions while maintaining global consistency.
[en]
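To make the two-phase data flow concrete, here is a minimal PyTorch-style sketch of the pipeline the abstract describes. The names UNet and AMNet come from the abstract itself, but every layer, channel count, and the per-pixel softmax fusion below are placeholder assumptions for illustration, not the thesis's actual architecture.

```python
# Minimal sketch of the two-phase pipeline described in the abstract.
# "UNet" and "AMNet" are named in the abstract; their internals here are
# illustrative placeholder assumptions, not the thesis's implementation.
import torch
import torch.nn as nn


class UNet(nn.Module):
    """Phase one: fuse the extreme exposures into an intermediate exposure."""

    def __init__(self):
        super().__init__()
        # Stand-in for an encoder-decoder that merges deep features from
        # both exposures at the bottleneck.
        self.body = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, under, over):
        return self.body(torch.cat([under, over], dim=1))


class AMNet(nn.Module):
    """Phase two: attention-based local refinement around the intermediate."""

    def __init__(self):
        super().__init__()
        # Predicts one weight map per input image; the intermediate sits a
        # smaller luminance gap away from each extreme, which is what the
        # abstract argues makes these weights more reliable.
        self.to_weights = nn.Sequential(
            nn.Conv2d(9, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 1),
        )

    def forward(self, under, mid, over):
        x = torch.cat([under, mid, over], dim=1)
        w = torch.softmax(self.to_weights(x), dim=1)    # (B, 3, H, W)
        stack = torch.stack([under, mid, over], dim=1)  # (B, 3, C, H, W)
        return (w.unsqueeze(2) * stack).sum(dim=1)      # per-pixel blend


under = torch.rand(1, 3, 256, 256)  # under-exposed LDR input
over = torch.rand(1, 3, 256, 256)   # over-exposed LDR input
mid = UNet()(under, over)           # phase one: intermediate exposure (anchor)
fused = AMNet()(under, mid, over)   # phase two: attention-refined fusion
```

The ordering is the point of the sketch: AMNet compares each extreme against the intermediate rather than against the other extreme, so every comparison spans a reduced luminance gap.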
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-12-31T16:15:02Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2025-12-31T16:15:02Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee i
Acknowledgements iii
摘要 v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
Denotation xvii
Chapter 1 Introduction 1
1.1 HDR and LDR Fundamentals 1
1.2 HDR vs. MEF 1
1.3 Evolution of MEF Methods 2
1.4 Contributions 3
Chapter 2 Related Work 5
2.1 Traditional Methods 5
2.1.1 Deep Learning Methods 5
2.1.2 Automated Architecture and Loss Design 6
2.1.3 Our Approach 6
Chapter 3 Methodology 9
3.1 Overview 9
3.2 Phase One: UNet for Intermediate Exposure Generation 10
3.2.1 Architecture Design 10
3.2.2 Network Components 11
3.2.3 Training Objective 12
3.3 Phase Two: AMNet for Attention-based Refinement 13
3.3.1 Architecture Design 13
3.3.2 Attention-guided Fusion 14
3.3.3 Training Objective 15
3.4 Loss Functions 15
3.4.1 L1 Loss 15
3.4.2 VGG Perceptual Loss 16
3.4.3 SSIM Loss 16
3.4.4 Loss Weighting 17
Chapter 4 Experiments 19
4.1 Implementation Details 19
4.1.1 Network Configuration 19
4.1.2 Training Details 19
4.2 Experimental Setup 20
4.2.1 Dataset 20
4.2.2 Evaluation Metrics 21
4.3 Comparison with State-of-the-Art Methods 21
4.3.1 Quantitative Results 22
4.3.2 Qualitative Results 22
4.3.3 Ablation Study 24
Chapter 5 Conclusion 27
References 29
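Section 3.4 of the table of contents names three loss terms (L1, VGG perceptual, SSIM) plus a weighting scheme. As a hedged illustration of how such an objective is typically assembled, the sketch below combines the three terms; the weight values, the VGG-16 layer cut-off, and the use of the third-party pytorch_msssim package are assumptions, not the thesis's settings.

```python
# Sketch of a weighted L1 + VGG-perceptual + SSIM objective, matching the
# loss terms named in Section 3.4. Weights, the VGG-16 slice, and the
# pytorch_msssim dependency are illustrative assumptions.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16
from pytorch_msssim import ssim  # any differentiable SSIM would do

# Frozen ImageNet VGG-16 features up to relu3_3 (slice index assumed).
vgg_features = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)


def fusion_loss(pred, target, w_l1=1.0, w_vgg=0.1, w_ssim=1.0):
    """Weighted sum of the three terms; VGG input normalization omitted."""
    l1 = F.l1_loss(pred, target)
    perceptual = F.l1_loss(vgg_features(pred), vgg_features(target))
    structural = 1.0 - ssim(pred, target, data_range=1.0)
    return w_l1 * l1 + w_vgg * perceptual + w_ssim * structural


pred = torch.rand(1, 3, 256, 256, requires_grad=True)  # network output
target = torch.rand(1, 3, 256, 256)                    # reference fusion
fusion_loss(pred, target).backward()                   # gradients flow to pred
```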
dc.language.iso: en
dc.subject: 電腦視覺
dc.subject: 多重曝光合成
dc.subject: 高動態範圍影像
dc.subject: Computer Vision
dc.subject: Multiple Exposure Fusion
dc.subject: HDR
dc.title: 兩階段漸進式多重曝光融合:基於中間曝光生成 [zh_TW]
dc.title: Two-Phase Progressive Multiple Exposure Fusion via Intermediate Exposure Generation [en]
dc.type: Thesis
dc.date.schoolyear: 114-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 吳賦哲; 葉正聖 [zh_TW]
dc.contributor.oralexamcommittee: Fu-Che Wu; Jeng-Sheng Yeh [en]
dc.subject.keyword: 電腦視覺, 多重曝光合成, 高動態範圍影像 [zh_TW]
dc.subject.keyword: Computer Vision, Multiple Exposure Fusion, HDR [en]
dc.relation.page: 31
dc.identifier.doi: 10.6342/NTU202504549
dc.rights.note: Authorization granted (publicly available worldwide)
dc.date.accepted: 2025-10-21
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊工程學系 (Department of Computer Science and Information Engineering)
dc.date.embargo-lift: 2026-01-01
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-114-1.pdf   Size: 13.42 MB   Format: Adobe PDF


Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
