Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91328
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 丁建均 | zh_TW |
dc.contributor.advisor | Jian-Jiun Ding | en |
dc.contributor.author | 黃煜堯 | zh_TW |
dc.contributor.author | Yu-Yao Huang | en |
dc.date.accessioned | 2023-12-20T16:30:48Z | - |
dc.date.available | 2023-12-21 | - |
dc.date.copyright | 2023-12-20 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-11-29 | - |
dc.identifier.citation | [1] H. Zhou, X. Xie, J.-H. Lai, Z. Chen and L. Yang, "Interactive Two-Stream Decoder for Accurate and Fast Saliency Detection," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 9138-9147, doi: 10.1109/CVPR42600.2020.00916.
[2] J. Wei, S. Wang, Z. Wu, C. Su, Q. Huang and Q. Tian, "Label Decoupling Framework for Salient Object Detection," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 13022-13031, doi: 10.1109/CVPR42600.2020.01304.
[3] Y. Wang, R. Wang, X. Fan, T. Wang and X. He, "Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 10031-10040.
[4] D.-P. Fan, M.-M. Cheng, Y. Liu, T. Li and A. Borji, "Structure-Measure: A New Way to Evaluate Foreground Maps," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017, pp. 4558-4567, doi: 10.1109/ICCV.2017.487.
[5] R. Achanta, S. Hemami, F. Estrada and S. Susstrunk, "Frequency-tuned salient region detection," 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1597-1604, doi: 10.1109/CVPR.2009.5206596.
[6] M.-M. Cheng, G.-X. Zhang, N. J. Mitra, X. Huang and S.-M. Hu, "Global contrast based salient region detection," CVPR 2011, 2011, pp. 409-416, doi: 10.1109/CVPR.2011.5995344.
[7] H. Jiang, J. Wang, Z. Yuan, T. Liu and N. Zheng, "Automatic salient object segmentation based on context and shape prior," in BMVC, 2011.
[8] D. Batra, A. Kowdle, D. Parikh, J. Luo and T. Chen, "iCoseg: Interactive co-segmentation with intelligent scribble guidance," 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 3169-3176, doi: 10.1109/CVPR.2010.5540080.
[9] R. Zhao, W. Ouyang, H. Li and X. Wang, "Saliency detection by multi-context deep learning," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1265-1274, doi: 10.1109/CVPR.2015.7298731.
[10] N. Liu and J. Han, "DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 678-686, doi: 10.1109/CVPR.2016.80.
[11] J. Long, E. Shelhamer and T. Darrell, "Fully convolutional networks for semantic segmentation," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431-3440, doi: 10.1109/CVPR.2015.7298965.
[12] X. Qin, Z. Zhang, C. Huang, M. Dehghan, O. Zaiane and M. Jagersand, "U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection," Pattern Recognition, 2020, 107404, arXiv:2005.09007.
[13] Y. K. Yun and T. Tsubono, "Recursive Contour Saliency Blending Network for Accurate Salient Object Detection," 2021, arXiv:2105.13865.
[14] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit and N. Houlsby, "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale," 2020, arXiv:2010.11929.
[15] N. Liu, N. Zhang, K. Wan, L. Shao and J. Han, "Visual Saliency Transformer," 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 2021, pp. 4702-4712, doi: 10.1109/ICCV48922.2021.00468.
[16] K. He, X. Zhang, S. Ren and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[17] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255.
[18] L. Wang, H. Lu, Y. Wang, M. Feng, D. Wang, B. Yin and X. Ruan, "Learning to detect salient objects with image-level supervision," in CVPR, 2017, pp. 3796-3805.
[19] Q. Yan, L. Xu, J. Shi and J. Jia, "Hierarchical saliency detection," in CVPR, 2013, pp. 1155-1162.
[20] C. Yang, L. Zhang, H. Lu, X. Ruan and M.-H. Yang, "Saliency detection via graph-based manifold ranking," in CVPR, 2013, pp. 3166-3173.
[21] Y. Li, X. Hou, C. Koch, J. M. Rehg and A. L. Yuille, "The secrets of salient object segmentation," in CVPR, 2014, pp. 280-287.
[22] D.-P. Fan, C. Gong, Y. Cao, B. Ren, M.-M. Cheng and A. Borji, "Enhanced-alignment measure for binary foreground map evaluation," in IJCAI, 2018, pp. 698-704.
[23] J. Zhang, D.-P. Fan, Y. Dai, S. Anwar, F. S. Saleh, T. Zhang and N. Barnes, "UC-Net: Uncertainty inspired RGB-D saliency detection via conditional variational autoencoders," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020.
[24] J. Zhao, J.-J. Liu, D.-P. Fan, Y. Cao, J. Yang and M.-M. Cheng, "EGNet: Edge Guidance Network for Salient Object Detection," 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 2019, pp. 8778-8787, doi: 10.1109/ICCV.2019.00887.
[25] N. Liu, J. Han and M.-H. Yang, "PiCANet: Learning Pixel-Wise Contextual Attention for Saliency Detection," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 3089-3098, doi: 10.1109/CVPR.2018.00326.
[26] R. Liu, L. Mi and Z. Chen, "AFNet: Adaptive Fusion Network for Remote Sensing Image Semantic Segmentation," IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 9, pp. 7871-7886, Sept. 2021, doi: 10.1109/TGRS.2020.3034123.
[27] X. Qin, Z. Zhang, C. Huang, C. Gao, M. Dehghan and M. Jagersand, "BASNet: Boundary-Aware Salient Object Detection," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 7471-7481, doi: 10.1109/CVPR.2019.00766.
[28] Z. Wu, L. Su and Q. Huang, "Cascaded Partial Decoder for Fast and Accurate Salient Object Detection," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 3902-3911, doi: 10.1109/CVPR.2019.00403.
[29] Z. Wu, L. Su and Q. Huang, "Cascaded Partial Decoder for Fast and Accurate Salient Object Detection," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 3902-3911, doi: 10.1109/CVPR.2019.00403.
[30] Z. Wu, L. Su and Q. Huang, "Cascaded Partial Decoder for Fast and Accurate Salient Object Detection," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 3902-3911, doi: 10.1109/CVPR.2019.00403.
[31] H. X. Pham, I. Bozcan, A. Sarabakha, S. Haddadin and E. Kayacan, "GateNet: An Efficient Deep Neural Network Architecture for Gate Perception Using Fish-Eye Camera in Autonomous Drone Racing," 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 2021, pp. 4176-4183, doi: 10.1109/IROS51168.2021.9636207.
[32] J. M. Topple and J. A. Fawcett, "MiNet: Efficient Deep Learning Automatic Target Recognition for Small Autonomous Vehicles," IEEE Geoscience and Remote Sensing Letters, vol. 18, no. 6, pp. 1014-1018, June 2021, doi: 10.1109/LGRS.2020.2993652.
[33] T. Chen, L. Lin, L. Liu et al., "DISC: Deep image saliency computing via progressive representation learning," IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 6, pp. 1135-1149, 2016.
[34] C. Xie, C. Xia, M. Ma, Z. Zhao, X. Chen and J. Li, "Pyramid Grafting Network for One-Stage High Resolution Saliency Detection," 2022, arXiv:2204.05041. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91328 | - |
dc.description.abstract | Salient Object Detection (SOD) plays a crucial preprocessing role in various computer vision applications, including visual tracking, image captioning, image segmentation, and image recognition. This study aims to improve the accuracy of SOD by combining several techniques. Rather than supervising with saliency maps alone, we propose a different perspective for improving the interactive two-stream decoder: generating body maps and detail maps that provide rich saliency information for precise predictions. Users can fine-tune the parameter contrast for customization. In addition, we introduce a corresponding loss function that can be divided into two parts: a pixel-level loss and an object-level loss. With this two-level loss function, we can train a more accurate model and obtain more precise predictions. Finally, we compare our method with 11 of the currently most popular SOD algorithms on three datasets, evaluated with three established metrics; our method achieves the best or among the top results throughout. | zh_TW |
dc.description.abstract | Salient Object Detection (SOD) is a crucial preprocessing step in various computer vision applications, including visual tracking, image captioning, image segmentation, and image recognition. This research aims to enhance the accuracy of SOD by incorporating several techniques. Instead of relying solely on saliency maps for supervision, we propose a different perspective to improve the interactive two-stream decoder: generating body maps and detail maps that provide substantial salient information for precise predictions. Users can fine-tune the parameter contrast for customization. Additionally, we introduce a corresponding loss function that can be divided into two parts: a pixel-level loss and an object-level loss. By employing this two-level loss function, we can train a more accurate model and obtain more precise predictions. Compared with eleven state-of-the-art SOD algorithms on three datasets, evaluated with three established metrics, our method ranks best or among the best throughout. (Hedged illustrative sketches of the body/detail decomposition and the two-level loss follow the metadata record below.) | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-12-20T16:30:48Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-12-20T16:30:48Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Abstract (Chinese) i
Abstract (English) ii
Table of Contents iii
List of Figures v
List of Tables vi
Chapter 1. Introduction 1
Chapter 2. Related Work 5
2.1 Traditional Methods 5
2.2 Neural Networks 9
2.2.1 Convolutional Neural Networks 10
2.2.2 Fully Convolutional Networks 11
2.2.3 Transformers 17
2.2.4 Others 23
Chapter 3. Method 1: Saliency Detection Using Detail Map and Hybrid Loss Function 24
3.1 Interactive Two-Stream Decoder 25
3.2 Body Maps and Detail Maps 26
3.3 Loss Function 28
3.4 Experiments 30
3.4.1 Datasets and Evaluation Metrics 30
3.4.2 Implementation Details 31
3.4.3 Main Results 32
Chapter 4. Method 2: Further Improvements 40
4.1 Object-level Loss Function Improvement 40
4.2 Weighted Maps Improvement 41
4.3 Experiments 44
4.3.1 Datasets and Evaluation Metrics 44
4.3.2 Implementation Details 44
4.3.3 Main Results 45
Chapter 5. Conclusion 53
References 54 | - |
dc.language.iso | en | - |
dc.title | 利用細節圖和混合損失函數的顯著性偵測 | zh_TW |
dc.title | Saliency Detection Using Detail Map and Hybrid Loss Function | en |
dc.type | Thesis | - |
dc.date.schoolyear | 112-1 | - |
dc.description.degree | Master | - |
dc.contributor.oralexamcommittee | 王鵬華;余執彰;夏至賢 | zh_TW |
dc.contributor.oralexamcommittee | Peng-Hua Wang;Chih-Chang Yu;Chih-Hsien Hsia | en |
dc.subject.keyword | 顯著性物件偵測, 主體圖, 細節圖, 交互式雙流解碼器, 雙級損失函數 | zh_TW |
dc.subject.keyword | Salient Object Detection, body map, detail map, interactive two-stream decoder, two-level loss function | en |
dc.relation.page | 59 | - |
dc.identifier.doi | 10.6342/NTU202304451 | - |
dc.rights.note | Authorized (access restricted to campus) | - |
dc.date.accepted | 2023-11-30 | - |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
dc.contributor.author-dept | Graduate Institute of Communication Engineering | - |
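
The abstracts above hinge on decomposing a ground-truth saliency map into a body map and a detail map. As a rough illustration of how such a decomposition can be computed, the sketch below derives both maps from a binary mask with a distance transform, in the spirit of the label-decoupling idea of reference [2]; the distance-transform construction, the normalization, and the helper name `decompose_mask` are assumptions for illustration, not the thesis's actual procedure.

```python
# A minimal sketch, assuming a distance-transform-based decomposition of a
# binary ground-truth mask into a "body" map (object interior) and a
# "detail" map (contours). Illustrative only; not the thesis's own code.
import numpy as np
from scipy.ndimage import distance_transform_edt

def decompose_mask(mask: np.ndarray):
    """mask: (H, W) binary {0, 1} ground-truth saliency map."""
    # Distance of each foreground pixel to the nearest background pixel:
    # large near the object center, small near the contour.
    dist = distance_transform_edt(mask)
    if dist.max() > 0:
        dist = dist / dist.max()      # normalize to [0, 1]
    body = mask * dist                # emphasizes the object interior
    detail = mask - body              # keeps the contour / fine detail
    return body.astype(np.float32), detail.astype(np.float32)
```

Supervising one decoder stream with `body` and the other with `detail` is one way the two streams could receive complementary saliency information.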
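
The two-level loss described in the abstracts pairs a pixel-level term with an object-level term. The sketch below combines per-pixel binary cross-entropy with a soft IoU term as one plausible object-level choice (IoU is a common region-level loss in SOD work such as BASNet, reference [27]); the specific terms and their equal weighting are assumptions, not the thesis's definition.

```python
# A minimal PyTorch sketch of a two-level hybrid loss: pixel-level BCE plus
# an object-level soft-IoU term. The IoU choice and the 1:1 weighting are
# illustrative assumptions.
import torch
import torch.nn.functional as F

def hybrid_loss(pred_logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """pred_logits, target: (N, 1, H, W) tensors; target values in [0, 1]."""
    # Pixel-level loss: every pixel is penalized independently.
    pixel_loss = F.binary_cross_entropy_with_logits(pred_logits, target)

    # Object-level loss: soft IoU over each whole map, so the penalty
    # reflects the predicted object as a region rather than isolated pixels.
    pred = torch.sigmoid(pred_logits)
    inter = (pred * target).sum(dim=(2, 3))
    union = (pred + target - pred * target).sum(dim=(2, 3))
    object_loss = 1.0 - (inter + 1.0) / (union + 1.0)   # smoothed soft IoU

    return pixel_loss + object_loss.mean()
```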
Appears in Collections: | Graduate Institute of Communication Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-112-1.pdf (access restricted to NTU campus IPs; off-campus users should connect via the NTU VPN service) | 1.67 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.