Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89663

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李綱 | zh_TW |
| dc.contributor.advisor | Kang Li | en |
| dc.contributor.author | 柯志澔 | zh_TW |
| dc.contributor.author | Zhi-Hao Ke | en |
| dc.date.accessioned | 2023-09-15T16:08:44Z | - |
| dc.date.available | 2023-09-16 | - |
| dc.date.copyright | 2023-09-15 | - |
| dc.date.issued | 2022 | - |
| dc.date.submitted | 2002-01-01 | - |
| dc.identifier.citation | [1] M. Yang, K. Yu, C. Zhang, Z. Li, and K. Yang, "DenseASPP for semantic segmentation in street scenes," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 3684–3692.
[2] H. Wang, Y. Sun, and M. Liu, "Self-supervised drivable area and road anomaly segmentation using RGB-D data for robotic wheelchairs," IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 4386–4393, Oct. 2019, doi: 10.1109/LRA.2019.2932874.
[3] J. Crespo, J. C. Castillo, O. M. Mozos, and R. Barber, "Semantic information for robot navigation: A survey," Applied Sciences, vol. 10, p. 497, 2020.
[4] 10 Tampa Bay, "Tesla driver, passenger killed after crashing into back of semi-truck off I-75," Jul. 2022.
[5] Q. M. Rahman, P. Corke, and F. Dayoub, "Run-time monitoring of machine learning for robotic perception: A survey of emerging trends," IEEE Access, vol. 9, pp. 20067–20075, 2021.
[6] D. Hendrycks and K. Gimpel, "A baseline for detecting misclassified and out-of-distribution examples in neural networks," in Int. Conf. Learning Representations (ICLR), 2017.
[7] Y. Geifman and R. El-Yaniv, "Selective classification for deep neural networks," arXiv preprint arXiv:1705.08500, 2017.
[8] Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. Dillon, B. Lakshminarayanan, and J. Snoek, "Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift," in Advances in Neural Information Processing Systems, 2019, pp. 13991–14002.
[9] C. Corbière, N. Thome, A. Bar-Hen, M. Cord, and P. Pérez, "Addressing failure prediction by learning model confidence," in Advances in Neural Information Processing Systems, 2019.
[10] J. S. Denker and Y. LeCun, "Transforming neural-net output levels to probability distributions," in Advances in Neural Information Processing Systems, 1991.
[11] A. Graves, "Practical variational inference for neural networks," in Advances in Neural Information Processing Systems, 2011, pp. 2348–2356.
[12] Y. Gal and Z. Ghahramani, "Bayesian convolutional neural networks with Bernoulli approximate variational inference," arXiv preprint arXiv:1506.02158, 2015.
[13] Y. Gal and Z. Ghahramani, "Dropout as a Bayesian approximation: Representing model uncertainty in deep learning," in Proc. 33rd Int. Conf. Machine Learning (ICML), 2016, pp. 1651–1660.
[14] A. Kendall, V. Badrinarayanan, and R. Cipolla, "Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding," arXiv preprint arXiv:1511.02680, 2015.
[15] S. Isobe and S. Arai, "Deep convolutional encoder-decoder network with model uncertainty for semantic segmentation," in IEEE Int. Conf. INnovations in Intelligent SysTems and Applications (INISTA), 2017.
[16] S. Isobe and S. Arai, "Inference with model uncertainty on indoor scene for semantic segmentation," in IEEE Global Conf. Signal and Information Processing (GlobalSIP), 2017.
[17] C. Baur, B. Wiestler, S. Albarqouni, and N. Navab, "Deep autoencoding models for unsupervised anomaly segmentation in brain MR images," in MICCAI Brainlesion Workshop, 2018.
[18] M. Haselmann, D. P. Gruber, and P. Tabatabai, "Anomaly detection using deep learning based image completion," in IEEE Int. Conf. Machine Learning and Applications (ICMLA), 2018.
[19] C. Richter and N. Roy, "Safe visual navigation via deep learning and novelty detection," in Robotics: Science and Systems XIII, Cambridge, MA, USA, Jul. 2017.
[20] K. Lis, K. Nakka, P. Fua, and M. Salzmann, "Detecting the unexpected via image resynthesis," in Proc. IEEE Int. Conf. Computer Vision (ICCV), 2019, pp. 2152–2161.
[21] Y. Xia, Y. Zhang, F. Liu, W. Shen, and A. Yuille, "Synthesize then compare: Detecting failures and anomalies for semantic segmentation," arXiv preprint arXiv:2003.08440, 2020.
[22] 林宗郁, "Real-time semantic segmentation fault detection using generative adversarial networks and selective classification" (in Chinese), Master's thesis, Graduate Institute of Mechanical Engineering, National Taiwan University, 2021.
[23] P. Zhang, J. Wang, A. Farhadi, M. Hebert, and D. Parikh, "Predicting failures of vision systems," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2014, pp. 3566–3573.
[24] S. Daftry, S. Zeng, J. A. Bagnell, and M. Hebert, "Introspective perception: Learning to predict failures in vision systems," in IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS), 2016, pp. 1743–1750.
[25] C. B. Kuhn, M. Hofbauer, S. Lee, G. Petrovic, and E. Steinbach, "Introspective failure prediction for semantic image segmentation," in Proc. IEEE 23rd Int. Conf. Intell. Transp. Syst. (ITSC), 2020, pp. 1–6.
[26] E. Romera et al., "ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation," IEEE Transactions on Intelligent Transportation Systems, pp. 263–272, 2018.
[27] Y. Wang et al., "LEDNet: A lightweight encoder-decoder network for real-time semantic segmentation," in IEEE Int. Conf. Image Processing (ICIP), 2019, pp. 1860–1864.
[28] G. Li et al., "DABNet: Depth-wise asymmetric bottleneck for real-time semantic segmentation," in British Machine Vision Conference (BMVC), 2019.
[29] M. Oršić et al., "In defense of pre-trained ImageNet architectures for real-time semantic segmentation of road-driving images," in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2019, pp. 12607–12616.
[30] C. Yu et al., "BiSeNet: Bilateral segmentation network for real-time semantic segmentation," in European Conf. Computer Vision (ECCV), 2018, pp. 325–341.
[31] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, pp. 2481–2495, 2017.
[32] A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello, "ENet: A deep neural network architecture for real-time semantic segmentation," arXiv preprint arXiv:1606.02147, 2016.
[33] C. Buciluǎ, R. Caruana, and A. Niculescu-Mizil, "Model compression," in Proc. 12th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, 2006, pp. 535–541.
[34] G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network," in Proc. Int. Conf. Neural Inf. Process. Syst., 2014, pp. 1–9.
[35] S. Zagoruyko and N. Komodakis, "Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer," in Proc. 5th Int. Conf. Learning Representations (ICLR), 2017, pp. 1–13.
[36] X. Wang, R. Zhang, Y. Sun, and J. Qi, "KDGAN: Knowledge distillation with generative adversarial networks," in Proc. 32nd Int. Conf. Neural Inf. Process. Syst., 2018, pp. 775–786.
[37] S. I. Mirzadeh, M. Farajtabar, A. Li, N. Levine, A. Matsukawa, and H. Ghasemzadeh, "Improved knowledge distillation via teacher assistant," in Proc. AAAI Conf. Artificial Intelligence, 2020, pp. 5191–5198.
[38] Y. Hou, Z. Ma, C. Liu, and C. C. Loy, "Learning lightweight lane detection CNNs by self attention distillation," in Proc. IEEE Int. Conf. Computer Vision (ICCV), 2019, pp. 1013–1021.
[39] S. An, Q. Liao, Z. Lu, and J.-H. Xue, "Efficient semantic segmentation via self-attention and self-distillation," IEEE Transactions on Intelligent Transportation Systems, doi: 10.1109/TITS.2021.3139001.
[40] L. Zhang, C. Bao, and K. Ma, "Self-distillation: Towards efficient and compact neural networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 8, pp. 4388–4403, Aug. 2022, doi: 10.1109/TPAMI.2021.3067100.
[41] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2015.
[42] D. Seichter, M. Köhler, B. Lewandowski, T. Wengefeld, and H.-M. Gross, "Efficient RGB-D semantic segmentation for indoor scene analysis," in IEEE Int. Conf. Robotics and Automation (ICRA), 2021, pp. 13525–13531, doi: 10.1109/ICRA48506.2021.9561675.
[43] J. Hu et al., "Squeeze-and-excitation networks," in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7132–7141.
[44] T.-Y. Lin et al., "Feature pyramid networks for object detection," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2017.
[45] K. He et al., "Deep residual learning for image recognition," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2016.
[46] J. Alvarez and L. Petersson, "DecomposeMe: Simplifying ConvNets for end-to-end learning," Jun. 2016. [Online]. Available: https://arxiv.org/abs/1606.05426
[47] Q. M. Rahman, N. Sünderhauf, P. Corke, and F. Dayoub, "FSNet: A failure detection framework for semantic segmentation," IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 3030–3037, Apr. 2022, doi: 10.1109/LRA.2022.3143219.
[48] S. Ruder, "An overview of multi-task learning in deep neural networks," 2017, arXiv:1706.05098. [Online]. Available: http://arxiv.org/abs/1706.05098
[49] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, "The Cityscapes dataset for semantic urban scene understanding," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2016.
[50] F. Yu et al., "BDD100K: A diverse driving video database with scalable annotation tooling," 2018, arXiv:1805.04687.
[51] G. Huang et al., "Densely connected convolutional networks," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2017.
[52] M. Rottmann, P. Colling, T.-P. Hack, F. Hüger, P. Schlicht, and H. Gottschalk, "Prediction error meta classification in semantic segmentation: Detection via aggregated dispersion measures of softmax probabilities," in Int. Joint Conf. Neural Networks (IJCNN), 2020, pp. 1–9.
[53] V. Valindria, I. Lavdas, W. Bai, K. Kamnitsas, E. Aboagye, A. Rockall, D. Rueckert, and B. Glocker, "Reverse classification accuracy: Predicting segmentation performance in the absence of ground truth," IEEE Transactions on Medical Imaging, vol. 36, pp. 1597–1606, 2017. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89663 | - |
| dc.description.abstract | Deep-learning-based semantic segmentation has gradually become a key component of the perception systems of many self-driving vehicles. However, because the driving environments such vehicles face are diverse and complex in distribution, a semantic segmentation model inevitably suffers a considerable performance drop at inference time; if the model's segmentation quality cannot be monitored during operation, this poses a serious safety concern for autonomous vehicles. This thesis therefore proposes an introspective, efficient fault detection model for semantic segmentation. It learns by observing the internal feature states of the segmentation model that correspond to its failures, and at inference time it predicts segmentation errors in real time, outputting a pixel-level confidence map that reflects how well the model recognizes the current environment.
The introspection model takes multiple internal features of the segmentation model as input and fuses them effectively through a channel-attention-based module and a feature pyramid architecture. During training, it is jointly learned with the segmentation model, which lets it acquire richer latent features and substantially improves prediction accuracy and generalization. In addition, self attention distillation is introduced during training, allowing the shallow and deep attention maps inside the encoder to learn from each other without adding parameters, further improving performance. Finally, this fault detection framework is applied to different semantic segmentation models (FCN-8, ESANet) and validated on both an in-distribution dataset (Cityscapes) and an out-of-distribution dataset (BDD100K); on all three performance metrics (AUPR, AUROC, FPR95) it achieves the best results among existing related methods. Moreover, thanks to its lightweight design, the model reduces inference time by about 40% compared with the previous best-performing model on a single NVIDIA RTX 2080 Ti GPU, and reaches about 29 FPS on an embedded computer (NVIDIA Jetson AGX Xavier). | zh_TW |
| dc.description.abstract | Semantic segmentation has gradually become a critical component of the perception systems of self-driving vehicles. However, owing to the diversity and complexity of driving environments, a semantic segmentation model inevitably suffers a large performance drop during deployment, which raises serious safety and reliability concerns. We therefore propose an introspective and efficient fault detection model to monitor a semantic segmentation model at run time. The introspective model extracts multiple internal features of the segmentation model as input and predicts a pixel-level confidence map. We utilize a channel attention module and a feature pyramid architecture to help the model integrate the input features more effectively. In the training phase, we jointly train the two models, which enhances generalization. Moreover, self attention distillation is introduced to further improve performance without additional computing resources. In our experiments, we use the proposed introspective model to monitor several typical segmentation models and evaluate it on the Cityscapes and BDD100K segmentation datasets. The results show that our approach significantly outperforms all existing methods on three critical performance metrics (AUPR, AUROC, FPR95). In addition, our model adopts a lightweight and efficient design that reduces inference time by 40% compared with the state-of-the-art framework, and it reaches 29 FPS on an embedded computer (NVIDIA Jetson AGX Xavier). | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-15T16:08:44Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-09-15T16:08:44Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Thesis Committee Certification i
Acknowledgements ii
Chinese Abstract iii
ABSTRACT iv
CONTENTS v
LIST OF FIGURES viii
LIST OF TABLES x
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Motivation and Objectives 2
1.3 Contributions 3
Chapter 2 Literature Review 4
2.1 Run-time Monitoring 4
2.1.1 Monitoring Based on Uncertainty Estimation 4
2.1.2 Monitoring Based on Out-of-Distribution Detection 6
2.1.3 Monitoring Based on Past Experience 6
2.2 Efficient Semantic Segmentation 7
2.2.1 Convolution Factorization 7
2.2.2 Lightweight Encoder-Decoder Architecture 10
2.3 Knowledge Distillation 10
Chapter 3 Model Architecture and Methodology 13
3.1 Overall Architecture 13
3.2 Semantic Segmentation Models 14
3.2.1 Fully Convolutional Network (FCN) [41] 14
3.2.2 Efficient Scene Analysis Network (ESANet) [42] 16
3.3 Fault Detection Model 17
3.3.1 Input Feature Selection 18
3.3.2 Lightweight Encoder-Decoder Architecture 21
3.3.3 Multi-modal Feature Fusion 24
3.3.4 Multi-scale Feature Fusion 26
3.4 Training Methods 26
3.4.1 Joint Learning 27
3.4.2 Self Attention Distillation 29
3.5 Evaluation Metrics 31
3.5.1 Confusion Matrix 32
3.5.2 AUROC (Area Under the ROC Curve) 33
3.5.3 FPR95 (False Positive Rate at 95% True Positive Rate) 35
3.5.4 AUPR (Area Under the PR Curve) 36
3.5.5 Risk-Coverage Curve 37
Chapter 4 Experimental Results and Discussion 39
4.1 Experimental Setup 39
4.1.1 Datasets 39
4.1.2 Monitored Model Selection and Training Procedure 41
4.1.3 Compared Methods 42
4.1.4 Hardware and Software Configuration 42
4.2 Quantitative Evaluation 42
4.2.1 In-Distribution Dataset Evaluation 42
4.2.2 Out-of-Distribution Dataset Evaluation 44
4.2.3 Risk-Coverage Curve 45
4.2.4 Computational Cost and Inference Speed 46
4.3 Visual Comparison of Fault Detection Outputs 49
4.4 Ablation Study 51
4.4.1 Effect of Input Features 51
4.4.2 Effect of Joint Learning 53
4.4.3 Effect of Self Attention Distillation (SAD) 55
Chapter 5 Conclusions and Future Work 58
5.1 Conclusions 58
5.2 Future Work 58
REFERENCES 60 | - |
| dc.language.iso | zh_TW | - |
| dc.subject | 語意分割 | zh_TW |
| dc.subject | 知識蒸餾 | zh_TW |
| dc.subject | 內省性 | zh_TW |
| dc.subject | 異常偵測 | zh_TW |
| dc.subject | 錯誤偵測 | zh_TW |
| dc.subject | 影像注意力機制 | zh_TW |
| dc.subject | image attention mechanism | en |
| dc.subject | semantic segmentation | en |
| dc.subject | fault detection | en |
| dc.subject | anomaly detection | en |
| dc.subject | introspection | en |
| dc.subject | knowledge distillation | en |
| dc.title | 內省式高效語意分割錯誤偵測方法 | zh_TW |
| dc.title | Introspective and Efficient Fault Detection for Semantic Segmentation | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 110-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 李宇修 | zh_TW |
| dc.contributor.oralexamcommittee | Ching-Yao Chan;Hong-Tsu Young;Yu-Hsiu Lee | en |
| dc.subject.keyword | 語意分割,錯誤偵測,異常偵測,內省性,影像注意力機制,知識蒸餾 | zh_TW |
| dc.subject.keyword | semantic segmentation,fault detection,anomaly detection,introspection,image attention mechanism,knowledge distillation | en |
| dc.relation.page | 64 | - |
| dc.identifier.doi | 10.6342/NTU202203738 | - |
| dc.rights.note | Authorized (access restricted to campus) | - |
| dc.date.accepted | 2022-09-23 | - |
| dc.contributor.author-college | College of Engineering | - |
| dc.contributor.author-dept | Department of Mechanical Engineering | - |
| dc.date.embargo-lift | 2024-09-02 | - |
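
The abstracts above evaluate fault detection with three pixel-level metrics (AUPR, AUROC, FPR95). As a rough illustration only, the sketch below computes these metrics from a predicted confidence map with NumPy and scikit-learn; the function name, the error-map convention, and the use of scikit-learn are assumptions for illustration, not details taken from the thesis.

```python
# Illustrative sketch (not the thesis implementation): computing AUPR,
# AUROC, and FPR95 for pixel-level fault detection.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score, roc_curve

def fault_detection_metrics(conf: np.ndarray, err: np.ndarray):
    """conf: (H, W) confidence map in [0, 1] from the introspective model.
    err:  (H, W) binary map, 1 where the segmentation prediction is wrong."""
    scores = (1.0 - conf).ravel()        # higher score = suspected fault
    labels = err.ravel().astype(int)     # positive class = faulty pixel
    aupr = average_precision_score(labels, scores)
    auroc = roc_auc_score(labels, scores)
    fpr, tpr, _ = roc_curve(labels, scores)
    # first operating point at which TPR reaches 95% (assumes it is reached)
    fpr95 = fpr[np.searchsorted(tpr, 0.95)]
    return aupr, auroc, fpr95
```

The abstracts also credit part of the gain to self attention distillation [38], in which shallower encoder blocks mimic the activation-based attention maps of deeper blocks at no extra parameter cost. The following is a minimal PyTorch sketch of one common SAD formulation; the exact layers distilled, the normalization, and the loss weighting used in the thesis may differ.

```python
# Illustrative SAD sketch (one common formulation, following [38]).
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Activation-based attention: mean of squared channel activations."""
    return feat.pow(2).mean(dim=1, keepdim=True)          # (N, 1, H, W)

def sad_loss(shallow_feat: torch.Tensor, deep_feat: torch.Tensor) -> torch.Tensor:
    """The shallow block learns from the deeper block's attention map."""
    a_s = attention_map(shallow_feat)
    a_d = attention_map(deep_feat).detach()               # teacher side is fixed
    a_d = F.interpolate(a_d, size=a_s.shape[2:],
                        mode="bilinear", align_corners=False)
    # L2-normalize each map spatially before comparing
    norm = lambda a: F.normalize(a.flatten(1), p=2, dim=1)
    return F.mse_loss(norm(a_s), norm(a_d))

# Under joint learning, the total objective could then combine the segmentation
# loss, the fault detection loss, and the SAD term, e.g.:
#   loss = seg_loss + lambda_fd * fd_loss + lambda_sad * sad_loss(f_shallow, f_deep)
```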
Appears in Collections: Department of Mechanical Engineering
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-110-2.pdf (restricted to NTU campus IPs; use the VPN service for off-campus access) | 3.8 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.