Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/16644

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 藍俊宏(Jakey Blue) | |
| dc.contributor.author | Ya-Lan Yang | en |
| dc.contributor.author | 楊雅嵐 | zh_TW |
| dc.date.accessioned | 2021-06-07T23:42:44Z | - |
| dc.date.copyright | 2020-08-26 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-08-11 | |
| dc.identifier.citation | [1] Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K. R., Samek, W. (2015). On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS ONE, 10(7). [2] Baehrens, D., Schroeter, T., Harmeling, S., Kawanabe, M., Hansen, K., Müller, K. R. (2010). How to explain individual classification decisions. The Journal of Machine Learning Research, 11, 1803-1831. [3] Brendel, W., Bethge, M. (2019). Approximating CNNs with bag-of-local-features models works surprisingly well on ImageNet. arXiv preprint arXiv:1904.00760. [4] Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V. N. (2017). Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. arXiv preprint arXiv:1710.11063. [5] Cooper, G. F., Herskovits, E. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9(4), 309-347. [6] Heckerman, D. (2009). A tutorial on learning with Bayesian networks. In Holmes, D. E., Jain, L. C. (Eds.), Innovations in Bayesian networks (pp. 33-82). Springer. https://doi.org/10.1007/978-3-540-85066-3_3 [7] He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, USA, 770-778. http://doi.org/10.1109/cvpr.2016.90 [8] Kindermans, P. J., Schütt, K. T., Alber, M., Müller, K. R., Erhan, D., Kim, B., Dähne, S. (2017). Learning how to explain neural networks: PatternNet and PatternAttribution. arXiv preprint arXiv:1705.05598. [9] Krizhevsky, A., Sutskever, I., Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, USA, 1, 1097-1105. [10] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. https://doi.org/10.1109/5.726791 [11] Margaritis, D. (2003). Learning Bayesian network model structure from data [Unpublished doctoral dissertation]. Carnegie Mellon University, School of Computer Science. [12] Montavon, G., Lapuschkin, S., Binder, A., Samek, W., Müller, K. R. (2017). Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recognition, 65, 211-222. [13] Ribeiro, M. T., Singh, S., Guestrin, C. (2016, August). Why should I trust you? Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, USA, 1135-1144. http://doi.org/10.18653/v1/N16-3020 [14] Ribeiro, M. T., Singh, S., Guestrin, C. (2018). Anchors: High-precision model-agnostic explanations. In Thirty-Second AAAI Conference on Artificial Intelligence. https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16982 [15] Samek, W., Binder, A., Montavon, G., Lapuschkin, S., Müller, K. R. (2016). Evaluating the visualization of what a deep neural network has learned. IEEE Transactions on Neural Networks and Learning Systems, 28(11), 2660-2673. [16] Samek, W., Wiegand, T., Müller, K. R. (2017). Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296. [17] Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461-464. [18] Scutari, M. (2009). Learning Bayesian networks with the bnlearn R package. arXiv preprint arXiv:0908.3817. [19] Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, Italy, 618-626. https://doi.org/10.1109/ICCV.2017.74 [20] Shrikumar, A., Greenside, P., Kundaje, A. (2017). Learning important features through propagating activation differences. arXiv preprint arXiv:1704.02685. [21] Simonyan, K., Vedaldi, A., Zisserman, A. (2013). Deep inside convolutional networks: Visualizing image classification models and saliency maps. arXiv preprint arXiv:1312.6034. [22] Simonyan, K., Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. [23] Spirtes, P., Glymour, C., Scheines, R. (1993). Causation, prediction, and search. Springer. https://doi.org/10.1007/978-1-4612-2748-9 [24] Springenberg, J. T., Dosovitskiy, A., Brox, T., Riedmiller, M. (2014). Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806. [25] Sundararajan, M., Taly, A., Yan, Q. (2017). Axiomatic attribution for deep networks. arXiv preprint arXiv:1703.01365. [26] Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A. A. (2016). Inception-v4, Inception-ResNet and the impact of residual connections on learning. arXiv preprint arXiv:1602.07261. [27] Turek, M. (2018). Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency. https://www.darpa.mil/program/explainable-artificial-intelligence. [28] Wagner, M. (2019, May 7). Bias in AI: Impact on safety. EETimes. https://www.eetimes.com/bias-in-ai-impact-on-safety/?fbclid=IwAR1at2-bAN_LN8B513yKq7RY-SqriZ89ULSskUMD8NWjy_NmOCofXDBP45k. [29] Zeiler, M. D., Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision - ECCV 2014. Lecture Notes in Computer Science, Vol. 8689. Springer. https://doi.org/10.1007/978-3-319-10590-1_53 [30] Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, 2921-2929. https://doi.org/10.1109/CVPR.2016.319 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/16644 | - |
| dc.description.abstract | 隨著深度神經網路模型演算法的突破,加上軟硬體計算力不斷地提升,使得過去訓練類神經網路的瓶頸得以跨越,因而催生了大規模且效果顯著的研究與產業應用。然而,給定一訓練良好、能輸出準確預判的深度神經網路,由於深度神經網路在經過隱藏層之間相連權重的計算經過線性與非線性的轉換,人們往往無法得知模型中參數的意義與判斷依據,因此推理過程往往為人詬病為一黑盒子,致使模型的可信度與被採用度大大降低,且在許多應用場景中,輸入層的變數對使用者具有確切的物理意義與解釋能力,回溯至輸入層並找出重要因素乃研究者注重的目標之一。 本研究中,將先回顧深度神經網路可釋性當前發展,並利用貝氏網路模型來解析已訓練完成的深度學習模型,利用層層剝離推論的方式,發展由輸出層往隱藏層、最後回推至輸入層的解析框架。接著分別以MNIST以及Fashion MNIST資料集訓練類神經網路模型,最後以視覺化的方式呈現經由貝氏網路演算法推理出的重要特徵 (即圖片上的像素),藉此解釋出黑盒子中的判斷邏輯,找出顯著影響模型的輸入變數,使深度學習的可信度與可釋性增加,提升產業採行深度神經網路技術的信心。 | zh_TW |
| dc.description.abstract | Breakthroughs in artificial neural network algorithms, coupled with the ever-increasing computing power of hardware and software, have made it possible to train Deep Neural Nets (DNNs). The conventional bottleneck, i.e., vanishing gradients, has been surmounted, leading to large-scale, active research and industrial applications. However, even given a well-trained DNN that outputs accurate predictions, it is difficult to trace how information is passed between the hidden layers. The linear and nonlinear transformations embedded in the crisscrossing weights make it almost impossible to know the basis of the model's judgment. The reasoning process is therefore often criticized as a black box, which severely reduces the credibility of the model and its adoption in many application scenarios. Because the variables in the input layer usually carry precise physical meaning for the users, tracing a prediction back to the input layer and identifying the important factors is a key goal. In this thesis, Bayesian network models are used to parse the information propagated within a DNN. Layer-to-layer Bayesian models are built and integrated into an analytical framework that reasons backward from the output layer, through the hidden layers, and finally to the input layer. The MNIST and Fashion MNIST datasets are employed to validate the proposed framework, and the black box is explained by visualizing the essential features, i.e., the pixels of the image. In summary, such an explainable model increases interpretability and enhances confidence in industrial DNN applications. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-07T23:42:44Z (GMT). No. of bitstreams: 1 U0001-1008202018522800.pdf: 6022829 bytes, checksum: 085eb1d7543be75ddb420ff459479d9f (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | Acknowledgements i Abstract (Chinese) ii ABSTRACT iii Contents iv List of Figures vi List of Tables x Chapter 1 Introduction 1 1.1 Background 1 1.2 Motivation and Objectives 2 1.3 Thesis Organization 4 Chapter 2 Literature Review 6 2.1 Convolutional Neural Networks 6 2.2 Development of Explainable AI 11 2.2.1 Deep Explanation 12 2.2.2 Model Induction 19 2.3 Bayesian Network Theory 23 2.3.1 Bayesian Network Parameter Learning 25 2.3.2 Bayesian Network Structure Learning 26 Chapter 3 Developing Bayesian Network Inference for AI Model Interpretability 29 3.1 Preprocessing Continuous Variables for Bayesian Networks 30 3.2 Layer-by-Layer Bayesian Network Inference 30 3.2.1 Building the Bayesian Network 30 3.2.2 Blacklist Settings between Fully Connected Layers 31 3.2.3 Blacklist Settings between Convolutional Layers 31 3.2.4 Inferring Important Input-Layer Variables 33 Chapter 4 Case Studies and Discussion 34 4.1 Datasets and Experimental Design 34 4.2 Data Preprocessing 36 4.3 Explaining CNN Models with Bayesian Networks 38 4.3.1 Analysis of MNIST and Fashion MNIST Results 38 4.3.2 Effect of the Bayesian Network Analysis on Convolutional Feature Maps 39 4.3.3 Comparison with Existing Methods 51 Chapter 5 Conclusions and Future Work 55 5.1 Conclusions 55 5.2 Future Research Directions 55 References 57 | |
| dc.language.iso | zh-TW | |
| dc.subject | 深度神經網路 | zh_TW |
| dc.subject | 貝氏網路 | zh_TW |
| dc.subject | 貝氏推論 | zh_TW |
| dc.subject | 模型可釋性 | zh_TW |
| dc.subject | Explainable AI (XAI) | en |
| dc.subject | Bayesian Inference | en |
| dc.subject | Bayesian Network | en |
| dc.subject | Deep Neural Net | en |
| dc.title | 以貝氏網路解析與提升深度神經網路模型之可釋性 | zh_TW |
| dc.title | Bayesian Network-based Interpretability on the Deep Neural Nets | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.oralexamcommittee | 陳正剛(Argon Chen),許嘉裕(Chia-Yu Hsu) | |
| dc.subject.keyword | 模型可釋性,深度神經網路,貝氏推論,貝氏網路 | zh_TW |
| dc.subject.keyword | Explainable AI (XAI),Deep Neural Net,Bayesian Inference,Bayesian Network | en |
| dc.relation.page | 59 | |
| dc.identifier.doi | 10.6342/NTU202002859 | |
| dc.rights.note | 未授權 | |
| dc.date.accepted | 2020-08-12 | |
| dc.contributor.author-college | 工學院 | zh_TW |
| dc.contributor.author-dept | 工業工程學研究所 | zh_TW |
| Appears in Collections: | 工業工程學研究所 | |
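The abstract above describes a framework that chains layer-to-layer Bayesian models backward from the output layer, through the hidden layers, to the input layer in order to surface the input variables (pixels) that drive a prediction. As a rough, self-contained illustration of that backward chaining idea only, the sketch below substitutes a simple mutual-information score for the thesis's actual Bayesian-network structure learning (done there with tools such as the bnlearn package). All function names here (`discretize`, `mutual_info`, `backward_relevance`) are hypothetical and not taken from the thesis.

```python
import numpy as np

def discretize(col, bins=3):
    """Quantile-discretize one continuous activation column, a common
    preprocessing step before learning a discrete Bayesian network."""
    edges = np.quantile(col, np.linspace(0, 1, bins + 1)[1:-1])
    return np.digitize(col, edges)

def mutual_info(x, y):
    """Plug-in mutual information (in nats) between two discrete arrays."""
    joint = np.zeros((int(x.max()) + 1, int(y.max()) + 1))
    for xi, yi in zip(x, y):
        joint[int(xi), int(yi)] += 1.0
    joint /= joint.sum()
    outer = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / outer[nz])).sum())

def backward_relevance(layers, labels):
    """Chain unit relevance backward: class label -> last hidden layer ->
    ... -> input layer. `layers` holds discretized activations, one array
    of shape (n_samples, n_units) per layer, ordered input layer first."""
    downstream = labels.reshape(-1, 1)  # treat the label as the top layer
    weights = np.ones(1)
    relevance = None
    for acts in reversed(layers):
        # Score each unit by its dependence on the (already weighted)
        # important units of the layer above it.
        mi = np.array([[mutual_info(acts[:, i], downstream[:, j])
                        for j in range(downstream.shape[1])]
                       for i in range(acts.shape[1])])
        relevance = mi @ weights
        weights = relevance / (relevance.sum() + 1e-12)
        downstream = acts
    return relevance  # one importance score per input variable (pixel)
```

On MNIST-sized data this would be run on the discretized activations of each layer of the trained CNN; reshaping the input-layer scores to 28x28 then gives a pixel heat map of the kind the thesis visualizes.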
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| U0001-1008202018522800.pdf (restricted access) | 5.88 MB | Adobe PDF | |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated in their licensing terms.
