GRAGLog: Interpretable Log Anomaly Detection Using Large Language Models and Knowledge Retrieval (基於大型語言模型與知識檢索的可解讀日誌異常偵測系統)

李宇軒; Yu-Hsuan Lee

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98856

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	莊裕澤	zh_TW
dc.contributor.advisor	Yuh-Jzer Joung	en
dc.contributor.author	李宇軒	zh_TW
dc.contributor.author	Yu-Hsuan Lee	en
dc.date.accessioned	2025-08-19T16:27:56Z	-
dc.date.available	2025-08-20	-
dc.date.copyright	2025-08-19	-
dc.date.issued	2025	-
dc.date.submitted	2025-08-11	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98856	-
dc.description.abstract	系統日誌是記錄系統狀態和關鍵事件的重要工具，日誌異常偵測的目標是分析這些資料以識別與正常行為模式不符的事件或序列，進而判斷系統是否存在潛在問題。然而，隨著系統規模日益擴大，傳統依賴人工或舊有自動化方法在提供可解釋性方面存在局限性。儘管基於大型語言模型（LLM）的方法能夠生成可讀性解釋，但它們常受限於上下文窗口大小，難以處理大規模日誌。此外，過去將檢索增強生成（RAG）應用於此任務的研究，多以單行日誌作為知識庫單位，忽略了日誌事件序列中的結構關係。本研究為了解決上述問題，提出了一個名為GRAGLog 的日誌異常偵測系統。GRAGLog 透過將日誌事件關係構建為圖結構，並利用圖嵌入模型訓練一個檢索知識庫。在偵測時，系統會從知識庫中檢索與目標日誌序列最相似的正常圖譜，將其作為補充資訊，引導大型語言模型進行推理，最終判斷異常並提供詳細解釋。實驗結果顯示，GRAGLog 在多種資料分割與窗口大小設定下，其異常偵測的 F1-score 表現穩定且優於大多數現有方法，尤其在處理長序列時更具優勢。此外，其生成的解釋內容在實用性與可讀性方面，經由人類專家和 GPT-4.1 評分後均獲得最高分。	zh_TW
dc.description.abstract	System logs are a critical tool for recording system states and key events. Log anomaly detection analyzes this data to identify events or sequences that deviate from normal behavioral patterns, and thereby to determine whether the system has potential problems. However, as systems grow in scale, traditional manual methods and older automated methods offer limited interpretability. Although methods based on Large Language Models (LLMs) can generate readable explanations, they are often constrained by the context window size, which makes large-scale logs difficult to process. Furthermore, past research applying Retrieval-Augmented Generation (RAG) to this task has largely used single log lines as the unit of the knowledge base, ignoring the structural relationships within log event sequences. To address these issues, this study proposes a log anomaly detection system named GRAGLog. GRAGLog represents the relationships among log events as a graph structure and uses a graph embedding model to build a retrieval knowledge base. During detection, the system retrieves from the knowledge base the normal graph most similar to the target log sequence, supplies it as supplementary information to guide the LLM's reasoning, and finally issues an anomaly judgment together with a detailed explanation. Experimental results show that GRAGLog's anomaly detection F1-score is stable and superior to most existing methods across various data splitting and window size settings, with a particular advantage on long sequences. In addition, the explanations generated by GRAGLog received the highest scores for both usefulness and readability from human experts and from GPT-4.1.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-19T16:27:56Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-08-19T16:27:56Z (GMT). No. of bitstreams: 0	en
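dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); Abstract (English); Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Research Background and Motivation; 1.2 Research Objectives; Chapter 2 Literature Review; 2.1 Introduction; 2.2 Explainable Artificial Intelligence; 2.3 Log Anomaly Detection; 2.3.1 Deep Learning-Based Log Anomaly Detection Methods; 2.3.2 Explainable Log Anomaly Detection; 2.4 Large Language Models; 2.4.1 Introduction; 2.4.2 Retrieval-Augmented Generation; 2.4.3 Applications of Large Language Models in Log Anomaly Detection; 2.5 Graphs and Graph Embedding Techniques; 2.5.1 Graph Representation Learning; 2.6 Summary; Chapter 3 Methodology; 3.1 Research Framework; 3.1.1 Log Parsing; 3.1.2 Sequence Processing; 3.1.2.1 Sequence Segmentation; 3.1.2.2 Sequence Compression; 3.1.3 Graph Construction; 3.1.4 Graph Embedding Model Training; 3.1.4.1 Data Imbalance Handling; 3.1.4.2 Data Augmentation; 3.1.4.3 Graph Encoding; 3.1.4.4 Supervised Contrastive Learning; 3.1.5 Knowledge Retrieval; 3.1.6 Prompt Construction; 3.1.7 Anomaly Detection; 3.2 Datasets; 3.2.1 Hadoop Distributed File System (HDFS); 3.2.2 BlueGene/L (BGL); 3.3 Evaluation Metrics; 3.3.1 Anomaly Detection Evaluation Metrics; 3.3.2 Anomaly Explanation Evaluation Metrics; 3.4 Experimental Design; 3.4.1 Data Processing Strategy; 3.4.2 Explanation Experiment Design; Chapter 4 Results; 4.1 Comparison with LLM-Based Methods; 4.2 Comparison with Deep Learning-Based Methods; Chapter 5 Conclusion; 5.1 Research Contributions; 5.2 Limitations and Future Work; References; Appendix A: Discussion of Other Methods; A.1 Discussion of Different Graph Construction Approaches; Appendix B: Prompts; B.1 Anomaly Detection Prompt; B.2 Prompt for Scoring Explanations with GPT-4.1	-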
dc.language.iso	zh_TW	-
dc.subject	日誌異常偵測	zh_TW
dc.subject	日誌分析	zh_TW
dc.subject	大型語言模型	zh_TW
dc.subject	檢索增強生成	zh_TW
dc.subject	Log Analysis	en
dc.subject	Large Language Model	en
dc.subject	Log Anomaly Detection	en
dc.subject	Retrieval-Augmented Generation	en
dc.title	基於大型語言模型與知識檢索的可解讀日誌異常偵測系統	zh_TW
dc.title	GRAGLog: Interpretable Log Anomaly Detection Using Large Language Models and Knowledge Retrieval	en
dc.type	Thesis	-
dc.date.schoolyear	113-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	查士朝;楊立偉;魏志平	zh_TW
dc.contributor.oralexamcommittee	Shi-Cho Cha;Li-wei Yang;Chih-Ping Wei	en
dc.subject.keyword	日誌異常偵測,日誌分析,大型語言模型,檢索增強生成	zh_TW
dc.subject.keyword	Log Anomaly Detection,Log Analysis,Large Language Model,Retrieval-Augmented Generation	en
dc.relation.page	58	-
dc.identifier.doi	10.6342/NTU202503830	-
dc.rights.note	Authorization granted (access restricted to campus)	-
dc.date.accepted	2025-08-14	-
dc.contributor.author-college	College of Management	-
dc.contributor.author-dept	Department of Information Management	-
dc.date.embargo-lift	2025-08-20	-
Appears in Collections:	Department of Information Management
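
The retrieval step described in the abstract (turn a log event sequence into a graph, find the most similar normal graph in the knowledge base, and hand both to the LLM) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the event IDs such as `E1` are hypothetical, the function names `build_graph`, `cosine`, `retrieve`, and `build_prompt` are invented for illustration, and a plain edge-count vector with cosine similarity stands in for the trained graph embedding model, whose details are not given in this record.

```python
from collections import Counter
from math import sqrt


def build_graph(events):
    """Represent a sequence of event IDs as a directed transition graph.

    The graph is stored as a Counter mapping (source, target) edges to
    the number of times that transition occurs in the sequence.
    """
    return Counter(zip(events, events[1:]))


def cosine(a, b):
    """Cosine similarity between two edge-count graphs (stand-in embedding)."""
    dot = sum(a[edge] * b[edge] for edge in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(target_events, normal_sequences):
    """Return the normal sequence whose graph is most similar to the target."""
    target = build_graph(target_events)
    return max(normal_sequences, key=lambda seq: cosine(target, build_graph(seq)))


def build_prompt(target_events, reference_events):
    """Assemble a hypothetical prompt pairing the target with its reference."""
    return (
        "Reference normal sequence: " + " -> ".join(reference_events) + "\n"
        "Target sequence: " + " -> ".join(target_events) + "\n"
        "Is the target anomalous? Explain your reasoning."
    )


if __name__ == "__main__":
    knowledge_base = [["E1", "E2", "E3", "E4"], ["E5", "E6", "E7"]]
    target = ["E1", "E2", "E3", "E9"]
    reference = retrieve(target, knowledge_base)
    print(build_prompt(target, reference))
```

Retrieving whole-sequence graphs rather than single log lines is the design point the abstract emphasizes: the reference handed to the LLM preserves the order and structure of events, so a deviation such as an unexpected final transition is visible in context.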

Files in this item:

File	Size	Format
ntu-113-2.pdf Access restricted to NTU campus IP addresses (off-campus users, please use the VPN service)	788.87 kB	Adobe PDF


Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.