Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98856
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 莊裕澤 | zh_TW
dc.contributor.advisor | Yuh-Jzer Joung | en
dc.contributor.author | 李宇軒 | zh_TW
dc.contributor.author | Yu-Hsuan Lee | en
dc.date.accessioned | 2025-08-19T16:27:56Z | -
dc.date.available | 2025-08-20 | -
dc.date.copyright | 2025-08-19 | -
dc.date.issued | 2025 | -
dc.date.submitted | 2025-08-11 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98856 | -
dc.description.abstract | System logs are a critical tool for recording system states and key events. Log anomaly detection analyzes these data to identify events or sequences that deviate from normal behavioral patterns, and thereby determines whether the system has potential problems. However, as systems grow in scale, traditional manual approaches and older automated methods offer limited interpretability. Methods based on Large Language Models (LLMs) can generate readable explanations, but they are often constrained by the context window size, which makes large-scale logs difficult to process. Furthermore, prior work applying Retrieval-Augmented Generation (RAG) to this task has mostly used single log lines as the knowledge-base unit, ignoring the structural relationships within log event sequences. To address these issues, this study proposes a log anomaly detection system named GRAGLog. GRAGLog represents the relationships among log events as a graph structure and uses a graph embedding model to build a retrieval knowledge base. During detection, the system retrieves from the knowledge base the normal graph most similar to the target log sequence and supplies it as supplementary information to guide the LLM's reasoning; the LLM then makes the anomaly judgment and provides a detailed explanation. Experimental results show that GRAGLog's anomaly detection F1-score is stable and superior to most existing methods across various data-splitting and window-size settings, with a particular advantage on long sequences. In addition, the explanations generated by GRAGLog received the highest scores for both usefulness and readability when evaluated by human experts and GPT-4.1. | en
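The retrieval step described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract says GRAGLog uses a trained graph embedding model, whereas the sketch below substitutes a plain edge-count vector and cosine similarity as a stand-in. The event IDs (`E1`, `E2`, ...), the knowledge-base entries, the function names, and the prompt wording are all hypothetical; the actual prompts are in Appendix B of the thesis and are not reproduced here.

```python
from collections import Counter
from math import sqrt


def build_graph(events):
    """Directed transition graph: edge (a, b) counts how often event b follows event a."""
    return Counter(zip(events, events[1:]))


def cosine(g1, g2):
    """Cosine similarity between two edge-count graphs treated as sparse vectors.

    Stand-in for a learned graph embedding (assumption, not the thesis's model).
    """
    dot = sum(w * g2.get(edge, 0) for edge, w in g1.items())
    n1 = sqrt(sum(w * w for w in g1.values()))
    n2 = sqrt(sum(w * w for w in g2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def retrieve(target_graph, knowledge_base):
    """Return (name, score) of the normal graph most similar to the target graph."""
    name = max(knowledge_base, key=lambda k: cosine(target_graph, knowledge_base[k]))
    return name, cosine(target_graph, knowledge_base[name])


def build_prompt(target_events, reference_name, reference_events):
    """Hypothetical prompt: pairs the target sequence with the retrieved normal one."""
    return (
        "You are a log analysis assistant.\n"
        f"Reference normal sequence ({reference_name}): {' -> '.join(reference_events)}\n"
        f"Target sequence: {' -> '.join(target_events)}\n"
        "Is the target sequence anomalous? Answer and explain your reasoning."
    )


# Hypothetical normal sequences of parsed log event IDs.
normal_sequences = {
    "write_block": ["E1", "E2", "E3", "E4"],
    "delete_block": ["E1", "E5", "E6"],
}
knowledge_base = {name: build_graph(seq) for name, seq in normal_sequences.items()}

target_events = ["E1", "E2", "E3", "E7"]
best_name, score = retrieve(build_graph(target_events), knowledge_base)
prompt = build_prompt(target_events, best_name, normal_sequences[best_name])
```

Here the target shares two of its three transitions with `write_block` and none with `delete_block`, so `write_block` is retrieved and placed in the prompt as the normal reference. In a real system the similarity would come from the learned embedding space, and the prompt would be sent to an LLM.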
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-19T16:27:56Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2025-08-19T16:27:56Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
Chapter 2 Literature Review
2.1 Introduction
2.2 Explainable Artificial Intelligence
2.3 Log Anomaly Detection
2.3.1 Deep Learning-Based Log Anomaly Detection Methods
2.3.2 Explainable Log Anomaly Detection
2.4 Large Language Models
2.4.1 Introduction
2.4.2 Retrieval-Augmented Generation
2.4.3 Applications of Large Language Models in Log Anomaly Detection
2.5 Graphs and Graph Embedding Techniques
2.5.1 Graph Representation Learning
2.6 Summary
Chapter 3 Methodology
3.1 Research Framework
3.1.1 Log Parsing
3.1.2 Sequence Processing
3.1.2.1 Sequence Segmentation
3.1.2.2 Sequence Compression
3.1.3 Graph Construction
3.1.4 Graph Embedding Model Training
3.1.4.1 Data Imbalance Handling
3.1.4.2 Data Augmentation
3.1.4.3 Graph Encoding
3.1.4.4 Supervised Contrastive Learning
3.1.5 Knowledge Retrieval
3.1.6 Prompt Construction
3.1.7 Anomaly Detection
3.2 Datasets
3.2.1 Hadoop Distributed File System (HDFS)
3.2.2 BlueGene/L (BGL)
3.3 Evaluation Metrics
3.3.1 Anomaly Detection Evaluation Metrics
3.3.2 Anomaly Explanation Evaluation Metrics
3.4 Experimental Design
3.4.1 Data Processing Strategy
3.4.2 Explanation Experiment Design
Chapter 4 Results
4.1 Comparison with LLM-Based Methods
4.2 Comparison with Deep Learning-Based Methods
Chapter 5 Conclusion
5.1 Research Findings
5.2 Limitations and Future Work
References
Appendix A: Discussion of Other Methods
A.1 Discussion of Different Graph Construction Approaches
Appendix B: Prompts
B.1 Anomaly Detection Prompt
B.2 Prompt for Scoring Explanations with GPT-4.1 | -
dc.language.iso | zh_TW | -
dc.subject | Log Analysis | en
dc.subject | Large Language Model | en
dc.subject | Log Anomaly Detection | en
dc.subject | Retrieval-Augmented Generation | en
dc.title | 基於大型語言模型與知識檢索的可解讀日誌異常偵測系統 (An Interpretable Log Anomaly Detection System Based on Large Language Models and Knowledge Retrieval) | zh_TW
dc.title | GRAGLog: Interpretable Log Anomaly Detection Using Large Language Models and Knowledge Retrieval | en
dc.type | Thesis | -
dc.date.schoolyear | 113-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 查士朝; 楊立偉; 魏志平 | zh_TW
dc.contributor.oralexamcommittee | Shi-Cho Cha; Li-wei Yang; Chih-Ping Wei | en
dc.subject.keyword | Log Anomaly Detection, Log Analysis, Large Language Model, Retrieval-Augmented Generation | en
dc.relation.page | 58 | -
dc.identifier.doi | 10.6342/NTU202503830 | -
dc.rights.note | Authorization granted (public access restricted to campus) | -
dc.date.accepted | 2025-08-14 | -
dc.contributor.author-college | College of Management | -
dc.contributor.author-dept | Department of Information Management | -
dc.date.embargo-lift | 2025-08-20 | -
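Appears in collections: Department of Information Management

Files in this item:
File | Size | Format
ntu-113-2.pdf | 788.87 kB | Adobe PDF
Access is restricted to NTU campus IP addresses; off-campus users should connect through the off-campus VPN service.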