Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98856

| Title: | 基於大型語言模型與知識檢索的可解讀日誌異常偵測系統 GRAGLog: Interpretable Log Anomaly Detection Using Large Language Models and Knowledge Retrieval |
| Author: | 李宇軒 Yu-Hsuan Lee |
| Advisor: | 莊裕澤 Yuh-Jzer Joung |
| Keywords: | Log Anomaly Detection, Log Analysis, Large Language Model, Retrieval-Augmented Generation |
| Year of publication: | 2025 |
| Degree: | Master's |
| Abstract: | System logs are a key means of recording system states and critical events. Log anomaly detection analyzes this data to identify events or sequences that deviate from normal behavior patterns, and thereby to determine whether the system has potential problems. However, as systems grow in scale, traditional manual methods and older automated methods are limited in the interpretability they provide. Methods based on Large Language Models (LLMs) can generate readable explanations, but they are often constrained by the context window size, which makes large-scale logs difficult to process. Furthermore, prior work applying Retrieval-Augmented Generation (RAG) to this task has mostly used single log lines as the unit of the knowledge base, ignoring the structural relationships within log event sequences. To address these issues, this study proposes a log anomaly detection system named GRAGLog. GRAGLog represents the relationships among log events as a graph structure and uses a graph embedding model to train a retrieval knowledge base. During detection, the system retrieves from the knowledge base the normal graph most similar to the target log sequence and supplies it as supplementary information to guide the LLM's reasoning; the LLM then makes the anomaly judgment and provides a detailed explanation. Experimental results show that GRAGLog's anomaly detection F1-score is stable and superior to most existing methods across various data splits and window sizes, with a particular advantage on long sequences. In addition, the explanations generated by GRAGLog received the highest scores for both usefulness and readability when evaluated by human experts and GPT-4.1. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98856 |
| DOI: | 10.6342/NTU202503830 |
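| Full-text authorization: | Authorized (access limited to campus) |
| Electronic full-text release date: | 2025-08-20 |
| Appears in department: | Department of Information Management |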
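
The retrieval step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: a sparse edge-count vector with cosine similarity stands in for the trained graph embedding model, and the function names (`build_graph`, `cosine`, `retrieve_most_similar`, `build_prompt`), the event names, and the prompt wording are all hypothetical.

```python
from collections import Counter
from math import sqrt


def build_graph(events):
    """Represent an event sequence as counts of directed transitions.

    Assumption: an edge (a, b) means event b directly followed event a.
    """
    return Counter(zip(events, events[1:]))


def cosine(a, b):
    """Cosine similarity between two sparse edge-count vectors.

    Stand-in for similarity in a learned graph embedding space.
    """
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_most_similar(target_graph, knowledge_base):
    """Return the normal graph in the knowledge base closest to the target."""
    return max(knowledge_base, key=lambda g: cosine(target_graph, g))


def build_prompt(target_events, reference_graph):
    """Assemble a hypothetical LLM prompt from the target and the retrieved graph."""
    edges = "\n".join(
        f"{src} -> {dst} (x{count})"
        for (src, dst), count in sorted(reference_graph.items())
    )
    return (
        "Reference transitions from a similar normal sequence:\n"
        f"{edges}\n\n"
        "Target log sequence:\n"
        f"{' '.join(target_events)}\n\n"
        "Is the target sequence anomalous? Explain your reasoning."
    )


if __name__ == "__main__":
    # Hypothetical normal sequences forming the knowledge base.
    normal_sequences = [
        ["open", "read", "close"],
        ["open", "write", "close"],
    ]
    knowledge_base = [build_graph(seq) for seq in normal_sequences]

    target = ["open", "read", "read", "delete"]
    reference = retrieve_most_similar(build_graph(target), knowledge_base)
    print(build_prompt(target, reference))
```

The resulting prompt would then be sent to an LLM of choice; that call is omitted here because the abstract does not specify the model interface.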
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf Access restricted to NTU campus IP addresses (from off campus, please use the VPN remote-access service) | 788.87 kB | Adobe PDF |
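All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.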
