Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93475

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 王奕翔 | zh_TW |
| dc.contributor.advisor | I-Hsiang Wang | en |
| dc.contributor.author | 楊欣睿 | zh_TW |
| dc.contributor.author | Hsin-Jui Yang | en |
| dc.date.accessioned | 2024-08-01T16:19:11Z | - |
| dc.date.available | 2024-08-02 | - |
| dc.date.copyright | 2024-08-01 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-07-29 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93475 | - |
| dc.description.abstract | 我們考慮一種情景,即工程師分析用戶發送的系統日誌以解決故障。用戶通常因系統運行有缺陷的程序路徑而遇到麻煩,這會生成一系列稱為“故障事件”的模板。工程師的目標是探索未知故障事件的性質,並確定系統日誌中是否包含已知的故障事件。主要挑戰在於來自不同任務的日誌在系統日誌中交錯存在,此外,大規模系統服務會生成多種多樣的日誌。這些因素使得故障排除過程極其耗時,因為工程師需要確認系統日誌每一行之間的相關性。在本論文中,我們提出了一種新的故障排除框架,模板-模式-事件,通過將代表相同行為的日誌聚合成同一模式來減少系統日誌的複雜性。其次,我們提出了一種模板聚類算法,從具有交錯特徵的系統日誌數據中學習模式。第三,我們引入了事件追踪算法,以識別系統日誌中故障事件的位置。通過我們提出的新架構,故障排除過程將更加簡化和高效。 | zh_TW |
| dc.description.abstract | We consider a scenario in which engineers analyze system logs sent by users for troubleshooting. Users typically run into trouble when the system executes a defective program path, which generates a sequence of templates called a "trouble event." The engineers' goal is to explore the nature of unknown trouble events and to determine whether a system log contains any known trouble events. The main challenge is that logs from different tasks are interleaved in the system log and that large-scale system services generate a wide variety of logs. These factors make troubleshooting extremely time-consuming, because engineers must check how the lines of the system log relate to one another. In this thesis, we first propose a new troubleshooting framework, template-pattern-event, which reduces the complexity of the system log by aggregating logs that represent the same system behavior into the same pattern. Second, we propose an algorithm, Template Clustering, to learn patterns from system log data with interleaving characteristics. Third, we introduce the Event Trace algorithm to identify the positions of trouble events in the system log. With the proposed framework, the troubleshooting process becomes simpler and more efficient. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-01T16:19:11Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-08-01T16:19:11Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements; Abstract (in Chinese); Abstract; Contents; List of Figures; List of Tables; Denotation; Chapter 1 Introduction; 1.1 Background and Motivation; 1.1.1 Log; 1.1.2 Event-Triggered System Logs; 1.1.3 Troubleshooting; 1.2 Challenges; 1.3 Related Works; 1.3.1 Anomaly Detection; 1.3.2 Workflow Mining; 1.3.3 Causality Analysis; 1.4 Contribution; 1.5 Overview of Proposed Data-Driven Troubleshooting Enhancement Framework; 1.6 Outline of the Thesis Structure; Chapter 2 Problem Description and Proposed Data Mining Framework; 2.1 Troubleshooting Scenarios and Our Goal; 2.2 Proposed Data-Driven Troubleshooting Enhancement Framework; Chapter 3 Template Clustering; 3.1 Introduction; 3.2 Template Clustering; 3.2.1 Assign a Distance Value to Each Pair of Templates; 3.2.1.1 Dependency Function; 3.2.1.2 Distance Function; 3.2.2 Agglomerative Hierarchical Clustering; 3.2.3 Finding the Number of Clusters; 3.3 Experiment; 3.3.1 Synthetic Dataset; 3.3.2 Baseline Approach; 3.3.3 Experiment Setup; 3.4 Result Analysis; 3.5 Real Data Results; Chapter 4 Event Trace; 4.1 Introduction; 4.2 Determine Whether a Log Sequence is a Trouble Event; 4.2.1 Method of Engineers: Threshold Method; 4.2.2 Our Proposed Method; 4.2.2.1 Lower Bound Method; 4.2.2.2 Fit Mixture Distribution; 4.3 Event Trace; 4.4 Experiment; 4.4.1 Synthetic Dataset; 4.4.2 Baseline Method; 4.4.3 Experiment Setup; 4.5 Result Analysis; 4.6 Real Data Results; Chapter 5 Conclusion and Future Works; 5.1 Strengths of Our Works; 5.2 Limitations and Future Works; References | - |
| dc.language.iso | en | - |
| dc.subject | 偵錯 | zh_TW |
| dc.subject | 系統日誌分析 | zh_TW |
| dc.subject | 事件穿插 | zh_TW |
| dc.subject | Interleaved system log | en |
| dc.subject | System log analysis | en |
| dc.subject | Troubleshooting | en |
| dc.title | 具事件穿插特性的系統日誌的數據挖掘分析框架 | zh_TW |
| dc.title | A Data Mining Framework for Interleaved Event-triggered Log Analysis | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 陳銘憲;陳建錦 | zh_TW |
| dc.contributor.oralexamcommittee | Ming-Syan Chen; Chien Chin Chen | en |
| dc.subject.keyword | 系統日誌分析,偵錯,事件穿插 | zh_TW |
| dc.subject.keyword | System log analysis, Troubleshooting, Interleaved system log | en |
| dc.relation.page | 57 | - |
| dc.identifier.doi | 10.6342/NTU202402362 | - |
| dc.rights.note | Not authorized | - |
| dc.date.accepted | 2024-07-30 | - |
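| dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
| dc.contributor.author-dept | Graduate Institute of Communication Engineering | - |
| Appears in Collections: | Graduate Institute of Communication Engineering | |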
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf (not authorized for public access) | 2.94 MB | Adobe PDF | |
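Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.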
