NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99637
Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: 廖婉君 (zh_TW)
dc.contributor.advisor: Wan-Jiun Liao (en)
dc.contributor.author: 許馨云 (zh_TW)
dc.contributor.author: Hsin-Yun Hsu (en)
dc.date.accessioned: 2025-09-17T16:13:31Z
dc.date.available: 2025-09-18
dc.date.copyright: 2025-09-17
dc.date.issued: 2025
dc.date.submitted: 2025-08-11
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99637
dc.description.abstract: 在系統日誌異常偵測情境中,近期研究多利用深度學習模型分析大量日誌資料並偵測異常行為。本研究提出一種語意保留的 token 剪枝方法,以減少分割推論(Split Inference)架構中產生的中間通訊開銷。所設計的模型無需標註資料,以剪枝後與未剪枝向量之間的餘弦相似度作為損失函數,據此學習預測最適剪枝比例的預測器,並以 L2 norm 作為 token 的重要性分數。此方法可在不影響原異常偵測模型表現的前提下保留語意資訊,並盡可能縮減 activation 大小。實驗結果顯示,所提方法可在維持準確率的情況下,於多種日誌資料集上大幅降低 activation 傳輸量。 (zh_TW)
dc.description.abstract: In practical system log anomaly detection scenarios, deep learning models are commonly used to analyze large volumes of log data and detect abnormal behavior. This study proposes a semantics-preserving token pruning method to reduce the intermediate communication overhead in a Split Inference architecture. The proposed model requires no labeled data for training: it learns a predictor that determines the optimal pruning ratio from a cosine-similarity loss between pruned and unpruned representations, and uses the L2 norm as the token importance score. The method preserves semantic information and minimizes activation size without degrading the performance of the original anomaly detection model. Experimental results show that it significantly reduces the transmitted activation size across multiple log datasets while maintaining comparable detection accuracy. (An illustrative sketch of the scoring and loss appears after this record.) (en)
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-09-17T16:13:31Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2025-09-17T16:13:31Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Acknowledgements i
摘要 iii
Abstract v
Contents vii
List of Figures ix
List of Tables xi
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Problem Statement 2
1.3 Research Objectives 3
1.4 Proposed Approach 4
1.5 Contributions 4
1.6 Thesis Organization 5
Chapter 2 Related Work 7
2.1 Log Anomaly Detection 7
2.2 Communication Cost Reduction in Split Architecture 8
2.3 Transformer-based Language Models Pruning 9
2.4 Datasets for Log Anomaly Detection 10
2.5 L2 Norm for Importance Scoring 11
2.6 Cosine Similarity as a Semantic Metric 12
Chapter 3 Proposed Solution - SPDP 15
3.1 Overall Architecture 15
3.2 Pruning Procedure 17
3.3 Prune Ratio Predictor Model 17
3.4 Training Phase 18
3.5 Token Importance Scoring via L2 Norm 19
3.6 Loss Function Design 19
3.7 Advantages of the Proposed Architecture 20
Chapter 4 Experiment 21
4.1 Experimental Settings 21
4.1.1 Datasets 21
4.1.1.1 Data Preprocessing Description 21
4.1.2 Environment 22
4.2 Evaluation Framework 22
4.3 Experimental Results 24
4.3.1 Evaluation Metrics and Visualization 24
4.3.2 Results and Analysis 25
4.4 Summary 36
Chapter 5 Conclusion 37
References 41
dc.language.iso: en
dc.subject: 分段推論 (zh_TW)
dc.subject: 日誌異常偵測 (zh_TW)
dc.subject: 通訊成本 (zh_TW)
dc.subject: 詞元剪枝 (zh_TW)
dc.subject: Log anomaly detection (en)
dc.subject: Token pruning (en)
dc.subject: Split Inference (en)
dc.subject: Communication overhead (en)
dc.title: SPDP: 用於減少日誌異常檢測分割推理階段中間通訊開銷的語意保留動態剪枝方法 (zh_TW)
dc.title: SPDP: A Semantics-Preserving Dynamic Pruning Method for Reducing Intermediate Communication Overhead in Split Inference for Log Anomaly Detection (en)
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 周承復;林宗男;吳仁銘 (zh_TW)
dc.contributor.oralexamcommittee: Cheng-Fu Chou;Tsung-Nan Lin;Jenn-Ming Wu (en)
dc.subject.keyword: 日誌異常偵測, 分段推論, 詞元剪枝, 通訊成本 (zh_TW)
dc.subject.keyword: Log anomaly detection, Split Inference, Token pruning, Communication overhead (en)
dc.relation.page: 44
dc.identifier.doi: 10.6342/NTU202501762
dc.rights.note: 未授權 (not authorized for public access)
dc.date.accepted: 2025-08-13
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering)
dc.date.embargo-lift: N/A
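
The abstract above names the method's two core ingredients: an L2-norm score of each token's importance at the split point, and a label-free cosine-similarity loss between pruned and unpruned representations used to train a pruning-ratio predictor. Since the thesis file is not publicly available, the following Python (PyTorch) sketch only illustrates those two ingredients under stated assumptions; the function names, the mean pooling used to compare sequences, and the hard top-k selection are hypothetical choices, not the author's actual implementation.

# Illustrative sketch only (not the thesis's code): L2-norm token scoring,
# top-k pruning of split-point activations, and a label-free cosine loss.
import torch
import torch.nn.functional as F

def l2_token_scores(acts: torch.Tensor) -> torch.Tensor:
    # acts: (batch, seq_len, hidden) activations at the split point.
    # Each token's importance is the L2 norm of its activation vector.
    return acts.norm(p=2, dim=-1)  # (batch, seq_len)

def prune_tokens(acts: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    # Keep the top seq_len * keep_ratio tokens by importance, preserving
    # their original order, and drop the rest.
    batch, seq_len, hidden = acts.shape
    k = max(1, int(seq_len * keep_ratio))
    scores = l2_token_scores(acts)
    keep_idx = scores.topk(k, dim=1).indices.sort(dim=1).values
    keep_idx = keep_idx.unsqueeze(-1).expand(-1, -1, hidden)
    return acts.gather(1, keep_idx)  # (batch, k, hidden)

def semantic_loss(pruned: torch.Tensor, full: torch.Tensor) -> torch.Tensor:
    # Label-free objective: 1 - cosine similarity between mean-pooled
    # pruned and unpruned representations (0 = semantics fully preserved).
    return (1.0 - F.cosine_similarity(pruned.mean(dim=1),
                                      full.mean(dim=1), dim=-1)).mean()

# Usage: score, prune, and measure how much semantics the pruning kept.
acts = torch.randn(2, 128, 768)
pruned = prune_tokens(acts, keep_ratio=0.5)
loss = semantic_loss(pruned, acts)

In a Split Inference deployment along these lines, the edge side would run the model up to the split point, prune, and transmit only the kept activations, so the keep ratio directly sets the fraction of activation volume sent; a predictor trained against semantic_loss could then trade transmission size off against semantic fidelity per input.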
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
File: ntu-113-2.pdf (6.99 MB, Adobe PDF; not authorized for public access)


Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.
