Malware Behavior Analysis with ATT&CK Framework

Chi-Yu Lin; 林啟祐

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72327

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	謝宏昀(Hung-Yun Hsieh)
dc.contributor.author	Chi-Yu Lin	en
dc.contributor.author	林啟祐	zh_TW
dc.date.accessioned	2021-06-17T06:35:39Z	-
dc.date.available	2025-08-27
dc.date.copyright	2020-09-29
dc.date.issued	2020
dc.date.submitted	2020-08-27
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72327	-
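dc.description.abstract	Malware has grown rapidly in recent years. Many anti-malware defenses focus on detecting the malware family, and most of them rely on signature-based techniques. However, signature-based techniques often fail to detect unknown malware, and a malware family label cannot clearly describe the attack behavior of a specific sample. We therefore propose a novel automatic malware behavior analysis model that automatically generates the behavioral techniques (TTPs) defined by ATT&CK for malware analysis on a system. Given a set of malware reports, the proposed method learns the relationship between API calls and behavioral techniques, associating a series of API calls that share these characteristics with a behavioral technique. The method is based on the MITRE ATT&CK framework, which provides a knowledge base of commonly observed adversary tactics and techniques and gives us a clear picture for describing the behavior of a malware sample. In this thesis, we execute malware in the Cuckoo sandbox to obtain its run-time behavior. When execution ends, the Cuckoo sandbox reports the API calls the malware invoked during execution. Because this report is in JSON format, we convert it to MIST format to extract the API calls. Finally, we build RNN and Seq2Seq models, which can preserve sequential information. Our evaluation shows an F1-score of 94.95% when predicting the behavioral techniques of known malware and 99.40% for unknown malware, demonstrating the generality and feasibility of deep learning methods for associating API calls with behavioral techniques.	zh_TW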
dc.description.abstract	In recent years, malicious software (malware) has grown rapidly. Many anti-malware defenses focus on detecting the malware family, and most of them rely on signature-based techniques. However, signature-based techniques often fail to detect unknown malware, and a malware family label cannot clearly describe the attack behaviors of a specific malware sample. We therefore propose a novel Automatic Malware Behavior Analysis Model that automatically generates the techniques defined by ATT&CK on a malware analysis system. Given a set of malware reports, the proposed method learns the relationship between API calls and Tactics, Techniques and Procedures (TTPs), linking a series of API calls with those characteristics to a TTP. The method is based on the MITRE ATT&CK framework, which provides a knowledge base of commonly observed adversary tactics and techniques and gives us a clear picture for describing the behavior of a malware sample. In this thesis, the malware is executed in the Cuckoo sandbox to obtain its run-time behavior. At the end of the execution, the Cuckoo sandbox reports the API calls invoked by the malware. Because this report is in JSON format, it is converted to MIST format to extract the API calls. Finally, we construct RNN and Seq2Seq models, which can preserve sequential information. Our evaluation shows an F1-score of 94.95% when predicting TTPs for known malware and 99.40% for unknown malware, which demonstrates the generality and feasibility of deep learning methods for associating API calls with TTPs.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T06:35:39Z (GMT). No. of bitstreams: 1 U0001-2408202015325900.pdf: 4750583 bytes, checksum: 623405c46405c490532e948c6140a359 (MD5) Previous issue date: 2020	en
dc.description.tableofcontents	ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
CHAPTER 2 BACKGROUND
  2.1 Malware Detection
    2.1.1 Signature Based Detection
    2.1.2 Behavior Based Detection
  2.2 MITRE ATT&CK
  2.3 The Cyber Kill Chain
CHAPTER 3 RELATED WORK
  3.1 Malware Analysis
  3.2 MITRE Application
  3.3 Malware Family Classification
  3.4 Malware Behavior Clustering
CHAPTER 4 SYSTEM DESIGN
  4.1 Monitoring of Malware Behavior
    4.1.1 Malware Sandboxes
    4.1.2 Cuckoo Sandbox Report
  4.2 Data Preprocessing
  4.3 The Malware Instruction Set
  4.4 Embedding of Malware Behavior
  4.5 Automatic Analysis of Malware Behavior
    4.5.1 Deep Learning Model
    4.5.2 TTP Prediction
    4.5.3 Relationship between API Calls and TTPs
CHAPTER 5 PROPOSED ALGORITHM
  5.1 Recurrent Neural Networks
  5.2 Gated Recurrent Neural Networks
    5.2.1 Long Short-Term Memory Unit
    5.2.2 Gated Recurrent Unit
  5.3 Sequence to Sequence
  5.4 Attention
CHAPTER 6 EXPERIMENTS AND RESULT
  6.1 Data Set
  6.2 System Setup and Evaluation
  6.3 Arguments Importance Experiment
  6.4 Family Characteristics Influence Experiment
    6.4.1 Threat nature comparison
    6.4.2 Family comparison in same threat
    6.4.3 Family quantities comparison
  6.5 Model Visualization
CHAPTER 7 CONCLUSION AND FUTURE WORK
REFERENCES
dc.language.iso	en
dc.subject	sandbox	zh_TW
dc.subject	dynamic analysis	zh_TW
dc.subject	MITRE	zh_TW
dc.subject	cuckoo sandbox	en
dc.subject	dynamic analysis	en
dc.subject	MITRE	en
dc.title	Malware Behavior Analysis Using the ATT&CK Framework	zh_TW
dc.title	Malware Behavior Analysis with ATT&CK Framework	en
dc.type	Thesis
dc.date.schoolyear	109-2
dc.description.degree	Master's
dc.contributor.coadvisor	陳孟彰(Meng-Chang Chen)
dc.contributor.oralexamcommittee	黃意婷(Yi-Ting Huang),李育杰(Yuh-Jye Li)
dc.subject.keyword	MITRE, dynamic analysis, sandbox	zh_TW
dc.subject.keyword	MITRE, dynamic analysis, cuckoo sandbox	en
dc.relation.page	61
dc.identifier.doi	10.6342/NTU202004162
dc.rights.note	Paid license
dc.date.accepted	2020-08-28
dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Data Science Degree Program	zh_TW
Appears in Collections:	Data Science Degree Program
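
The abstract in the record above describes running malware in the Cuckoo sandbox and extracting the invoked API calls from its JSON report. The following is a minimal sketch of that extraction step only, not the thesis's own code. It assumes the report stores calls under `behavior` → `processes` → `calls`, each with an `api` field; the function name `extract_api_calls` and the sample report are hypothetical, and the MIST conversion mentioned in the abstract is not shown.

```python
import json


def extract_api_calls(report):
    """Return one API-name sequence per monitored process, in call order.

    Assumes a Cuckoo-style layout: report["behavior"]["processes"] is a list
    of processes, each holding a "calls" list whose items have an "api" key.
    """
    sequences = []
    for process in report.get("behavior", {}).get("processes", []):
        calls = [call["api"] for call in process.get("calls", []) if "api" in call]
        if calls:  # skip processes that recorded no API calls
            sequences.append(calls)
    return sequences


# Hypothetical, heavily trimmed report used only for illustration.
sample_report = json.loads("""
{"behavior": {"processes": [
  {"pid": 1234, "calls": [
    {"category": "file", "api": "NtCreateFile"},
    {"category": "file", "api": "NtWriteFile"},
    {"category": "registry", "api": "RegSetValueExA"}
  ]},
  {"pid": 5678, "calls": []}
]}}
""")

sequences = extract_api_calls(sample_report)
print(sequences)
```

Keeping one ordered sequence per process preserves the call order that sequence models such as the RNN and Seq2Seq models named in the abstract rely on.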

Files in This Item:

File	Size	Format
U0001-2408202015325900.pdf (not authorized for public access)	4.64 MB	Adobe PDF

Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.