Automated Generation and Analysis of Malware Family Signatures Based on High-Level API Execution Sequences

Wei-Jhih Chiu; 邱偉志

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70968

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	孫雅麗
dc.contributor.author	Wei-Jhih Chiu	en
dc.contributor.author	邱偉志	zh_TW
dc.date.accessioned	2021-06-17T04:46:15Z	-
dc.date.available	2020-08-08
dc.date.copyright	2018-08-08
dc.date.issued	2018
dc.date.submitted	2018-08-01
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70968	-
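dc.description.abstract	In recent years, the threat posed by malware has increased rapidly, and analyzing and understanding the characteristics of malware helps with its detection and defense. Anti-virus vendors currently assign family labels to samples according to the malware characteristics they observe, and this research analyzes the behavior of each family on the basis of those labels. Because a single malware sample may mix in many obfuscating behaviors, we focus on finding the common behavior of a group of samples in the same family rather than analyzing individual samples. We designed and implemented RasMMA, a hierarchical clustering algorithm based on API call sequences. Given the dynamic profiling results of a group of malware samples, the algorithm clusters the samples according to those profiles and outputs the semantically meaningful common behavior of each cluster; these common behaviors can serve as the family's set of characteristic behaviors. During our research we also found that the behavior of samples within one family can be diverse, so a family may have one or more common behavior clusters, and these clusters may even span families. Besides designing an algorithm to find the signatures of each malware family, this research also tries using these signatures to detect descendant samples of a family, and shows that our method outperforms other traditional data mining methods in classifying malware behavior sequence data.	zh_TW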
dc.description.abstract	In recent years, the threat posed by malware has grown rapidly. Analyzing malware and extracting its signatures benefits both detection and defense. This research collects the malware family labels assigned by anti-virus vendors and analyzes the behavioral intent of each family. We designed RasMMA, a clustering algorithm based on API call sequences that extracts the common signature of a group of malware samples. Given a set of malware profiles, RasMMA clusters the samples and outputs the common behavior of each cluster. This common behavior is semantic, so human experts can analyze the intent behind what the malware does, and it can serve as the signature of a malware family. We also found that a malware family can be behaviorally diverse: the behavior clusters within one family may differ from one another, and some clusters are cross-family clusters whose behavior resembles that of other families. Finally, we apply the behavior clusters to detecting later samples of a family, and find that our method performs better than traditional data mining methods at classifying malware behavior sequence data.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T04:46:15Z (GMT). No. of bitstreams: 1 ntu-107-R05725016-1.pdf: 5833674 bytes, checksum: 9f96c45e353f5d32f5968df11156c9cc (MD5) Previous issue date: 2018	en
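dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; English Abstract; Chapter 1 Introduction: 1.1 Research Motivation, 1.2 Research Objectives, 1.3 Research Contributions; Chapter 2 Background and Related Work: 2.1 Background, 2.2 Related Work; Chapter 3 Algorithm for Extracting Representative Behavior Sequence Signatures of Malware Families: 3.1 Dynamic Execution Behavior Profiling and Analysis System, 3.2 VMI PROFILING 2.0, 3.3 WINNOWING, 3.4 RASMMA, 3.5 BEGC; Chapter 4 Experiments: 4.1 Malware Family Classification, 4.2 Extracting Family Signatures, 4.3 Detecting Descendant Variants of a Family, 4.4 Cross-Family Behavior Cluster Analysis; Chapter 5 Conclusion; References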
dc.language.iso	zh-TW
dc.subject	family	zh_TW
dc.subject	malware	zh_TW
dc.subject	behavior	zh_TW
dc.subject	sequence comparison	zh_TW
dc.subject	clustering	zh_TW
dc.subject	dynamic analysis	zh_TW
dc.subject	common signature	zh_TW
dc.subject	malware family	en
dc.subject	malware	en
dc.subject	sequence comparison	en
dc.subject	clustering	en
dc.subject	dynamic analysis	en
dc.subject	common behavior signature	en
dc.title	Automated Generation and Analysis of Malware Family Signatures Based on High-Level API Execution Sequences	zh_TW
dc.title	Automated Malware Family Signature Generation based on Runtime API Call Sequence	en
dc.type	Thesis
dc.date.schoolyear	106-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	李漢銘,李育杰,陳孟彰,蕭舜文
dc.subject.keyword	malware,family,common signature,dynamic analysis,clustering,sequence comparison,behavior	zh_TW
dc.subject.keyword	malware,common behavior signature,malware family,dynamic analysis,clustering,sequence comparison	en
dc.relation.page	55
dc.identifier.doi	10.6342/NTU201802357
dc.rights.note	Paid license
dc.date.accepted	2018-08-02
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
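
The abstract in this record describes RasMMA only at a high level: cluster API call sequences hierarchically and report the behavior each cluster has in common. As a rough illustration of that general idea, and not the algorithm from the thesis, the Python sketch below merges the two most similar sequences step by step and keeps their longest common subsequence as the cluster's shared behavior. The longest common subsequence is a stand-in for whatever sequence comparison the thesis actually uses, and the function names, the similarity measure, and the `threshold` parameter are all assumptions made for this example.

```python
from itertools import combinations


def lcs(a, b):
    """Return one longest common subsequence of two API-call lists."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover the subsequence itself.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def similarity(a, b):
    """Share of the longer sequence covered by the common subsequence (assumed measure)."""
    longest = max(len(a), len(b))
    return len(lcs(a, b)) / longest if longest else 1.0


def cluster(profiles, threshold=0.5):
    """Greedy agglomerative clustering; each cluster carries its common behavior.

    profiles maps a sample name to its list of API calls. threshold is a
    hypothetical cut-off below which clusters are no longer merged.
    """
    clusters = [({name}, list(seq)) for name, seq in profiles.items()]
    while len(clusters) > 1:
        # Find the pair of clusters whose common behaviors are most similar.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: similarity(clusters[p[0]][1], clusters[p[1]][1]),
        )
        if similarity(clusters[i][1], clusters[j][1]) < threshold:
            break
        merged = (
            clusters[i][0] | clusters[j][0],
            lcs(clusters[i][1], clusters[j][1]),
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    # Made-up profiles for illustration only; not data from the thesis.
    demo = {
        "s1": ["CreateFile", "WriteFile", "RegSetValue", "CloseHandle"],
        "s2": ["CreateFile", "WriteFile", "Sleep", "RegSetValue", "CloseHandle"],
        "s3": ["socket", "connect", "send"],
    }
    for members, behavior in cluster(demo):
        print(sorted(members), behavior)
```

A sample that shares little with the rest stays in its own cluster, which loosely mirrors the abstract's observation that one family can contain more than one behavior cluster.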
Appears in Collections:	Department of Information Management

Files in This Item:

File	Size	Format
ntu-107-1.pdf (not authorized for public access)	5.7 MB	Adobe PDF


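Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.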