Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21479
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 林宗男 |
dc.contributor.author | Wei-Chieh Tseng | en
dc.contributor.author | 曾煒傑 | zh_TW
dc.date.accessioned | 2021-06-08T03:35:17Z | -
dc.date.copyright | 2019-08-05 |
dc.date.issued | 2019 |
dc.date.submitted | 2019-07-31 |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21479 | -
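dc.description.abstract | The Internet has become a key enabler of large-scale global communication, and providing stable network services every day is essential. As Internet usage keeps growing, effectively managing the underlying networks is critical. Network traffic classification is an important part of this management: it supports quality of service (QoS), the prediction of future trends, and the detection of potential security threats. For these reasons, accurate network traffic classification matters greatly to Internet service providers (ISPs), large enterprises, and government agencies. In recent years, because encrypted network traffic has been increasing, whether for security or to conceal malicious intent, current traffic classification methods have become less effective. Today's networks therefore need more effective classification algorithms. Machine learning is increasingly proposed for classifying encrypted network traffic; although many techniques apply machine learning to traffic classification, most work relies heavily on handcrafted features or can only handle offline traffic classification. To overcome these weaknesses, this thesis proposes Packet2Img, an architecture based on convolutional neural networks (CNNs) and ensemble learning. The architecture converts network traffic into images and can fully capture the static and dynamic behavior of different applications or malicious attacks, thereby avoiding the loss of important information that handcrafted feature extraction can cause. The datasets used in this thesis include the ISCX VPN-nonVPN dataset, the CTU-13 dataset, and malware network traffic collected with the CAPE sandbox. Across all experiments, the results show that, with the best image size of 100*100 and a tuned experimental architecture, Packet2Img meets the accuracy requirements of practical applications and is highly scalable. According to the experimental results, its classification accuracy is about 10% higher than that of traditional methods that use handcrafted features. | zh_TW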
dc.description.abstract | The Internet has become a key enabler of large-scale global communication, and it carries an immeasurable number of services every day. As the use of the Internet continues to grow, it is critical to manage the underlying networks that support it effectively. Network traffic classification plays a vital role in this management: it supports quality of service (QoS), the prediction of future trends, and the detection of potential security threats. For these reasons, accurate network traffic classification is important for Internet service providers (ISPs), large enterprises, and government agencies. Current network traffic classification methods have become less effective in recent years because of the growing share of encrypted network traffic, whether it is encrypted for security, priority, or malicious purposes. Today's networks therefore need more effective classification algorithms. Machine learning is increasingly proposed for classifying encrypted network traffic. While there are many techniques for applying machine learning to IP traffic classification, most works depend heavily on handcrafted features or can only handle offline traffic classification. To overcome these weaknesses, this thesis presents Packet2Img, a traffic classification framework based on convolutional neural networks (CNNs) and ensemble learning. The framework converts network flows into images, fully capturing the static and dynamic behavior of different applications or malicious attacks and avoiding handcrafted features, which can lead to information loss. The method is validated on a dataset comprising the ISCX VPN-nonVPN dataset, the CTU-13 dataset, and malicious flows collected with the CAPE sandbox. Across all of the experiments, with the best image size chosen and the fine-tuned model, the results show that the method satisfies the accuracy requirements of practical applications and is highly scalable. According to the experimental results, its classification accuracy is about 10 percent higher than that of the traditional method based on handcrafted features. | en
dc.description.provenance | Made available in DSpace on 2021-06-08T03:35:17Z (GMT). No. of bitstreams: 1. ntu-108-R06942062-1.pdf: 6090776 bytes, checksum: e4bad513ad22faaeb5cd6b9afe4b4c21 (MD5). Previous issue date: 2019 | en
dc.description.tableofcontents |
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
Abstract
1 Introduction
2 Related Works
  2.1 Joy
    2.1.1 Packet Metadata
    2.1.2 Sequence of Packet Lengths and Times (SPLT)
    2.1.3 Byte Distribution
    2.1.4 TLS Information
  2.2 CICFlowMeter
  2.3 Bro
  2.4 Deep Learning Method on Malware Detection
3 Methodology
  3.1 Malware's Network Behaviors
  3.2 Convolutional Neural Networks
    3.2.1 Convolutional Layer
    3.2.2 Pooling Layer
    3.2.3 Fully Connected Layer
  3.3 Ensemble Learning
    3.3.1 Bagging
    3.3.2 Boosting
    3.3.3 Stacking
  3.4 Packet2Img
    3.4.1 Session/Flow Generation
    3.4.2 Background Traffic Removal
    3.4.3 Image Generation
    3.4.4 10-class Ensemble Classifier
4 Dataset Introduction
  4.1 Botnet
  4.2 Data Exfiltration
  4.3 DoS
  4.4 Exploit Kit
  4.5 Generic
  4.6 Malspam
  4.7 Miner
  4.8 Normal
  4.9 Ransomware
  4.10 Trojan
5 Experimental Results
  5.1 Botnet Detection
    5.1.1 CTU Dataset
    5.1.2 UNB Dataset
  5.2 Data Exfiltration Detection
  5.3 DoS Detection
  5.4 Malicious Benign Binary Classification
  5.5 10-class Classification
  5.6 Visualization of Features in CNN Using t-SNE and PCA
  5.7 Saliency Map Analysis
  5.8 Zero-shot Learning
6 Conclusion
Bibliography
dc.language.iso | en |
dc.title | 基於深度學習之惡意流量偵測 | zh_TW
dc.title | Deep Learning for Malicious Flow Detection | en
dc.type | Thesis |
dc.date.schoolyear | 107-2 |
dc.description.degree | Master's |
dc.contributor.oralexamcommittee | 鄧惟中, 陳俊良, 蔡子傑 |
dc.subject.keyword | Deep Learning, Malicious Flow Detection, Network Traffic Classification, Convolutional Neural Networks, Ensemble Learning, Malware | zh_TW
dc.subject.keyword | Deep Learning, Malicious Flow Detection, IP Traffic Classification, Convolutional Neural Networks, Ensemble Learning, Malware | en
dc.relation.page | 85 |
dc.identifier.doi | 10.6342/NTU201902120 |
dc.rights.note | Not authorized |
dc.date.accepted | 2019-07-31 |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW
dc.contributor.author-dept | Graduate Institute of Communication Engineering | zh_TW
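
The abstract describes Packet2Img as converting network flows into images (100*100 is reported as the best size) and classifying them with CNNs combined by ensemble learning, but this record does not give the exact procedure. The following is only a minimal sketch under stated assumptions: one flow byte per grayscale pixel, truncation of long flows, zero-padding of short ones, and soft voting (averaging class probabilities) as the ensemble rule. The names `flow_to_image` and `soft_vote` are hypothetical and do not come from the thesis.

```python
import numpy as np

IMAGE_SIDE = 100  # the abstract reports 100*100 as the best image size


def flow_to_image(payload: bytes, side: int = IMAGE_SIDE) -> np.ndarray:
    """Map the first side*side bytes of a flow to a grayscale image.

    Assumption (not from the thesis): one byte per pixel, long flows are
    truncated, short flows are zero-padded.
    """
    n_pixels = side * side
    buf = np.frombuffer(payload[:n_pixels], dtype=np.uint8)
    padded = np.zeros(n_pixels, dtype=np.uint8)
    padded[: buf.size] = buf  # copy the available bytes; the rest stay zero
    return padded.reshape(side, side)


def soft_vote(probabilities: list) -> np.ndarray:
    """Average per-model class probabilities and pick the best class per sample.

    Assumption: each array has shape (n_samples, n_classes). Soft voting is
    used here for illustration; the thesis may combine its models differently.
    """
    stacked = np.stack(probabilities)  # (n_models, n_samples, n_classes)
    return stacked.mean(axis=0).argmax(axis=1)
```

In this sketch, each image produced by `flow_to_image` would be fed to several independently trained CNNs, and `soft_vote` would merge their probability outputs into one predicted class per flow.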
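Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File | Size | Format
ntu-108-1.pdf (not authorized for public access) | 5.95 MB | Adobe PDF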

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.