Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52310
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 洪士灝
dc.contributor.author | Wan-Ting Yeh | en
dc.contributor.author | 葉婉婷 | zh_TW
dc.date.accessioned | 2021-06-15T16:11:35Z
dc.date.available | 2017-08-31
dc.date.copyright | 2015-08-31
dc.date.issued | 2015
dc.date.submitted | 2015-08-18
dc.identifier.citation | Bibliography
[1] “Accelerate Machine Learning with the cuDNN Deep Neural Network Library,” https://public.gdatasoftware.com/Presse/Publikationen/Malware_Reports/G_DATA_MobileMWR_Q1_2015_US.pdf.
[2] A. P. Felt, M. Finifter, E. Chin, S. Hanna, and D. Wagner, “A survey of mobile malware in the wild,” in Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices. ACM, 2011, pp. 3–14.
[3] “Kantar worldpanel comtech’s smartphone os market share data q4 2013,” http://www.kantarworldpanel.com/smartphone-os-market-share/.
[4] “Trendlabs 3Q 2012 SECURITY ROUNDUP Android Under Siege: Popularity Comes at a Price,” http://www.trendmicro.com/cloud-content/us/pdfs/security-intelligence/reports/rpt-3q-2012-security-roundup-android-under-siege-popularity-comes-at-a-price.pdf.
[5] M. Grace, Y. Zhou, Q. Zhang, S. Zou, and X. Jiang, “Riskranker: scalable and accurate zero-day android malware detection,” in Proceedings of the 10th international conference on Mobile systems, applications, and services. ACM, 2012, pp. 281–294.
[6] D.-J. Wu, C.-H. Mao, T.-E. Wei, H.-M. Lee, and K.-P. Wu, “Droidmat: Android malware detection through manifest and api calls tracing,” in Information Security (Asia JCIS), 2012 Seventh Asia Joint Conference on. IEEE, 2012, pp. 62–69.
[7] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C. Siemens, “Drebin: Effective and explainable detection of android malware in your pocket,” in Proceedings of the Annual Symposium on Network and Distributed System Security (NDSS), 2014.
[8] B. Amos, H. Turner, and J. White, “Applying machine learning classifiers to dynamic android malware detection at scale,” in Wireless Communications and Mobile Computing Conference (IWCMC), 2013 9th International. IEEE, 2013, pp. 1666–1671.
[9] W.-C. Wu and S.-H. Hung, “Droiddolphin: a dynamic android malware detection framework using big data and machine learning,” in Proceedings of the 2014 Conference on Research in Adaptive and Convergent Systems. ACM, 2014, pp. 247–252.
[10] G. E. Dahl, J. W. Stokes, L. Deng, and D. Yu, “Large-scale malware classification using random projections and neural networks,” in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 3422–3426.
[11] W. Yu, L. Ge, G. Xu, and X. Fu, “Towards neural network based malware detection on android mobile devices,” in Cybersecurity Systems for Human Cognition Augmentation. Springer, 2014, pp. 99–117.
[12] “DroidBox,” https://code.google.com/p/droidbox/.
[13] “TaintDroid: Realtime Privacy Monitoring on Smartphones,” http://appanalysis.org.
[14] P. O’Kane, S. Sezer, and K. McLaughlin, “N-gram density based malware detection,” in Computer Applications & Research (WSCAR), 2014 World Symposium on. IEEE, 2014, pp. 1–6.
[15] T. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman, “Project adam: Building an efficient and scalable deep learning training system.”
[16] Nvidia, “Accelerate Machine Learning with the cuDNN Deep Neural Network Library,” http://devblogs.nvidia.com/parallelforall/accelerate-machine-learning-cudnn-deep-neural-network-library/.
[17] D. Cireşan, U. Meier, J. Masci, and J. Schmidhuber, “Multi-column deep neural network for traffic sign classification,” Neural Networks, vol. 32, pp. 333–338, 2012.
[18] T. N. Sainath, A.-r. Mohamed, B. Kingsbury, and B. Ramabhadran, “Deep convolutional neural networks for lvcsr,” in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 8614–8618.
[19] S.-J. Chang and S.-H. Hung, “Ape: A Smart Automatic Testing Environment for Android Malware,” http://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0001-1908201316171100.
[20] “Caffe | Deep Learning Framework,” http://caffe.berkeleyvision.org.
[21] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
[22] “APIMonitor,” https://code.google.com/p/droidbox/wiki/APIMonitor.
[23] “Android Logcat,” http://developer.android.com/tools/help/logcat.html.
[24] “Android Monkeyrunner,” http://developer.android.com/tools/help/monkey.html.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52310
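dc.description.abstract | Mobile applications are increasingly widespread and have enriched our lives and made them more convenient. As the number of applications grows, however, so does the number of malicious ones. According to statistics from G DATA, more than half of malicious applications target financial applications, and installing them causes users to lose money. Machine learning has been widely applied to malware detection. Most papers use trigrams to extract application behavior as the input to machine learning, but only a few achieve good accuracy on very large datasets.
This thesis examines the shortcomings of trigrams and proposes a new input format that, paired with a convolutional neural network (CNN) as the learning algorithm, addresses the low-accuracy problem. To our knowledge, this is the first thesis to apply CNNs to malware detection and to discuss the whole network architecture in full. With the input format we designed, the CNN achieves an effect similar to k-skip-n-gram and learns more complex behaviors for detecting malware.
According to our experimental results, the new input format achieves good and stable accuracy across CNN architectures with different configurations. With 32,000 applications, the proposed architecture achieves 93.012% prediction accuracy and a 12.9% false negative rate. By integrating CNN and SVM, two machine learning algorithms with different characteristics, we effectively reduce the false negative rate to 3%. Finally, we combine the architecture with NVIDIA's low-power Jetson-TK1 development board for malware training and prediction. Although training time increases, the overall power consumption is effectively reduced.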
zh_TW
dc.description.abstract | Mobile applications are increasingly popular, and a wide variety of them make our lives more convenient. Unfortunately, as the number of applications grows, so does the number of malicious applications, also known as malware. According to 2015 statistics from G DATA, more than half of malware is financially motivated and causes substantial monetary loss. Machine learning techniques have been adopted to identify malware, and most prior work uses trigrams as the input format to extract the behavior patterns of mobile applications, but only a few of these approaches perform well on a large dataset.
In this thesis, we discuss the weaknesses of trigram-based machine learning methods and improve the accuracy of malware detection by adding a new machine learning method based on a convolutional neural network (CNN) with a novel flattened input format. To our knowledge, this is the first work to discuss the usability of CNNs for malware detection.
With the proposed flattened input format, our CNN scheme can perform a k-skip-n-gram dimensionality reduction, which learns more flexible and complex patterns than the trigram-based methods, allowing it to detect different types of malware.
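To make the contrast between trigrams and k-skip-n-grams concrete, the following is a minimal sketch of the general k-skip-n-gram idea only. It is not the thesis's flattened input format or its CNN. The function name `skip_grams` and the event names in `events` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def skip_grams(tokens, n, k):
    """Return k-skip-n-grams: ordered n-token subsequences that skip
    at most k tokens in total. With k = 0 this is the plain n-gram."""
    grams = []
    for start in range(len(tokens) - n + 1):
        # The first token is fixed at `start`; the remaining n - 1 tokens
        # are picked from the next n + k - 1 positions, so the whole gram
        # spans at most n + k consecutive positions.
        window_end = min(start + n + k, len(tokens))
        for rest in combinations(range(start + 1, window_end), n - 1):
            grams.append((tokens[start],) + tuple(tokens[i] for i in rest))
    return grams


# Hypothetical behavior log, for illustration only.
events = ["open", "read", "send", "close"]
trigrams = skip_grams(events, n=3, k=0)
one_skip_trigrams = skip_grams(events, n=3, k=1)
```

In this toy log, a plain trigram never pairs `open` with `close`, while the 1-skip variant also yields `("open", "send", "close")`, which is the kind of non-contiguous pattern the abstract alludes to.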
Our experimental results show that the flattened input format yields good and stable accuracy with a simple CNN topology under different configurations. With 32,000 applications in our training set, the CNN achieves 93.01% prediction accuracy and a 12.9% false negative rate (FNR). After examining the CNN results, we reduce the FNR to 3% by aggregating the CNN with an SVM while retaining similar accuracy. We also demonstrate that running the CNN on an NVIDIA Jetson-TK1 saves half of the power consumption compared with modern graphics cards, which opens a new application scenario: low-cost, pervasive malware detection, even on mobile platforms.
en
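The accuracy and FNR figures in the abstract can be read against the usual definitions, sketched below. The record does not state how the thesis aggregates the CNN and the SVM, so `aggregate_or` is a hypothetical rule (flag an application if either classifier flags it), assumed here only to show how aggregation can lower FNR. All labels and predictions are made-up toy data, not results from the thesis.

```python
def accuracy_and_fnr(y_true, y_pred):
    """Labels are 1 (malware) or 0 (benign).
    accuracy = correct / total; FNR = FN / (FN + TP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, fnr


def aggregate_or(pred_a, pred_b):
    """Hypothetical aggregation rule (an assumption, not the thesis's):
    predict malware when either classifier predicts malware."""
    return [1 if a == 1 or b == 1 else 0 for a, b in zip(pred_a, pred_b)]


# Toy data, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
cnn_pred = [1, 1, 0, 0, 0, 0, 0, 0]
svm_pred = [1, 0, 1, 0, 0, 0, 1, 0]

cnn_acc, cnn_fnr = accuracy_and_fnr(y_true, cnn_pred)
agg_acc, agg_fnr = accuracy_and_fnr(y_true, aggregate_or(cnn_pred, svm_pred))
```

On this toy data the OR rule catches malware that one classifier misses, so FNR falls, at the cost of inheriting the other classifier's false positives.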
dc.description.provenance | Made available in DSpace on 2021-06-15T16:11:35Z (GMT). No. of bitstreams: 1
ntu-104-R02922118-1.pdf: 1100222 bytes, checksum: 6e0c89ac67cbb3efebecbd1db3cda6bd (MD5)
Previous issue date: 2015
en
dc.description.tableofcontents | Acknowledgements
Chinese Abstract
Abstract
1 Introduction
1.1 Motivation
1.2 Thesis Organization
2 Related Works
2.1 Static and Dynamic Analysis
2.2 TaintDroid
2.3 Malware Detection
2.4 The Opportunity for CNN
3 Design of CNN
3.1 Flattened Data
3.2 Architecture of CNN
3.2.1 Input Order: Opportunity or Challenge
3.2.2 Kernel Size: Beyond Square
3.2.3 Inner Product Layer
3.3 Re-train
4 Evaluation Result
4.1 Experimental Setup
4.2 Dataset
4.3 Metrics
4.4 Experimental Result
4.4.1 Effect of Unbalanced Dataset
4.4.2 Quantity of Data: Balanced
4.5 Comparison
4.6 Aggregation
4.7 Prediction on Jetson TK1
5 Background Knowledge
5.1 Malware Analysis
5.1.1 Static Analysis
5.1.2 Dynamic Analysis
5.2 Preprocessing
5.3 Emulation
5.4 Feature Extraction
6 Conclusion and Future Work
Bibliography
List of Figures
Figure 2.1 Convolution Operator
Figure 2.2 Pooling Operator
Figure 3.1 Architecture
Figure 3.2 CNN Architecture
Figure 3.3 An Example of Input Feature for CNN
Figure 3.4 Three Kinds of Flattened Data
Figure 3.5 Comparison of Sensitivity
Figure 3.6 Adding a Hidden Layer
Figure 4.1 Benign/Malicious Ratio
Figure 4.2 Quantity of Data: Balanced
Figure 4.3 Different Threshold of Confidence
Figure 5.1 The Application Behavior Log
List of Tables
Table 3.1 Convolution Kernel Size and Ordering
Table 3.2 Re-train Time with Accuracy
Table 4.1 The Result of DroidDolphin
Table 4.2 Comparison Performance of DroidDolphin
Table 4.3 Aggregation (1)
Table 4.4 Aggregation (2)
Table 4.5 Comparison Performance with TK1 and PC
Table 4.6 Different Aggregation with SVM and CNN
Table 5.1 Sensitive API
Table 5.2 Tags of DroidBox
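dc.language.iso | en
dc.subject | 平坦化輸入 (Flattened Input) | zh_TW
dc.subject | 深度類神經網路 (Deep Neural Network) | zh_TW
dc.subject | 卷積類神經網路 (Convolutional Neural Network) | zh_TW
dc.subject | 惡意程式偵測 (Malware Detection) | zh_TW
dc.subject | SVM | zh_TW
dc.subject | 整合 (Aggregation) | zh_TW
dc.subject | Aggregation | en
dc.subject | Flattened Events | en
dc.subject | Deep Neural Network | en
dc.subject | Convolutional Neural Network | en
dc.subject | Malware Detection | en
dc.subject | SVM | en
dc.title | 利用機器學習之Android惡意程式偵測架構 (An Android Malware Detection Framework Using Machine Learning) | zh_TW
dc.title | Effectively Detecting Android Malware with Machine Learning | en
dc.type | Thesis
dc.date.schoolyear | 103-2
dc.description.degree | 碩士 (Master's)
dc.contributor.oralexamcommittee | 林軒田, 徐宏民, 涂嘉恆, 林風
dc.subject.keyword | 深度類神經網路 (Deep Neural Network), 卷積類神經網路 (Convolutional Neural Network), 惡意程式偵測 (Malware Detection), SVM, 整合 (Aggregation), 平坦化輸入 (Flattened Input) | zh_TW
dc.subject.keyword | Deep Neural Network, Convolutional Neural Network, Malware Detection, SVM, Aggregation, Flattened Events | en
dc.relation.page | 39
dc.rights.note | 有償授權 (Paid license)
dc.date.accepted | 2015-08-18
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) | zh_TW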
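Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File | Size | Format
ntu-104-1.pdf (not authorized for public access) | 1.07 MB | Adobe PDF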