Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21455
Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: 鍾嘉德 (Char-Dir Chung)
dc.contributor.author: Ju-Hsuan Hsieh (en)
dc.contributor.author: 謝儒萱 (zh_TW)
dc.date.accessioned: 2021-06-08T03:34:36Z
dc.date.copyright: 2019-08-06
dc.date.issued: 2019
dc.date.submitted: 2019-08-02
dc.identifier.citation:
[1] NTU information networking group. [Online]. Available: http://ws5.cc.ntu.edu.tw/ntucc/.
[2] Michael Finsterbusch, Chris Richter, Eduardo Rocha, Jean-Alexander Muller, and Klaus Hanssgen. A survey of payload-based traffic classification approaches. IEEE Communications Surveys & Tutorials, 16(2):1135–1156, 2014.
[3] Andrew W Moore and Denis Zuev. Internet traffic classification using bayesian analysis techniques. In ACM SIGMETRICS Performance Evaluation Review, volume 33, pages 50–60. ACM, 2005.
[4] Thuy TT Nguyen and Grenville Armitage. A survey of techniques for internet traffic classification using machine learning. IEEE Communications Surveys & Tutorials, 10(4):56–76, 2008.
[5] Arthur C Callado, Carlos Alberto Kamienski, Géza Szabó, Balázs Péter Gero, Judith Kelner, Stenio FL Fernandes, and Djamel Fawzi Hadj Sadok. A survey on internet traffic identification. IEEE Communications Surveys & Tutorials, 11(3):37–52, 2009.
[6] Joseph Kampeas, Asaf Cohen, and Omer Gurewitz. Traffic classification based on zero-length packets. IEEE Transactions on Network and Service Management, 2018.
[7] tshark manual. [Online]. Available: https://www.wireshark.org/docs/man-pages/tshark.html.
[8] sklearn.onehotencoder. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html.
[9] scikit-learn. [Online]. Available: http://scikit-learn.org/stable/.
[10] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kegl. Algorithms for hyper-parameter optimization. In International Conference on Neural Information Processing Systems (NIPS), 2011.
[11] Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data (TKDD), 6(1):3, 2012.
[12] sklearn.oneclasssvm. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html.
[13] sklearn.gaussianmixture. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html.
[14] Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.
[15] Trevor Hastie, Saharon Rosset, Ji Zhu, and Hui Zou. Multi-class adaboost. Statistics and its Interface, 2(3):349–360, 2009.
[16] Thomas Cover and Peter Hart. Nearest neighbor pattern classification. IEEE transactions on information theory, 13(1):21–27, 1967.
[17] Alexey Natekin and Alois Knoll. Gradient boosting machines, a tutorial. Frontiers in neurorobotics, 7:21, 2013.
[18] Nitesh V. Chawla. Data mining for imbalanced datasets: An overview. 2005.
[19] Leo Breiman. Classification and regression trees. Routledge, 2017.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21455
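dc.description.abstract: Identifying network traffic is very important for network security and management. The technique helps network administrators find anomalies in the network. Many large websites have become targets of network attacks, and when they are attacked, the behavior of their outbound traffic may become abnormal. In prior work, network traffic identification has relied on payload inspection and on flow-based features. However, payload inspection does not work when packets are encrypted, and inspecting payloads raises privacy concerns. Moreover, defining the inspection rules is a heavy burden, and the rules need regular updates. Flow-based identification, in turn, must collect multiple packets to compute statistical features, which is computationally expensive and time-consuming. We therefore propose a mechanism that uses only the packet header to determine whether a packet is normal. Because identification needs the header information of only one packet, the mechanism reduces both the cost of computing features and the time needed to identify a packet. (zh_TW)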
dc.description.abstract: Network traffic identification plays an important role in modern network security and management architectures. It helps network managers find anomalies in the network. Some large-scale servers have become targets of security attacks, and the behavior of an attacked server may become abnormal. In previous work, network traffic identification has relied on payload inspection and on collecting flow-based features. Payload inspection does not work when the payload is encrypted, and it raises privacy concerns because the content must be scanned. Furthermore, constructing the inspection rules usually imposes heavy overhead. Flow-based identification must collect multiple packets and calculate statistical features, which is computationally expensive and time-consuming. We therefore propose a mechanism that uses packet header information to determine whether Internet traffic from large-scale servers is normal. Because detection requires the information of only one packet, both the detection time and the computation overhead can be reduced. (en)
dc.description.provenance: Made available in DSpace on 2021-06-08T03:34:36Z (GMT). No. of bitstreams: 1
ntu-108-R06942053-1.pdf: 2824845 bytes, checksum: 2eb19ed95ef7f7d30139c68c747deb18 (MD5)
Previous issue date: 2019 (en)
dc.description.tableofcontents:
Contents
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Internet Environment and Data Format
3 Traffic Identification
3.1 Training Phase
3.1.1 Preprocessing
3.1.2 Model Construction
3.2 Identification Phase
3.2.1 Preprocessing
3.2.2 Model Inference
4 Experiments
4.1 Performance Metrics
4.2 Single Server IP
4.2.1 Experiment Results
4.3 Multiple Server IP
4.3.1 Experiment Results
4.4 Feature Importance
5 Conclusion
Appendix A Experiment Results (raw data)
Appendix B Parameter Setting
Bibliography
dc.language.iso: en
dc.title: 利用封包表頭識別來自於大型網站伺服器的封包流 (zh_TW)
dc.title: Identifying Traffic of Large Scale Web Server Using Packet Header Information (en)
dc.type: Thesis
dc.date.schoolyear: 107-2
dc.description.degree: Master's
dc.contributor.coadvisor: 林風 (Phone Lin)
dc.contributor.oralexamcommittee: 林一平 (Yi-Bing Lin), 廖婉君 (Wan-jiun Liao)
dc.subject.keyword: 機器學習, 封包表頭資訊, 網路流量辨識, 台大網路校園流量 (zh_TW)
dc.subject.keyword: machine learning, packet header information, internet traffic identification, NTU campus traffic (en)
dc.relation.page: 53
dc.identifier.doi: 10.6342/NTU201901028
dc.rights.note: Not authorized
dc.date.accepted: 2019-08-05
dc.contributor.author-college: College of Electrical Engineering and Computer Science (zh_TW)
dc.contributor.author-dept: Graduate Institute of Communication Engineering (zh_TW)
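Appears in collections: Graduate Institute of Communication Engineering

Files in this item:
File | Size | Format
ntu-108-1.pdf (not authorized for public access) | 2.76 MB | Adobe PDF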


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
