Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21455

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 鍾嘉德(Char-Dir Chung) | |
| dc.contributor.author | Ju-Hsuan Hsieh | en |
| dc.contributor.author | 謝儒萱 | zh_TW |
| dc.date.accessioned | 2021-06-08T03:34:36Z | - |
| dc.date.copyright | 2019-08-06 | |
| dc.date.issued | 2019 | |
| dc.date.submitted | 2019-08-02 | |
| dc.identifier.citation | [1] NTU information networking group. [Online]. Available: http://ws5.cc.ntu.edu.tw/ntucc/. [2] Michael Finsterbusch, Chris Richter, Eduardo Rocha, Jean-Alexander Muller, and Klaus Hanssgen. A survey of payload-based traffic classification approaches. IEEE Communications Surveys & Tutorials, 16(2):1135–1156, 2014. [3] Andrew W Moore and Denis Zuev. Internet traffic classification using bayesian analysis techniques. In ACM SIGMETRICS Performance Evaluation Review, volume 33, pages 50–60. ACM, 2005. [4] Thuy TT Nguyen and Grenville Armitage. A survey of techniques for internet traffic classification using machine learning. IEEE Communications Surveys & Tutorials, 10(4):56–76, 2008. [5] Arthur C Callado, Carlos Alberto Kamienski, Géza Szabó, Balázs Péter Gero, Judith Kelner, Stenio FL Fernandes, and Djamel Fawzi Hadj Sadok. A survey on internet traffic identification. IEEE Communications Surveys & Tutorials, 11(3):37–52, 2009. [6] Joseph Kampeas, Asaf Cohen, and Omer Gurewitz. Traffic classification based on zero-length packets. IEEE Transactions on Network and Service Management, 2018. [7] tshark manual. [Online]. Available: https://www.wireshark.org/docs/man-pages/tshark.html. [8] sklearn.onehotencoder. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html. [9] scikit-learn. [Online]. Available: http://scikit-learn.org/stable/. [10] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kegl. Algorithms for hyper-parameter optimization. In International Conference on Neural Information Processing Systems (NIPS), 2011. [11] Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data (TKDD), 6(1):3, 2012. [12] sklearn.oneclasssvm. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html. [13] sklearn.gaussianmixture. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html. [14] Leo Breiman. Random forests. Machine Learning, 45(1):5–32, 2001. [15] Trevor Hastie, Saharon Rosset, Ji Zhu, and Hui Zou. Multi-class adaboost. Statistics and its Interface, 2(3):349–360, 2009. [16] Thomas Cover and Peter Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1):21–27, 1967. [17] Alexey Natekin and Alois Knoll. Gradient boosting machines, a tutorial. Frontiers in Neurorobotics, 7:21, 2013. [18] C. Nitesh. Data Mining for Imbalanced Datasets: An Overview. Nitesh V. Chawla, 2005. [19] Leo Breiman. Classification and regression trees. Routledge, 2017. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21455 | - |
| dc.description.abstract | 網路流量的識別對於網路安全以及管理非常重要。這個技術可以幫助網路管理者找出網路中的異常。許多大型網站已經成為網路攻擊的目標,而當他們遭受到攻擊時,對外流出的流量行為可能會產生異常。在先前的文獻當中,網路流量識別使用到載荷檢查(payload inspection)以及基於同一條流量的特徵(flow-based features)辨識。然而,當封包是加密的時候,載荷檢查無法發揮它的作用,並且檢查載荷時會有隱私問題。再者,在訂定這些檢查規則時造成的負擔很重,也需要定期的更新。再基於同一條流量的特徵辨識時,需要蒐集多個封包來計算統計特徵,運算複雜度相當大,而且耗時。因此,我們提出一個只使用封包表頭來辨認封包是否正常的機制。由於只需要一個封包的表頭資訊來做辨認,不僅降低計算特徵時的負擔,也可降低封包辨認所需要的時間。 | zh_TW |
| dc.description.abstract | Network traffic identification plays an important role in modern network security and management. It helps network administrators find anomalies in the network. Some large-scale servers have become targets of security attacks, and the behavior of an attacked server may become abnormal. Previous work identifies traffic through payload inspection or flow-based features. Payload inspection does not work when the payload is encrypted, and it raises privacy concerns because the packet content must be scanned. Constructing the inspection rules also imposes heavy overhead. Flow-based identification must collect multiple packets and compute statistical features over them, which is computationally expensive and time-consuming. We therefore propose a mechanism that uses packet header information to determine whether Internet traffic from large-scale servers is normal. Because detection requires the information of only one packet, both the detection time and the computation overhead are reduced. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T03:34:36Z (GMT). No. of bitstreams: 1 ntu-108-R06942053-1.pdf: 2824845 bytes, checksum: 2eb19ed95ef7f7d30139c68c747deb18 (MD5) Previous issue date: 2019 | en |
| dc.description.tableofcontents | Contents: Acknowledgements; Chinese Abstract; Abstract; Contents; List of Figures; List of Tables; 1 Introduction; 2 Internet Environment and Data Format; 3 Traffic Identification; 3.1 Training Phase; 3.1.1 Preprocessing; 3.1.2 Model Construction; 3.2 Identification Phase; 3.2.1 Preprocessing; 3.2.2 Model Inference; 4 Experiments; 4.1 Performance Metrics; 4.2 Single Server IP; 4.2.1 Experiment Results; 4.3 Multiple Server IP; 4.3.1 Experiment Results; 4.4 Feature Importance; 5 Conclusion; Appendix A Experiment Results (raw data); Appendix B Parameter Setting; Bibliography | |
| dc.language.iso | en | |
| dc.title | 利用封包表頭識別來自於大型網站伺服器的封包流 | zh_TW |
| dc.title | Identifying Traffic of Large Scale Web Server Using Packet Header Information | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 107-2 | |
| dc.description.degree | Master's | |
| dc.contributor.coadvisor | 林風(Phone Lin) | |
| dc.contributor.oralexamcommittee | 林一平(Yi-Bing Lin),廖婉君(Wan-jiun Liao) | |
| dc.subject.keyword | 機器學習, 封包表頭資訊, 網路流量辨識, 台大網路校園流量 | zh_TW |
| dc.subject.keyword | machine learning, packet header information, internet traffic identification, NTU campus traffic | en |
| dc.relation.page | 53 | |
| dc.identifier.doi | 10.6342/NTU201901028 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2019-08-05 | |
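| dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
| dc.contributor.author-dept | 電信工程學研究所 (Graduate Institute of Communication Engineering) | zh_TW |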
| Appears in Collections: | 電信工程學研究所 (Graduate Institute of Communication Engineering) | |

Files in This Item:

| File | Size | Format | |
|---|---|---|---|
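| ntu-108-1.pdf (not authorized for public access) | 2.76 MB | Adobe PDF | |

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.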
