Handling Class Imbalance of Cross-Network Datasets for Intrusion Detection in IoT Networks

劉育廷; Yu-Ting Liu

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93864

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	謝宏昀	zh_TW
dc.contributor.advisor	Hung-Yun Hsieh	en
dc.contributor.author	劉育廷	zh_TW
dc.contributor.author	Yu-Ting Liu	en
dc.date.accessioned	2024-08-08T16:38:08Z	-
dc.date.available	2024-08-09	-
dc.date.copyright	2024-08-08	-
dc.date.issued	2024	-
dc.date.submitted	2024-08-05	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93864	-
dc.description.abstract	隨著物聯網（IoT）中的網路攻擊增加，入侵偵測系統（IDS）的穩健性變得特別重要。然而，大多數現有的研究並未考慮內部和外部網路環境之間的行為差異，且現有的入侵偵測相關研究也很少有對偵察型攻擊進行探討，但偵察型攻擊卻是網路攻擊生命週期中的關鍵階段。相較之下，一般研究通常更專注於較為常見的攻擊，如阻斷服務（DoS）和分散式阻斷服務（DDoS）攻擊。一個完善的IDS應能夠適應不同的環境，最大限度的減少攻擊的影響，甚至防止攻擊的發生。基於上述內容，我們結合了外部網路資料集UNSW-NB15和內部網路資料集Bot-IoT，創建了一個能夠確切反映現實情境的物聯網網路資料集，並緩解了入侵偵測資料集中很常見的資料不平衡問題，因為在這些資料集的收集期間，攻擊行為發生的頻率通常很低，從而導致類別不平衡的問題。為了進一步解決偵察型攻擊數據缺乏的問題，我們提出了KLB-SMOTE，一種結合異常值移除和過採樣技術的資料採樣方法。我們對少數類別樣本進行分類，對雜訊群組應用基於距離和密度的異常值檢測，並專注於在邊界上進行少數類別樣本的資料合成。經KLB-SMOTE產生的合成樣本更準確的反映了資料分佈及實現了類別平衡，同時也提高了模型的有效性。最後，我們利用深度學習技術來對DoS、DDoS和偵察型攻擊進行多元分類。這一方法使偵察型攻擊的準確率提高了約45%，每個類別的準確率均超過95.9%，整體準確率達到97.6%。	zh_TW
dc.description.abstract	The rise in IoT cyber-attacks highlights the need for robust intrusion detection systems (IDS). However, most existing research does not consider the differences in network behavior between internal and external environments. Existing intrusion detection work also rarely experiments on reconnaissance attacks, a crucial phase in the cyber-attack lifecycle, and instead focuses on more common attacks such as Denial of Service (DoS) and Distributed DoS (DDoS). A robust IDS should be capable of adapting to different environments, minimizing the impact of attacks, and even preventing them from occurring. To address these gaps, we combined the external network dataset UNSW-NB15 and the internal network dataset Bot-IoT to create an IoT network dataset that closely reflects real-world scenarios and mitigates the data imbalance that is common in intrusion detection datasets, where attacks typically occur at low frequency during collection. To further address the lack of reconnaissance attack data, we propose KLB-SMOTE, a data sampling method that integrates outlier removal and oversampling. We categorize minority class samples, apply distance-based and density-based outlier detection to the noise group, and focus on synthesizing data at the boundaries of the minority class. KLB-SMOTE generates synthetic samples that more accurately reflect the data distribution, achieving class balance while enhancing model effectiveness. Finally, leveraging deep learning techniques, we performed multi-class classification of DoS, DDoS, and reconnaissance attacks. This approach improved reconnaissance prediction accuracy by approximately 45%, with each class achieving a prediction accuracy of over 95.9% and an overall accuracy of 97.6%.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-08T16:38:08Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2024-08-08T16:38:08Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); ABSTRACT; LIST OF TABLES; LIST OF FIGURES; CHAPTER 1 INTRODUCTION; CHAPTER 2 BACKGROUND AND RELATED WORK; 2.1 Backgrounds; 2.2 Cyber Security Frameworks; 2.2.1 The MITRE ATT&CK Framework; 2.2.2 The Cyber Kill Chain; 2.2.3 Zero Trust Architecture (ZTA); 2.3 Data Sampling; 2.3.1 Undersampling Techniques; 2.3.2 Oversampling Techniques; 2.3.3 Hybrid sampling Techniques; 2.4 Outlier Detection; 2.4.1 K-Means Clustering; 2.4.2 Local Outlier Factor (LOF); 2.5 Feature Preprocessing; 2.5.1 Feature Extraction; 2.5.2 Feature Selection; CHAPTER 3 INTRUSION DETECTION SYSTEM; 3.1 System Architecture; 3.2 Data Preprocessing; 3.2.1 Handling Imbalanced data; 3.2.2 Data Normalization; 3.3 Feature Selection; 3.3.1 Information Gain (IG); 3.3.2 Pearson Correlation Coefficient (PCC); 3.3.3 Light Gradient Boosting Machine (LightGBM); 3.3.4 Recursive Feature Elimination (RFE); 3.3.5 Hybrid Feature Selection; 3.4 Long Short-Term Memory (LSTM); CHAPTER 4 PROPOSED METHOD; 4.1 DataSets; 4.1.1 UNSW-NB15; 4.1.2 Bot-IoT; 4.2 Merging Datasets; 4.2.1 Data Encoding; 4.2.2 Data Cleaning and Transformation; 4.2.3 Datasets Comparison; 4.2.4 Feature Extraction; 4.3 KLB-SMOTE; 4.3.1 Outlier Removal; 4.3.2 Synthetic Data; CHAPTER 5 PERFORMANCE EVALUATION; 5.1 Experiment Setup; 5.2 Metrics; 5.2.1 Accuracy; 5.2.2 Confusion Matrix; 5.2.3 F1 Score; 5.3 Intrusion Detection Experiment; 5.3.1 Custom Dataset Result; 5.3.2 Undersampling Result; 5.3.3 Oversampling Result; 5.3.4 Hybrid sampling Result; 5.3.5 KLB-SMOTE Result; 5.4 Summary; CHAPTER 6 CONCLUSION AND FUTURE WORK; 6.1 Conclusion; 6.2 Future Work; REFERENCES	-
dc.language.iso	en	-
dc.subject	物聯網	zh_TW
dc.subject	入侵偵測	zh_TW
dc.subject	不平衡資料集	zh_TW
dc.subject	資料採樣	zh_TW
dc.subject	網路安全	zh_TW
dc.subject	Imbalanced Dataset	en
dc.subject	IoT	en
dc.subject	Cyber Security	en
dc.subject	Data Sampling	en
dc.subject	Intrusion Detection	en
dc.title	物聯網入侵偵測研究中跨網路資料集之類別不平衡處理	zh_TW
dc.title	Handling Class Imbalance of Cross-Network Datasets for Intrusion Detection in IoT Networks	en
dc.type	Thesis	-
dc.date.schoolyear	112-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	高榮鴻;沈上翔	zh_TW
dc.contributor.oralexamcommittee	Rung-Hung Gau;Shan-Hsiang Shen	en
dc.subject.keyword	物聯網, 入侵偵測, 不平衡資料集, 資料採樣, 網路安全	zh_TW
dc.subject.keyword	IoT, Intrusion Detection, Imbalanced Dataset, Data Sampling, Cyber Security	en
dc.relation.page	67	-
dc.identifier.doi	10.6342/NTU202402789	-
dc.rights.note	Not authorized	-
dc.date.accepted	2024-08-07	-
dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Graduate Institute of Communication Engineering	-
Appears in Collections:	Graduate Institute of Communication Engineering
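
The sampling idea summarized in the abstract (categorize minority samples, treat the noise group separately, synthesize near the class boundary) can be illustrated with a minimal sketch. It is not the thesis's KLB-SMOTE implementation. It assumes the safe/danger/noise rule of Borderline-SMOTE, and it simply discards the whole noise group, whereas the abstract applies distance-based and density-based outlier detection to that group. The function names `nearest`, `categorize`, and `oversample`, the parameter `k`, and the default values are all hypothetical.

```python
import math
import random


def nearest(point, items, k):
    """Return the k (coords, label) items closest to point (Euclidean)."""
    return sorted(items, key=lambda item: math.dist(point, item[0]))[:k]


def categorize(minority, majority, k=5):
    """Split minority samples into safe / danger / noise groups.

    Assumed rule (Borderline-SMOTE style): count majority samples among
    the k nearest neighbours; all k -> noise, at least half -> danger.
    """
    labeled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    groups = {"safe": [], "danger": [], "noise": []}
    for p in minority:
        others = [item for item in labeled if item[0] is not p]
        n_major = sum(1 for _, label in nearest(p, others, k) if label == 0)
        if n_major == k:
            groups["noise"].append(p)
        elif n_major >= k / 2:
            groups["danger"].append(p)
        else:
            groups["safe"].append(p)
    return groups


def oversample(minority, majority, n_new, k=5, seed=0):
    """Return (kept, synthetic): cleaned minority set and n_new new points."""
    rng = random.Random(seed)
    groups = categorize(minority, majority, k)
    # Simplification (assumption): drop the entire noise group instead of
    # running distance- and density-based outlier detection on it.
    kept = groups["safe"] + groups["danger"]
    if len(kept) < 2:
        raise ValueError("need at least two non-noise minority samples")
    # Synthesize only from borderline samples; fall back to all kept ones.
    seeds = groups["danger"] or kept
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(seeds)
        mates = nearest(base, [(q, 1) for q in kept if q is not base], k)
        mate = rng.choice(mates)[0]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return kept, synthetic
```

Each synthetic point is a convex combination of two retained minority samples, so it stays inside the region those samples already occupy. Seeding only from the danger group concentrates new samples near the class boundary, which is where the abstract says synthesis is focused.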

Files in This Item:

File	Size	Format
ntu-112-2.pdf (not authorized for public access)	7.94 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.