Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72124
Title: | 機器學習與深度學習演算法於電信業務支援系統不平衡資料之效能評估 Performance Evaluation for Machine Learning and Deep Learning Algorithms on Imbalanced Dataset: Case Study of Business Support System |
Author: | Yen-Cheng Lin 林彥呈 |
Advisor: | 林風 |
Keywords: | anomaly prediction, failure prediction, rare events, imbalanced dataset, machine learning, deep learning, neural network |
Publication year: | 2018 |
Degree: | Master's |
Abstract: | Data collected from real-world systems is often imbalanced, i.e., the classification categories are not equally represented. Existing classification algorithms tend to be biased towards the majority class (often the less interesting class). In this thesis, we use anomaly prediction on a Business Support System (BSS) [1] of telecommunication service providers as a case study to evaluate the performance of machine learning [2, 3, 4, 5] and deep learning [2, 3, 4] algorithms on an imbalanced dataset. Reliability and stability are treated as major requirements for a BSS [6]; in other words, anomalies are rare events in a BSS, and the distribution of its system log data is highly imbalanced. This makes it more challenging for machine learning and deep learning algorithms to perform well. To address the issue, we propose an approach, Frequency-based Feature Creation (FFC), which creates new features that describe the distributions of the one-hot-encoded features. Furthermore, we modify some existing techniques to amplify the effect of the minority class, e.g., Voting with Threshold (VT) and Classification Correction (CC). |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72124 |
DOI: | 10.6342/NTU201801994 |
Full-text license: | Paid license |
Appears in collections: | Department of Computer Science and Information Engineering |
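
The abstract's point that classifiers drift towards the majority class can be illustrated with a small sketch. The data below is synthetic (1% anomalies, a rate chosen only for illustration), and `vote_with_threshold` is a hypothetical helper showing one plausible reading of lowering a vote threshold to favour the minority class; it is not taken from the thesis and should not be read as its definition of Voting with Threshold.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true anomalies (label 1) that were predicted as anomalies."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos if actual_pos else 0.0


def vote_with_threshold(votes, threshold):
    """Hypothetical helper: flag an anomaly when at least `threshold`
    of the 0/1 classifier votes are 1."""
    return 1 if sum(votes) >= threshold else 0


# Synthetic, highly imbalanced labels: 10 anomalies among 1000 samples.
y_true = [1] * 10 + [0] * 990

# A degenerate classifier that always predicts the majority class.
always_normal = [0] * len(y_true)

print("accuracy:", accuracy(y_true, always_normal))
print("recall:  ", recall(y_true, always_normal))

# One sample where only one of three assumed classifiers flags an anomaly.
votes = [1, 0, 0]
print("majority vote (threshold 2):", vote_with_threshold(votes, 2))
print("lowered threshold (1):      ", vote_with_threshold(votes, 1))
```

The always-normal classifier scores high accuracy while detecting no anomalies, which is why accuracy alone is a poor measure on rare-event data. Lowering the vote threshold trades more false alarms for a better chance of catching the minority class.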
Files in this item:
File | Size | Format | |
---|---|---|---|
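ntu-107-1.pdf Not currently authorized for public access | 4.16 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.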