A Novel Rough Set-based Rule Induction for Big Data Analytics (適用於巨量資料分析的約略集合規則歸納法)

Yu-Neng Fan; 范有寧

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15636

Title:	A Novel Rough Set-based Rule Induction for Big Data Analytics (適用於巨量資料分析的約略集合規則歸納法)
Author:	Yu-Neng Fan 范有寧
Advisor:	陳靜枝 (Ching-Chin Chern)
Keywords:	Rough Set Theory, Rule Induction, Incremental Algorithm, Big Data, Data Mining
Year of publication:	2013
Degree:	Doctoral
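Abstract:	Rough set-based rule induction is a scientific method suited to uncertain and incomplete data. By analyzing and reasoning over the data, it uncovers implicit knowledge and reveals latent rules without requiring additional statistical assumptions. This analytical tool has attracted considerable attention in recent years and has been applied widely and successfully in many fields. However, businesses and practitioners now face the impact of big data: when a system is built to process the transaction data generated by operations, data grow and accumulate rapidly within a short time, and both the volume and the rate of growth exceed what existing analytical tools can handle. Moreover, looking at the dimensions of a dataset, we find that it is not only the objects that increase rapidly over a short period; the attribute dimension shows the same trend. In response, this study proposes an incremental rough set-based rule induction method for big data analytics, a model that addresses both the addition of objects and the addition of attributes in a dataset. It exploits the properties of incremental algorithms to update rules efficiently and save a large amount of computation time. Using data from a well-known television shopping channel in Taiwan as a case, the implementation results show that the proposed incremental rule induction method can update rules effectively in a short time in response to new data, and that its efficiency, classification accuracy, and coverage are all better than those of traditional methods. This result indicates that incremental rule induction can serve as a solution for enterprises handling big data analytics, and that the rules it generates can serve as important indicators for enterprise decision support and strategy evaluation.
Rough set-based rule induction generates decision rules from a database and has mechanisms to handle noise and uncertainty in data. These decision rules in turn support managerial decision-making. However, databases that run the day-to-day operations of a business must process data quickly, and large volumes of data are continually updated within a short period of time. The infrastructure required to analyze such large amounts of data must be able to support deeper analysis, deal with extreme data volumes, allow faster response times, and automate decisions based on analytical models. This study proposes a rough set-based rule induction approach that considers both incremental objects and incremental attributes. It addresses the big data issue for rule induction as data are incrementally added to the dataset, and it eliminates the need to re-compute over the entire dataset when the database is updated. As a result, large amounts of computation time and memory space are saved. The proposed model is composed of five main steps: case determination, reduct generation, significance calculation, rule induction, and rule tuning. A case study of a home shopping company is used to show the validity and efficiency of the method. The results show that the proposed model considerably reduces the computing time for inducing decision rules while maintaining the quality of the rules. Because this topic has rarely been addressed in previous studies, it is believed that this study will form the basis for solving many similar problems in big data analytics.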
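To make the abstract's vocabulary concrete, the following is a minimal sketch of textbook rough set notions (indiscernibility classes, lower and upper approximations, certain rules) together with a naive incremental update when one object is added. It is not the thesis's five-step model; the function names, the toy decision table, and the attribute names (`price`, `channel`, `buy`) are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over `attrs`."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return blocks


def approximations(rows, attrs, decision, value):
    """Return (lower, upper) approximations of {rows where decision == value}."""
    target = {i for i, r in enumerate(rows) if r[decision] == value}
    lower, upper = set(), set()
    for members in partition(rows, attrs).values():
        block = set(members)
        if block <= target:      # block lies entirely inside the target set
            lower |= block
        if block & target:       # block overlaps the target set
            upper |= block
    return lower, upper


def certain_rules(blocks, rows, decision):
    """One rule per class whose members all share the same decision value."""
    rules = {}
    for key, members in blocks.items():
        decisions = {rows[i][decision] for i in members}
        if len(decisions) == 1:
            rules[key] = decisions.pop()
    return rules


def add_object(rows, blocks, rules, row, attrs, decision):
    """Append `row` and refresh only the class (and rule) it falls into."""
    rows.append(row)
    key = tuple(row[a] for a in attrs)
    blocks[key].append(len(rows) - 1)
    decisions = {rows[i][decision] for i in blocks[key]}
    if len(decisions) == 1:
        rules[key] = decisions.pop()
    else:
        rules.pop(key, None)     # class became inconsistent: drop its rule


# Hypothetical toy decision table (not data from the thesis).
ATTRS, DECISION = ["price", "channel"], "buy"
table = [
    {"price": "low", "channel": "tv", "buy": "yes"},
    {"price": "low", "channel": "tv", "buy": "yes"},
    {"price": "high", "channel": "tv", "buy": "no"},
    {"price": "high", "channel": "web", "buy": "yes"},
    {"price": "high", "channel": "web", "buy": "no"},
]

lower, upper = approximations(table, ATTRS, DECISION, "yes")
blocks = partition(table, ATTRS)
rules = certain_rules(blocks, table, DECISION)

# A new object arrives; only the ("high", "tv") class is re-examined.
add_object(table, blocks, rules,
           {"price": "high", "channel": "tv", "buy": "yes"}, ATTRS, DECISION)
```

In this sketch, adding an object touches a single indiscernibility class instead of re-partitioning the whole table, which is the general idea behind avoiding a full re-computation. Handling added attributes, reducts, and significance measures, as the abstract describes, would need further machinery that is not shown here.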
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15636
Full-text authorization:	Not authorized
Appears in collections:	Department of Information Management

Files in this item:

File	Size	Format
ntu-102-1.pdf (currently not authorized for public access)	1.18 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
