Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/47533
Title: | 使用多目標演化式演算法進行資料探勘中的法則萃取 Multiobjective Evolutionary Algorithm for Rule Extraction in Data Mining |
Author: | Yung-Hsiang Chan 詹詠翔 |
Advisor: | 傅立成 (Li-Chen Fu) |
Co-advisor: | 蔣宗哲 (Tsung-Che Chiang) |
Keywords: | Multiobjective Evolutionary Algorithm, Data Mining, Numeric Association Rule Mining, Classification Rule Mining |
Publication year: | 2010 |
Degree: | Master's |
Abstract: | This thesis addresses rule extraction in data mining, covering both numeric association rule mining and classification rule mining. Both tasks involve several objectives that must be optimized simultaneously and that frequently conflict with one another. Two Pareto-based multiobjective evolutionary algorithms are proposed, one for each task. Both adopt the MOEA/D idea of using uniformly distributed weight vectors to drive mating selection and environmental selection, which maintains a balance between exploration and exploitation. To retain solutions that are distinct but share the same fitness, the solution of each MOEA/D subproblem is extended from a single solution to a set of solutions. For numeric association rule mining, the proposed algorithm follows the common association rule mining framework and obtains rules by finding frequent itemsets. For classification rule mining, a two-phase multiobjective evolutionary algorithm is proposed that combines the Michigan and Pittsburgh approaches: the first phase finds the Pareto-optimal rules, and the second phase combines those rules into Pareto-optimal rule sets. When rules conflict, each rule set chooses its resolution policy according to its own preference. Experiments on synthetic datasets show the correctness and efficiency of the proposed numeric association rule mining algorithm. The algorithm is also applied to several public real-life datasets to provide a basis for future comparison. For classification, experimental results on several public datasets show that the proposed method is competitive with existing rule-based and non-rule-based classifiers. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/47533 |
Full-text license: | Paid license |
Appears in collections: | Department of Electrical Engineering |
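
The abstract describes two mechanisms: uniformly spread weight vectors that define subproblems, and subproblems that keep a set of equally fit solutions instead of one. The following is a minimal sketch of that idea, not the thesis implementation. It assumes two maximized objectives (labeled here as support and confidence), a weighted-sum scalarization, and hypothetical rule names and values; the record does not state which scalarization or objectives the thesis actually uses.

```python
from typing import Dict, List, Set, Tuple

Objectives = Tuple[float, float]  # assumed: (support, confidence), both maximized
Weight = Tuple[float, float]


def uniform_weights(n: int) -> List[Weight]:
    """Return n evenly spaced weight vectors (w, 1 - w) for two objectives."""
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def scalar_fitness(objs: Objectives, weight: Weight) -> float:
    """Weighted-sum scalarization (an assumption of this sketch)."""
    return weight[0] * objs[0] + weight[1] * objs[1]


def best_sets(candidates: Dict[str, Objectives], n_weights: int,
              tol: float = 1e-9) -> List[Set[str]]:
    """For each subproblem, keep every candidate tied for the best fitness."""
    result: List[Set[str]] = []
    for weight in uniform_weights(n_weights):
        kept: Set[str] = set()
        best = float("-inf")
        for name, objs in candidates.items():
            fitness = scalar_fitness(objs, weight)
            if fitness > best + tol:
                # Strictly better: restart the set with this candidate.
                kept, best = {name}, fitness
            elif abs(fitness - best) <= tol:
                # Tie: keep it alongside the current best candidates.
                kept.add(name)
        result.append(kept)
    return result


# Hypothetical rules; r1 and r3 differ as rules but have equal objectives.
rules = {"r1": (0.5, 0.25), "r2": (0.25, 0.5), "r3": (0.5, 0.25)}
sets_per_subproblem = best_sets(rules, n_weights=3)
```

With a single-solution subproblem, either `r1` or `r3` would be discarded even though the two rules are different; keeping a set preserves both.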
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-99-1.pdf (not currently authorized for public access) | 647.03 kB | Adobe PDF |