NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23021
Full metadata record
DC Field  Value  Language
dc.contributor.advisor  陳正剛
dc.contributor.author  Wei-Ting Yang  en
dc.contributor.author  楊惟婷  zh_TW
dc.date.accessioned  2021-06-08T04:38:08Z
dc.date.copyright  2009-08-21
dc.date.issued  2009
dc.date.submitted  2009-08-15
dc.identifier.citation
[1] Alpaydin, E., "Combined 5×2cv F test for comparing supervised classification learning algorithms," Neural Computation, vol. 11, pp. 1975–1982, 1999.
[2] Bartlett, M. S., "Multivariate analysis," Journal of the Royal Statistical Society (Supplement), 2, pp. 176–197, 1947.
[3] Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J., Classification and Regression Trees. Belmont, CA: Wadsworth, 1984.
[4] Breiman, L., "Technical note: some properties of splitting criteria," Machine Learning, vol. 24, no. 1, pp. 41–47, July 1996.
[5] Brodley, C. E. and Utgoff, P. E., "Multivariate decision trees," Machine Learning, vol. 19, pp. 45–77, 1995.
[6] Fisher, R. A., "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, pp. 178–188, 1936.
[7] Gama, J., "Oblique linear tree," in Advances in Intelligent Data Analysis: Reasoning about Data, X. Liu, P. Cohen, and M. Berthold, Eds., Springer-Verlag LNCS, 1997.
[8] Jobson, J. D., Applied Multivariate Data Analysis: Categorical and Multivariate Methods, 1992.
[9] Loh, W. Y., "Tree-structured classification via generalized discriminant analysis," Journal of the American Statistical Association, vol. 83, pp. 715–728.
[10] Mahalanobis, P. C., "On the generalised distance in statistics," Proceedings of the National Institute of Science of India, vol. 12, pp. 49–55, 1936.
[11] Murthy, S. K., Kasif, S., and Salzberg, S., "A system for induction of oblique decision trees," Journal of Artificial Intelligence Research, vol. 2, pp. 1–32, 1994.
[12] Taylor, P. C. and Silverman, B. W., "Block diagrams and splitting criteria for classification trees," Statistics and Computing, vol. 3, pp. 147–161, 1993.
[13] Wilkinson, L. and Dallal, G. E., "Tests of significance in forward selection regression with an F-to-enter stopping rule," Technometrics, vol. 23, pp. 377–380, 1981.
[14] Yildiz, O. T. and Alpaydin, E., "Omnivariate decision trees," IEEE Transactions on Neural Networks, vol. 12, no. 6, pp. 1539–1546, Nov. 2001.
[15] Yildiz, O. T., "Linear discriminant trees," in Proc. 17th Int. Conf. Machine Learning, pp. 1175–1182, 2000.
dc.identifier.uri  http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23021
dc.description.abstract  A classification tree is a classification method commonly used in data mining: by repeatedly selecting an appropriate attribute and splitting the data, a classification result is obtained. However, continued splitting rapidly reduces the sample size, making the estimates in the lower levels of the tree unreliable. Moreover, when the response is linearly related to several attributes, a traditional classification tree cannot classify effectively. Another classification method common in multivariate analysis, Fisher's linear discriminant, finds the linear combination of attributes that best separates the classes, but it does not apply to nonlinearly structured data.
To remedy the shortcomings of these two methods, this study proposes a new classification method, the multivariate classification tree, which chooses the type of split appropriate to the structure of the data. When the data have a linear structure, it splits on a linear combination of attributes, which both describes the data more precisely and avoids the sharp sample-size reduction caused by repeated splits in the traditional method. When the data are not linearly structured, it uses the traditional split on a single attribute.
The proposed multivariate classification tree includes a method for selecting suitable attributes, together with the evaluation and comparison of single-attribute and multi-attribute splits. In addition, this study incorporates Fisher's discriminant and the Mahalanobis distance, considering the distributions of both the response and the attributes when choosing the most appropriate conditional clause.
To validate the proposed multivariate classification tree, it is compared with traditional classification methods on simulated data, demonstrating that it handles data of various structures effectively and yields accurate results.
zh_TW
dc.description.abstract  A classification tree is a very common technique in data mining. It is built by selecting an appropriate attribute at each node and sequentially splitting the sample into subsets. However, the sample size shrinks sharply after a few levels of splitting, resulting in unreliable predictions. In addition, the classification tree cannot efficiently provide accurate results for data with a multivariate structure.
Therefore, we propose a multivariate classification tree method to deal with different kinds of data structures. The objective is to choose the conditional clause that best captures the character of the data. The proposed tree employs a linear combination of multiple attributes when needed, to avoid unnecessary sample-size reduction and to obtain a more accurate tree model.
To build the multivariate tree, we propose a systematic methodology to select the relevant attributes and to evaluate, compare, and choose between the univariate and multivariate models. In addition, we incorporate the ideas of Fisher's linear discriminant and the Mahalanobis distance so that the conditional clause takes into account the distributions of both the response and the attributes.
To validate the proposed method, we compare it with other classification methods on simulated data and real cases. The results show that the new method can capture different data structures with acceptable accuracy.
en
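The abstract above names the building blocks of the method: an impurity-based univariate split, Fisher's linear discriminant for splits on a linear combination of attributes, and the Mahalanobis distance for placing the cut point. The following is only a rough sketch of how such pieces can interact, not the thesis's actual algorithm: the function names, the Gini criterion, the two-class restriction, and the equal-standardized-distance cut rule in `mahalanobis_cut` are our own assumptions.

```python
import numpy as np

def gini(y):
    """Gini impurity of an integer label vector."""
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)

def best_split_impurity(z, y):
    """Weighted Gini impurity of the best threshold split on scores z."""
    order = np.argsort(z)
    z, y = z[order], y[order]
    best = gini(y)  # baseline: no split at all
    for i in range(1, len(y)):
        if z[i] == z[i - 1]:
            continue  # identical scores cannot be separated by a threshold
        imp = (i * gini(y[:i]) + (len(y) - i) * gini(y[i:])) / len(y)
        best = min(best, imp)
    return best

def fisher_direction(X, y):
    """Two-class Fisher direction w proportional to Sw^{-1} (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter, with a small ridge for numerical stability.
    Sw = ((X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
          + 1e-6 * np.eye(X.shape[1]))
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

def mahalanobis_cut(z, y):
    """Cut point on projected scores z that is equidistant from the two
    class means in standard-deviation units (a 1-D Mahalanobis distance)."""
    z0, z1 = z[y == 0], z[y == 1]
    m0, m1 = z0.mean(), z1.mean()
    s0, s1 = z0.std(), z1.std()
    return (m0 * s1 + m1 * s0) / (s0 + s1)

def choose_split(X, y):
    """Pick a univariate split on a single attribute or a multivariate
    (oblique) split on the Fisher projection, whichever leaves purer children."""
    uni = min(best_split_impurity(X[:, j], y) for j in range(X.shape[1]))
    w = fisher_direction(X, y)
    multi = best_split_impurity(X @ w, y)
    return ("multivariate", multi) if multi < uni else ("univariate", uni)
```

On data whose class boundary is oblique (e.g. class 1 where x0 + x1 > 0), `choose_split` prefers the Fisher projection because no single attribute separates the classes; on an axis-aligned boundary it keeps the single-attribute split, mirroring the hybrid behaviour the abstract describes.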
dc.description.provenance  Made available in DSpace on 2021-06-08T04:38:08Z (GMT). No. of bitstreams: 0
Previous issue date: 2009
en
dc.description.tableofcontents
Chinese Abstract
Abstract
Contents
Contents of Figures
Contents of Tables
Chapter 1 Introduction
1.1 Background
1.2 Classification Method Review
1.2.1 Classification and Regression Trees
1.2.2 Fisher's Linear Discriminant
1.2.3 Reviews of Multivariate Classification Trees
1.3 Motivation
1.4 Research Objective
1.5 Thesis Organization
Chapter 2 Multivariate Classification Trees
2.1 Tree Construction Criterion
2.1.1 CART vs. FLD Criterion
2.1.2 Attribute Selection
2.1.3 Cut Point
2.1.4 Model Evaluation
2.1.5 Discriminability Test
2.2 Tree Construction
2.2.1 Univariate vs. Multivariate
2.2.2 The Complete Procedure
2.3 Criterion Decision
Chapter 3 Case Study
3.1 Simulation Cases
3.1.1 Linear Structure
3.1.2 Tree Structure
3.1.3 Hybrid Structure
3.2 Real Case
Chapter 4 Conclusions
References
Appendix
dc.language.iso  en
dc.subject  屬性選擇 (attribute selection)  zh_TW
dc.subject  分類樹 (classification tree)  zh_TW
dc.subject  費雪判別 (Fisher's discriminant)  zh_TW
dc.subject  馬氏距離 (Mahalanobis distance)  zh_TW
dc.subject  多變量 (multivariate)  zh_TW
dc.subject  Multivariate  en
dc.subject  Mahalanobis distance  en
dc.subject  Attribute selection  en
dc.subject  Classification tree  en
dc.title  多變量分類樹之建構與應用  zh_TW
dc.title  Construction of Multivariate Classification Trees and Its Applications  en
dc.type  Thesis
dc.date.schoolyear  97-2
dc.description.degree  碩士 (Master)
dc.contributor.oralexamcommittee  楊烽正, 范治民, 歐陽彥正
dc.subject.keyword  分類樹, 費雪判別, 馬氏距離, 多變量, 屬性選擇 (classification tree, Fisher's discriminant, Mahalanobis distance, multivariate, attribute selection)  zh_TW
dc.subject.keyword  Classification tree, Mahalanobis distance, Multivariate, Attribute selection  en
dc.relation.page  67
dc.rights.note  未授權 (not authorized for public release)
dc.date.accepted  2009-08-17
dc.contributor.author-college  工學院 (College of Engineering)  zh_TW
dc.contributor.author-dept  工業工程學研究所 (Institute of Industrial Engineering)  zh_TW
Appears in Collections: 工業工程學研究所 (Institute of Industrial Engineering)

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
