Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/17363
Title: | Generalized Relative Importance and Its Applications to Fisher Linear Discriminant Analysis, Cox Proportional Hazards Modeling and Variable Selection (概括性相對性重要指標及變數選擇之研究及其於費雪線性區別分析與Cox比例風險迴歸之應用) |
Author: | Yen-Lung Wang (王彥龍) |
Advisor: | 陳正剛 |
Keywords: | Dominance Index, Relative Weight, Logistic Regression, Fisher Linear Discriminant Analysis, Cox Proportional Hazards Modeling, Variable Selection |
Year of publication: | 2013 |
Degree: | Master's |
Abstract: | Multiple regression analysis is a widely used statistical method, but it often suffers from multicollinearity, which arises from high correlations among the independent variables and makes it difficult to determine the effect and importance of each one. When the correlations among variables are low, the standardized regression coefficients can plausibly represent the relative importance of each variable; when the variables are highly correlated, they no longer can. In the literature, Budescu's Dominance Index (1993)[1] and Johnson's Relative Weight (2000)[5] are used to handle collinearity and to find the relative importance of independent variables. The two methods give very similar results, but the Dominance Index is much more computationally demanding than the Relative Weight.
Most studies of relative importance have focused on multiple regression analysis. For other linear models, such as logistic regression, Fisher linear discriminant analysis and Cox proportional hazards modeling, which suffer from the same problem, very few studies can be found because well-defined metrics are lacking. For logistic regression, Azen and Traxel (2009)[9] and Tonidandel and LeBreton (2010)[10] used the Dominance Index and the Relative Weight, respectively, to find the relative importance of variables. Again the two sets of results are similar, and again the Dominance Index is far more costly to compute.
In this study, we therefore adopt the Relative Weight approach used for logistic regression to define the metrics of relative importance, and we propose a generalized relative importance for linear models in general. The proposed relative importance is then applied to Fisher linear discriminant analysis and Cox proportional hazards models. With the generalized relative importance, we further propose combining the relative weight, the partial R^2 and the univariate R^2 of each variable into a variable selection criterion for building an effective model when the data contain many variables. Simulated cases and actual tumor classification cases are used to demonstrate and validate the proposed relative importance and the variable selection method for Fisher linear discriminant analysis and Cox proportional hazards modeling. |
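As an illustration of the Relative Weight idea the abstract builds on, the following is a minimal sketch of Johnson's relative weights for ordinary multiple regression only. It is not the generalized relative importance proposed in the thesis, whose formulas the abstract does not give. The function name `relative_weights` and the use of NumPy are assumptions made for this example, and the predictor correlation matrix is assumed to be full rank.

```python
import numpy as np

def relative_weights(X, y):
    """Sketch of Johnson-style relative weights for multiple regression.

    X: (n, p) array of predictors, y: (n,) response.
    Returns (raw, rescaled): raw weights sum to the model R^2,
    rescaled weights sum to 1.
    Assumes the predictor correlation matrix is full rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Correlations among predictors (Rxx) and between predictors and y (rxy).
    R = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    Rxx = R[:-1, :-1]
    rxy = R[:-1, -1]

    # Symmetric square root of Rxx: correlations between the original
    # predictors and their closest orthogonal counterparts.
    evals, evecs = np.linalg.eigh(Rxx)
    lam = evecs @ np.diag(np.sqrt(evals)) @ evecs.T

    # Standardized coefficients of y regressed on the orthogonal variables.
    beta = np.linalg.solve(lam, rxy)

    # Each predictor's weight: its squared loadings on the orthogonal
    # variables, weighted by those variables' squared coefficients.
    raw = (lam ** 2) @ (beta ** 2)
    return raw, raw / raw.sum()
```

Because the raw weights sum to the model R^2, each one can be read as that predictor's share of explained variance even when predictors are correlated. Only one eigendecomposition is needed, whereas a dominance analysis fits a regression for every subset of predictors.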
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/17363 |
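Full-text authorization: | Not authorized |
Appears in collections: | Graduate Institute of Industrial Engineering |
Files in this item:
File | Size | Format | |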
---|---|---|---|
ntu-102-1.pdf (currently not authorized for public access) | 4.79 MB | Adobe PDF |