Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18747
Title: | 基因印記模式的預測性能評估 Evaluating the Performances of Gene Signature-based Predictions |
Author: | Ling-Yi Wang 王齡誼 |
Advisor: | 李文宗 |
Keywords: | receiver operating characteristic curve, gene signature, prediction, learning curve, permutation test, validation |
Publication Year: | 2014 |
Degree: | Doctoral |
Abstract: | BACKGROUND: DNA microarray technology enables diagnostic and prognostic predictions based on a subject's gene signature. Three problems arise when such signatures are built and evaluated. First, gene selection is routinely performed when building a gene signature for prediction, yet it may not be beneficial for polygenic complex diseases. Second, although internal validation by cross-validation or bootstrapping can give a fair assessment of a gene signature's prediction performance, both approaches reduce the training sample size. Third, external validation is the current paradigm for evaluating the performance of a gene signature, but it may suffer from heterogeneity.
METHODS: We derive a learning curve for the prediction performance of a gene signature. Based on it, we study the effect of gene selection and propose a one-step extrapolation method that estimates the prediction performance of the gene signature trained on all the samples. We also propose a permutation test to detect heterogeneity. Simulation studies are conducted to evaluate the proposed methods, and three microarray datasets are used for demonstration.
RESULTS: First, the optimal gene selection strategy depends on the signal-to-noise ratio and the signal strength for a given sample size, and there exists a no-selecting zone within which any gene selection procedure will jeopardize the prediction performance. Second, the proposed one-step extrapolation method for internal validation strikes a good balance between bias and variance and has a small root mean squared error. Finally, the type I error of the proposed permutation test for detecting heterogeneity in external validation is close to the nominal significance level, and its power increases with sample size.
CONCLUSION: The three methods proposed in this study should prove useful in developing and evaluating diagnostic/prognostic prediction models based on the gene signatures of individual persons. |
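The abstract names a permutation test for heterogeneity but does not state its test statistic or permutation scheme. The following is therefore only an illustrative sketch under assumptions, not the thesis's procedure: it assumes the signature yields a numeric score per subject, takes the difference in AUC between an internal cohort and an external cohort as the statistic, and permutes cohort membership separately within cases and within controls. The function names `auc` and `heterogeneity_permutation_test` are hypothetical.

```python
import random


def auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def heterogeneity_permutation_test(internal, external, n_perm=2000, seed=0):
    """Assumed scheme (not taken from the thesis).

    internal, external: lists of (score, label) pairs with label in {0, 1}.
    Returns (observed AUC difference, two-sided permutation p-value).
    """
    def auc_diff(a, b):
        return auc(*zip(*a)) - auc(*zip(*b))

    rng = random.Random(seed)
    observed = auc_diff(internal, external)

    # Pool scores by label; cohort membership is shuffled within each label,
    # so every permuted cohort keeps its original number of cases and controls.
    pooled = {y: [s for s, lab in internal + external if lab == y] for y in (0, 1)}
    n_int = {y: sum(1 for _, lab in internal if lab == y) for y in (0, 1)}

    extreme = 0
    for _ in range(n_perm):
        perm_int, perm_ext = [], []
        for y in (0, 1):
            shuffled = pooled[y][:]
            rng.shuffle(shuffled)
            perm_int += [(s, y) for s in shuffled[:n_int[y]]]
            perm_ext += [(s, y) for s in shuffled[n_int[y]:]]
        # Small tolerance guards against floating-point ties with the observed value.
        if abs(auc_diff(perm_int, perm_ext)) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)
```

Stratifying the shuffle by label is a design choice made here so that no permuted cohort ends up without cases or controls; under the null hypothesis that scores are exchangeable across cohorts within each label, a small p-value would suggest heterogeneity between the cohorts.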
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18747 |
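Full-Text Authorization: | Not authorized |
Appears in Collections: | Institute of Epidemiology and Preventive Medicine |
Files in This Item:
File | Size | Format | |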
---|---|---|---|
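ntu-103-1.pdf Not currently authorized for public access | 3.73 MB | Adobe PDF |
Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.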