Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18747
Full metadata record
DC field: value (language)
dc.contributor.advisor: 李文宗
dc.contributor.author: Ling-Yi Wang (en)
dc.contributor.author: 王齡誼 (zh_TW)
dc.date.accessioned: 2021-06-08T01:23:23Z
dc.date.copyright: 2014-10-20
dc.date.issued: 2014
dc.date.submitted: 2014-08-04
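dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18747
dc.description.abstract (zh_TW):
Background:
With the rapid development of DNA microarray technology, a person's gene signature can be used to diagnose disease type and to predict disease or treatment prognosis. When a gene signature is built for prediction, gene selection is routinely performed. However, gene selection does not necessarily help for a complex disease determined by many genes. Second, although internal validation by cross-validation or bootstrapping gives a reasonable assessment of a gene signature's prediction performance, the reduced training sample size in these methods is a problem. Third, external validation is the current paradigm for evaluating a gene signature's prediction performance, but it too may be troubled by heterogeneity.
Methods:
We derive a formula for the prediction performance of a gene signature. Based on this formula, we examine the effect of gene selection and propose a one-step extrapolation from the fitted curve to estimate the prediction performance of a gene signature built from all samples. We also propose a permutation test to detect heterogeneity. Simulation studies are used to evaluate the proposed methods, and three DNA microarray datasets are used for demonstration.
Results:
First, we find that the optimal gene selection strategy depends on the signal-to-noise ratio and signal strength of the genes at a given fixed sample size, and that there exists a no-selection zone in which any gene selection procedure only lowers prediction performance. Second, our one-step extrapolation strikes a good balance between bias and variance and has a small root mean squared error. Finally, the type I error of the permutation test is very close to the nominal significance level, and its power increases with sample size.
Conclusion:
The methods proposed in this study should prove practical for developing and evaluating gene-signature-based diagnostic and prognostic prediction models.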
dc.description.abstract (en):
BACKGROUND
Microarray technology enables us to make diagnostic and prognostic predictions based on a subject's gene signature. When building a gene signature for prediction, gene selection is routinely performed. However, gene selection may not be beneficial in polygenic complex diseases. Second, while internal validation by cross-validation or bootstrapping can give a fair assessment of the prediction performance of a gene signature, the reduced training sample size it entails is a problem. Third, external validation is the current paradigm for evaluating the performance of a gene signature. However, external validation may suffer from heterogeneity between the development and validation datasets.
METHODS
We derive a learning curve for the prediction performance of a gene signature. Based on it, we study the effect of gene selection and propose a one-step extrapolation method to estimate the prediction performance of the gene signature trained on all the samples. We also propose a permutation test to detect heterogeneity. Simulation studies are used to evaluate the proposed methods. Three microarray datasets are used for demonstration.
RESULTS
First, we found that the optimal gene selection strategy depends on the signal-to-noise ratio and the signal strength at a given sample size, and that there exists a no-selection zone within which any gene selection procedure will reduce prediction performance. Second, we found that the proposed one-step extrapolation method for internal validation strikes a good balance between bias and variance and has a small root mean squared error. Finally, we found that the type I error of the proposed permutation test for heterogeneity in external validation is close to the nominal significance level, and that its power increases with sample size.
CONCLUSION
The three methods proposed in this study should prove useful in developing and evaluating diagnostic and prognostic prediction models based on the gene signatures of individual persons.
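This record does not spell out how the permutation test for heterogeneity is constructed, so the following is only a generic sketch of the idea, not the thesis's procedure. It assumes that a fixed gene signature has already produced a score for every subject, that heterogeneity is measured by the absolute difference in AUC between the development and external validation datasets, and that dataset membership is exchangeable under the null hypothesis. The function names `auc` and `heterogeneity_permutation_test` are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def heterogeneity_permutation_test(dev, ext, n_perm=1000, seed=0):
    """Permutation p-value for |AUC(dev) - AUC(ext)|.

    dev, ext: lists of (score, label) pairs, label in {0, 1}.
    Assumption (not from the source): heterogeneity is summarized by the
    AUC difference, and subjects are reshuffled between the two datasets.
    """
    rng = random.Random(seed)

    def stat(a, b):
        return abs(auc(*zip(*a)) - auc(*zip(*b)))

    observed = stat(dev, ext)
    pooled = list(dev) + list(ext)
    n_dev = len(dev)
    hits = 0
    valid = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        try:
            permuted = stat(pooled[:n_dev], pooled[n_dev:])
        except ValueError:
            continue  # skip splits where one part lacks a class
        valid += 1
        if permuted >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (valid + 1)
```

A small p-value would suggest that the signature behaves differently in the two datasets, in which case a drop in externally validated performance need not reflect a poor signature. Note that an AUC computed on the same data used to train the signature is optimistic, so in practice the development-set scores would have to come from some form of internal validation.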
dc.description.provenance (en): Made available in DSpace on 2021-06-08T01:23:23Z (GMT). No. of bitstreams: 1
ntu-103-D93842001-1.pdf: 3820114 bytes, checksum: ba3ca85a549c6d3b10c9ac0fcb690bc7 (MD5)
Previous issue date: 2014
dc.description.tableofcontents: Table of Contents
Oral Examination Committee Certification
Chinese Abstract
English Abstract
Chapter 1. INTRODUCTION
Chapter 2. METHODS
2.1 Effect of Gene Selection on Prediction Performance
2.2 One-Step Extrapolation for Internal Validation
2.3 Permutation Test to Detect Heterogeneity in External Validation
Chapter 3. SIMULATION STUDIES
3.1 One-Step Extrapolation
3.2 Permutation Test Method
Chapter 4. RESULTS
4.1 Effect of Gene Selection on Prediction Performance
4.2 One-Step Extrapolation for Internal Validation
4.3 Permutation Test to Detect Heterogeneity in External Validation
Chapter 5. MICROARRAY DATA APPLICATION
5.1 Effect of Gene Selection on Prediction Performance
5.2 One-Step Extrapolation for Internal Validation
5.3 Permutation Test to Detect Heterogeneity in External Validation
Chapter 6. DISCUSSION
Chapter 7. CONCLUSION
REFERENCES
APPENDIX

List of Figures
Figure 1. Contour lines of the optimal levels with respect to the signal-to-noise ratio and the signal strengths
Figure 2. Contour lines of gene signature sizes with respect to the signal-to-noise ratio and the signal strength to achieve an of 0.95
Figure 3. Contour lines of gene signature sizes with respect to the signal-to-noise ratio and the signal strength to achieve an of 0.90
Figure 4. Comparison of the various methods under different sample sizes when the naive multiple regression is used to build the gene signature
Figure 5. Comparison of the various methods under different sample sizes when the support vector machine is used to build the gene signature
Figure 6. The role of training sample size, signal strength, and signal prevalence in prediction performance and gene selection using the microarray data
Figure 7. Demonstration of the proposed extrapolation method on two microarray datasets

List of Tables
Table 1. Type I error of homogeneity testing for the model development dataset and external validation dataset by the permutation test
Table 2. Power of heterogeneity detection for the model development dataset and external validation dataset by the permutation test
dc.language.iso: en
dc.title: 基因印記模式的預測性能評估 (zh_TW)
dc.title: Evaluating the Performances of Gene Signature-based Predictions (en)
dc.type: Thesis
dc.date.schoolyear: 102-2
dc.description.degree: Doctoral
dc.contributor.oralexamcommittee: 鄭光甫, 程毅豪, 洪弘, 林菀俞
dc.subject.keyword: 接收者操作特徵曲線, 基因印記, 預測, 學習曲線, 排列檢定, 驗證 (zh_TW)
dc.subject.keyword: receiver operating characteristic curve, gene signature, prediction, learning curve, permutation test, validation (en)
dc.relation.page: 51
dc.rights.note: Not authorized
dc.date.accepted: 2014-08-05
dc.contributor.author-college: College of Public Health (zh_TW)
dc.contributor.author-dept: Institute of Epidemiology and Preventive Medicine (zh_TW)
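Appears in collections: Institute of Epidemiology and Preventive Medicine

Files in this item:
File: ntu-103-1.pdf (not authorized for public access); Size: 3.73 MB; Format: Adobe PDF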