Evaluating Prediction Model Performance: The Predictiveness Curve and Its Geometric Summaries

邱偉珉; Wei-Min Chiu

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99854

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李文宗	zh_TW
dc.contributor.advisor	Wen-Chung Lee	en
dc.contributor.author	邱偉珉	zh_TW
dc.contributor.author	Wei-Min Chiu	en
dc.date.accessioned	2025-09-19T16:06:03Z	-
dc.date.available	2025-09-20	-
dc.date.copyright	2025-09-19	-
dc.date.issued	2025	-
dc.date.submitted	2025-08-08	-
dc.identifier.citation	Vogenberg FR. Predictive and prognostic models: implications for healthcare decision-making in a modern recession. American Health & Drug Benefits. 2009;2(6):218-222. Smeden MV, Reitsma JB, Riley RD, Collins GS, Moons KG. Clinical prediction models: diagnosis versus prognosis. Journal of Clinical Epidemiology. 2021;132:142-145. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128-138. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29-36. Huang Y, Pepe MS, Feng Z. Evaluating the predictiveness of a continuous marker. Biometrics. 2007;63(4):1181-1188. Lee WC. Probabilistic analysis of global performances of diagnostic tests: interpreting the Lorenz curve-based summary measures. Statistics in Medicine. 1999;18(4):455-471. Wu YC, Lee WC. Alternative performance measures for prediction models. PLOS One. 2014;9(3):e91249. Arlot S, Celisse A. A survey of cross-validation procedures for model selection. Statistics Surveys. 2010;4:40-79. Jiang X, Osl M, Kim J, Machado LO. Smooth isotonic regression: a new method to calibrate predictive models. AMIA Summits on Translational Science Proceedings. 2011:16-20. Tibshirani RJ, Efron B. An introduction to the bootstrap. 1st ed. New York: Chapman and Hall; 1993. Chiang CJ, Wang YW, Lee WC. Taiwan’s nationwide cancer registry system of 40 years: past, present, and future. Journal of the Formosan Medical Association. 2019;118(5):856-858. Kao CW, Chiang CJ, Lin LJ, Huang CW, Lee WC, Lee MY, Taiwan Society of Cancer Registry Expert Group. Accuracy of long-form data in the Taiwan cancer registry. Journal of the Formosan Medical Association. 2021;120(11):2037-2041. Zhou XH, Obuchowski NA, McClish DK. Statistical methods in diagnostic medicine. 3rd ed. New York: John Wiley & Sons; 2014. Mauguen A, Begg CB. Using the Lorenz curve to characterize risk predictiveness and etiologic heterogeneity. Epidemiology. 2016;27(4):531-537. Maiga A, Farjah F, Blume J, Deppen S, Welty VF, D’Agostino RS, Colditz GA, Kozower BD, Grogan EL. Risk prediction in clinical practice: a practical guide for cardiothoracic surgeons. Annals of Thoracic Surgery. 2019;108(5):1573-1582. Gleiss A. Visualizing a marker’s degrees of necessity and of sufficiency in the predictiveness curve. BMC Medical Research Methodology. 2025;25(1):107.	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99854	-
dc.description.abstract	預測模型是臨床醫學與公共衛生中不可或缺的重要工具，有助於疾病風險的估計與決策的支持。傳統的評估方法，例如接收者操作特徵曲線及其下面積，主要評估模型的區分能力，但在評估族群層級的風險分層方面提供的資訊有限。最初為了評估生物標記而提出的預測曲線，能以視覺方式呈現預測風險在整體族群中的分布情形，但在多變量預測模型的應用中仍較少被探討。本研究將預測曲線的方法延伸至多變量風險預測模型的評估中，系統性地探討其幾何特性，並導出三個互補的效能指標：Pietra指標、Gini指標，以及標準化的 Brier 分數。這三個指標分別量化模型在解析中間風險個案、區分預測風險層級，以及提高預測確定性方面的能力。為了確保預測風險的校正與不偏性，本研究提出了一個三步驟流程，包括交叉驗證、保序迴歸校正，以及自助法平均。透過說明性範例及一項涉及台灣 23,839 名肺癌患者的真實應用案例，研究展示並驗證了此方法的可行性。Pietra指標、Gini指標，以及標準化的 Brier 分數分別捕捉到風險分層表現的不同面向。即使某些模型在其中一項指標上表現相同，其他指標仍可能呈現顯著差異，顯示這些指標具有互補性。在肺癌個案研究中，最終調整後的預測模型能有效區分死亡風險，將 25.1% 的患者歸類為極低風險（<10% 死亡率），50.1% 為高風險（>75% 死亡率），僅有 5.1% 的患者落在接近平均風險的「灰色區域」。該調整後模型的Pietra指標為 0.6719，Gini指標為 0.7850，標準化的 Brier 分數為 0.5186。不同細胞類型的肺癌 (肺腺癌、鱗狀細胞癌、小細胞癌和大細胞癌) 在預測表現上差異顯著，反映出不同細胞類型的肺癌在風險分層能力上的差異。預測曲線及其幾何效能指標提供一種強大、透明且以族群為導向的框架，用以超越傳統指標來評估多變量風險預測模型的表現。這些方法能清楚地展現模型如何進行風險分層及其影響到的族群比例，進而提升模型可解釋性，協助臨床醫師與研究人員更有效地優化與應用預測模型於臨床與公共衛生實務中。	zh_TW
dc.description.abstract	Prediction models are essential tools in clinical medicine and public health, facilitating disease risk estimation and supporting decision-making. Traditional evaluation methods, such as the receiver operating characteristic (ROC) curve and the area under it (AUC), primarily assess discrimination but provide limited insight into population-level risk stratification. The predictiveness curve, originally proposed for biomarker evaluation, shows how predicted risks are distributed across a population but has been underexplored in the context of multivariable prediction models. This study extends the predictiveness curve methodology to the evaluation of multivariable risk prediction models, systematically explores its geometric properties, and derives three complementary performance indices: the Pietra index, the Gini index, and the scaled Brier score. These indices respectively quantify a model’s ability to resolve intermediate-risk cases, to separate predicted risks, and to enhance prediction certainty. A three-step procedure (cross-validation, isotonic regression calibration, and bootstrap averaging) is proposed to ensure that predicted risks are calibrated and unbiased. Illustrative examples and a real-world application involving 23,839 lung cancer patients from Taiwan demonstrate and validate the methodology. The Pietra index, Gini index, and scaled Brier score capture distinct dimensions of risk stratification performance. Models with identical values of one index can differ markedly in the other indices, underscoring their complementary nature. In the lung cancer case study, the final adjusted prediction model effectively stratified patients across the fatality risk spectrum, identifying 25.1% of patients as very low-risk (<10% fatality) and 50.1% as high-risk (>75% fatality), with only 5.1% of patients falling within a “gray zone” near the average fatality risk. The adjusted model achieved a Pietra index of 0.6719, a Gini index of 0.7850, and a scaled Brier score of 0.5186. Predictive performance varied substantially among the adenocarcinoma, squamous cell carcinoma, small cell carcinoma, and large cell carcinoma subtypes, reflecting differences in risk stratification capability by cell type. The predictiveness curve and its geometric summaries provide a transparent, population-oriented framework for evaluating multivariable risk prediction models beyond traditional metrics. By showing how a model stratifies risk and for what proportion of the population, these methods enhance interpretability and support clinicians and researchers in refining and applying prediction models in clinical and public health practice.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-09-19T16:06:03Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-09-19T16:06:03Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	National Taiwan University Ph.D. Dissertation Oral Defense Approval Form; Chinese Abstract; English Abstract; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Predictiveness Curve: Construction, Geometry, and Performance Indices; Chapter 3 Performance Indices and Their Role in Risk Stratification; 3.1 Gray-Zone Resolution: Pietra Index; 3.2 Horizontal Risk Separation: Gini Index; 3.3 Vertical Certainty of Prediction: Scaled Brier Score; Chapter 4 Ensuring Calibration and Unbiasedness in Prediction Models; 4.1 Cross-Validation; 4.2 Calibration; 4.3 Bootstrap Averaging; Chapter 5 A Case Study: Five-Year Fatality Among Lung Cancer Patients in Taiwan; Chapter 6 Discussion; References; Appendices	-
dc.language.iso	en	-
dc.subject	預測模型評估	zh_TW
dc.subject	預測曲線	zh_TW
dc.subject	風險分層	zh_TW
dc.subject	Gini 指標	zh_TW
dc.subject	Pietra 指標	zh_TW
dc.subject	標準化的 Brier 分數	zh_TW
dc.subject	Pietra index	en
dc.subject	prediction model evaluation	en
dc.subject	predictiveness curve	en
dc.subject	risk stratification	en
dc.subject	scaled Brier score	en
dc.subject	Gini index	en
dc.title	預測模型效能之評估:預測曲線及其幾何摘要	zh_TW
dc.title	Evaluating Prediction Model Performance: The Predictiveness Curve and Its Geometric Summaries	en
dc.type	Thesis	-
dc.date.schoolyear	113-2	-
dc.description.degree	Ph.D.	-
dc.contributor.oralexamcommittee	林先和;杜裕康;蕭朱杏;廖勇柏	zh_TW
dc.contributor.oralexamcommittee	Hsien-Ho Lin;Yu-Kang Tu;Chu-Hsing Hsiao;Yung-Po Liaw	en
dc.subject.keyword	預測模型評估,預測曲線,風險分層,Gini 指標,Pietra 指標,標準化的 Brier 分數	zh_TW
dc.subject.keyword	prediction model evaluation,predictiveness curve,risk stratification,Gini index,Pietra index,scaled Brier score	en
dc.relation.page	50	-
dc.identifier.doi	10.6342/NTU202504305	-
dc.rights.note	Authorization granted (open access worldwide)	-
dc.date.accepted	2025-08-08	-
dc.contributor.author-college	公共衛生學院	-
dc.contributor.author-dept	流行病學與預防醫學研究所	-
dc.date.embargo-lift	2025-09-20	-
Appears in Collections:	流行病學與預防醫學研究所
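
The three indices named in the abstract can be illustrated numerically. This record does not give their formulas, so the sketch below rests on assumptions: the predicted risks p are calibrated, the mean risk is ρ, and the indices follow the usual Lorenz-curve identities, Pietra = E|p − ρ| / (2ρ(1 − ρ)), Gini = E|p − p′| / (2ρ(1 − ρ)) for two independent draws p and p′, and scaled Brier = Var(p) / (ρ(1 − ρ)). The function names are hypothetical and are not taken from the thesis.

```python
def predictiveness_curve(risks):
    """Return (v, R(v)) points: sorted risks against cumulative population fraction."""
    ordered = sorted(risks)
    n = len(ordered)
    return [((i + 1) / n, r) for i, r in enumerate(ordered)]


def summary_indices(risks):
    """Return (pietra, gini, scaled_brier) for calibrated predicted risks.

    Assumed definitions (not confirmed by the source record); each index
    is 0 for a constant-risk model and 1 for a model predicting only 0 or 1.
    """
    n = len(risks)
    rho = sum(risks) / n                 # mean predicted risk
    denom = rho * (1.0 - rho)            # variance of a Bernoulli(rho) outcome
    if denom == 0.0:
        raise ValueError("mean risk must lie strictly between 0 and 1")

    # Pietra: half the mean absolute deviation from the mean risk, scaled.
    pietra = sum(abs(p - rho) for p in risks) / n / (2.0 * denom)

    # Gini: mean absolute difference over all ordered pairs, scaled.
    # For sorted values, sum_{i,j} |p_i - p_j| = 2 * sum_k (2k - n + 1) * p_(k).
    ordered = sorted(risks)
    pair_sum = 2.0 * sum((2 * k - n + 1) * p for k, p in enumerate(ordered))
    gini = pair_sum / (n * n) / (2.0 * denom)

    # Scaled Brier: variance of predicted risk relative to outcome variance.
    scaled_brier = sum((p - rho) ** 2 for p in risks) / n / denom

    return pietra, gini, scaled_brier


if __name__ == "__main__":
    toy_risks = [0.1, 0.1, 0.9, 0.9]     # made-up risks, not thesis data
    print(predictiveness_curve(toy_risks))
    print(summary_indices(toy_risks))
```

For the toy risks above the three values come out at about 0.8, 0.8, and 0.64. This shows the pattern the abstract describes: two indices can coincide while the third differs.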

Files in This Item:

File	Size	Format
ntu-113-2.pdf	1.49 MB	Adobe PDF
