NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20526
Full metadata record
dc.contributor.advisor: 歐陽彥正
dc.contributor.author (en): Cheng-En Hong
dc.contributor.author (zh_TW): 洪晟恩
dc.date.accessioned: 2021-06-08T02:51:53Z
dc.date.available: 2030-01-01
dc.date.copyright: 2017-08-24
dc.date.issued: 2017
dc.date.submitted: 2017-08-14
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20526
dc.description.abstract (zh_TW): In practical problems, even when many variables are available, we do not know which of them are genuine signals and which are spurious noise. By discovering the important variables, researchers can carry out more targeted follow-up experiments on the selected variables to investigate the underlying scientific phenomenon. A natural requirement is to discover as many relevant variables as possible while making as few mistakes as possible. We propose a modified RuleFit model that controls the false discovery rate via the knockoff procedure and controls the type I error via the Neyman-Pearson method.
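The abstract above mentions controlling the false discovery rate with the knockoff procedure; the corresponding step in the thesis is the data-dependent threshold of Section 2.1.4. The following Python sketch shows only that generic thresholding step of the knockoff+ filter (Barber and Candès, 2015), not the thesis's two-stage modification; the function name and the simulated statistics W are illustrative assumptions.

    import numpy as np

    def knockoff_plus_threshold(W, q=0.1):
        """Data-dependent threshold of the knockoff+ filter.

        W : feature statistics, one per original/knockoff pair; large positive
            values favour the original feature over its knockoff copy.
        q : target false discovery rate level.
        """
        # Candidate thresholds are the nonzero magnitudes of the statistics.
        candidates = np.sort(np.unique(np.abs(W[W != 0])))
        for t in candidates:
            # Estimated false discovery proportion at threshold t,
            # with the "+1" correction used by knockoff+.
            fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
            if fdp_hat <= q:
                return t      # smallest threshold whose estimated FDP is <= q
        return np.inf         # no feasible threshold: select nothing

    # Illustrative use: in practice W would come from a lasso fit on the
    # augmented matrix [X, X_knockoff]; here it is simulated.
    rng = np.random.default_rng(0)
    W = np.concatenate([rng.normal(4.0, 1.0, 10),
                        rng.normal(0.0, 1.0, 90) * rng.choice([-1.0, 1.0], 90)])
    T = knockoff_plus_threshold(W, q=0.1)
    selected = np.flatnonzero(W >= T)
    print(f"threshold = {T:.3f}, selected {selected.size} features")

In the thesis this step follows knockoff construction and the computation of feature statistics (Sections 2.1.2 and 2.1.3); only the final selection rule is sketched here.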
dc.description.abstract (en): Despite the abundance of available variables, which of them actually matter for the problem is seldom known in practice. By discovering the important features, researchers can conduct more targeted follow-up experiments on the selected features to understand the underlying scientific phenomenon. A natural requirement is that we wish to discover as many relevant variables as possible while making as few mistakes as possible. We propose a modified RuleFit with FDR control via the knockoff procedure and type I error (alpha) control via the Neyman-Pearson method.
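The "type I error (alpha) control via the Neyman-Pearson method" in the abstract corresponds to the Neyman-Pearson umbrella algorithm listed in Section 2.2.2 of the table of contents. The code below is a minimal sketch of that idea under stated assumptions, not the thesis's exact algorithm: it picks a score cutoff from a held-out class-0 sample so that the type I error exceeds alpha with probability at most delta. The name np_threshold and the simulated scores are hypothetical.

    import numpy as np
    from scipy.stats import binom

    def np_threshold(null_scores, alpha=0.05, delta=0.05):
        """Order-statistic cutoff in the spirit of the NP umbrella algorithm."""
        s = np.sort(np.asarray(null_scores))    # held-out class-0 scores
        n = s.size
        for k in range(1, n + 1):
            # With the k-th smallest of n i.i.d. null scores as the cutoff, the
            # probability that the type I error exceeds alpha is (for continuous
            # scores) P(Binomial(n, 1 - alpha) >= k).
            violation = binom.sf(k - 1, n, 1.0 - alpha)
            if violation <= delta:
                return s[k - 1]    # predict class 1 only when score > cutoff
        raise ValueError("too few class-0 observations for this alpha and delta")

    # Illustrative use with simulated scores from any base classifier.
    rng = np.random.default_rng(1)
    cutoff = np_threshold(rng.normal(size=200), alpha=0.05, delta=0.05)
    print(f"cutoff = {cutoff:.3f}")

A larger delta yields a less conservative cutoff and higher detection power, and the base classifier that produces the scores is interchangeable, which is what makes the umbrella construction generic.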
dc.description.provenance (en): Made available in DSpace on 2021-06-08T02:51:53Z (GMT). No. of bitstreams: 1. ntu-106-R04H41006-1.pdf: 1211308 bytes, checksum: 7d01f9a5e7a176550195e56f958ce3e9 (MD5). Previous issue date: 2017.
dc.description.tableofcontents:
Oral examination committee certification i
Acknowledgements ii
Chinese abstract iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
Notations ix
1 Introduction 1
1.1 Literature Review 2
1.2 Background 7
1.2.1 RuleFit 7
1.2.2 The Lasso 9
1.3 Motivation 12
1.4 Framework of Thesis 12
2 Methods 14
2.1 Knockoff Procedure 14
2.1.1 Preliminaries and Notations 14
2.1.2 Construct Knockoffs 18
2.1.3 Calculate Feature Statistics 21
2.1.4 Calculate a Data-Dependent Threshold 25
2.1.5 Two-Stage Modification 30
2.1.6 Summary 31
2.2 Neyman-Pearson Method 31
2.2.1 Preliminaries and Notations 31
2.2.2 Neyman-Pearson Umbrella Algorithm 33
3 Results and Discussion 38
3.1 Knockoff Procedure 38
3.1.1 Knockoff Result Summary 41
3.2 Neyman-Pearson Method 41
3.2.1 Neyman-Pearson Method Result Summary 42
3.3 Real Data Analysis 45
3.3.1 Real Data Analysis Result Summary 46
4 Conclusion 47
References 49
dc.language.iso: en
dc.title (zh_TW): 具有錯誤發現率和型一誤差控制的可解釋之預測樹模型
dc.title (en): A tree-based interpretable predictive method with FDR and type-one error control
dc.type: Thesis
dc.date.schoolyear: 105-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 韓謝忱, 蔡政安
dc.subject.keyword (zh_TW): model selection (模型選擇), false discovery rate (錯誤發現率)
dc.subject.keyword (en): Knockoff, FDR, Lasso, Neyman-Pearson method
dc.relation.page: 54
dc.identifier.doi: 10.6342/NTU201702789
dc.rights.note: Not authorized (未授權)
dc.date.accepted: 2017-08-14
dc.contributor.author-college (zh_TW): 共同教育中心 (Center for General Education)
dc.contributor.author-dept (zh_TW): 統計碩士學位學程 (Master's Program in Statistics)
dc.date.embargo-terms: 2030-01-01
Appears in Collections: 統計碩士學位學程 (Master's Program in Statistics)

Files in This Item:
File: ntu-106-1.pdf (Restricted Access)
Size: 1.18 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
