Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21486
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 歐陽彥正 | |
dc.contributor.author | Yen-Jung Liu | en |
dc.contributor.author | 劉晏茸 | zh_TW |
dc.date.accessioned | 2021-06-08T03:35:30Z | - |
dc.date.copyright | 2019-08-07 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-07-30 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21486 | - |
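dc.description.abstract | Among machine learning methods, the decision tree stands out because its flow-chart structure is highly interpretable, whereas many other methods are black-box models. However, a decision tree provides only a coarse probability estimate for each group. The frequency estimate and the Laplace estimate are the best-known approaches, but both assign the same probability estimate to all testing data within a node and cannot give differentiated estimates for the individuals in that node. A probability estimation tree gives each individual within a node a differentiated probability estimate. In this study we tried six different simulated data sets and introduced adaptive kernel density estimation to construct probability estimation trees. Compared with fixed-bandwidth kernel density estimation, the best improvement in the ambiguous region between the two classes was an error reduction of about 31%. In application, the probability estimation tree method, applied to detecting dengue fever in public health, can help doctors understand more quickly the status of patients predicted to have dengue. | zh_TW |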
dc.description.abstract | Unlike many machine learning methods, which are black-box models, the decision tree has a flow chart that is highly interpretable. However, a decision tree produces poor class probability estimates. Among probability estimation methods, the frequency estimate and the Laplace estimate are the most widely known, but both can only assign the same probability estimate to all testing data in a node. A probability estimation tree provides a differentiated probability for each individual in a node. In our study, which combines the decision tree with adaptive kernel density estimation, the best improvement in probability estimation over fixed kernel density estimation across six simulated data sets was an error reduction of approximately 31% in the fuzzy zone between the two groups. In application, machine learning methods for dengue fever detection can help doctors grasp the situation of patients predicted to have dengue in a shorter period of time. | en |
dc.description.provenance | Made available in DSpace on 2021-06-08T03:35:30Z (GMT). No. of bitstreams: 1 ntu-108-R06h41008-1.pdf: 1223739 bytes, checksum: ae14699b8b66e069798c120a2e3a17b7 (MD5) Previous issue date: 2019 | en |
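dc.description.tableofcontents | Abstract I; List of Figures IV; List of Tables V; Chapter 1 Introduction 1; 1.1 Research Background and Motivation 1; 1.2 Thesis Structure 2; Chapter 2 Methods 4; 2.1 Literature Review 4; 2.2 Methods Adopted in This Thesis 7; Chapter 3 Simulation Experiments 8; 3.1 Simulated Data 8; 3.2 Experimental Results 10; Chapter 4 Empirical Analysis Using Dengue Fever Data 13; 4.1 Introduction to the Database 13; 4.2 Research Tools 15; 4.3 Dimensionality Reduction 15; 4.4 Basic Parameter Settings 17; 4.5 Results on Dengue Fever Data 19; Chapter 5 Conclusions and Future Recommendations 21; Supplementary Figures and Tables 22; References 28 | |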
dc.language.iso | zh-TW | |
dc.title | 導入自適應帶寬核心密度估計以建構機率估計樹 | zh_TW |
dc.title | Combining decision tree and adaptive kernel density estimation to construct probability estimation tree | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 韓謝忱,王榮德,金傳春 | |
dc.subject.keyword | kernel density estimation, interpretability, probability estimation tree, dengue fever, decision tree | zh_TW |
dc.subject.keyword | kernel density estimation, interpretable, probability estimation tree, dengue fever, decision tree | en |
dc.relation.page | 31 | |
dc.identifier.doi | 10.6342/NTU201902114 | |
dc.rights.note | Not authorized | |
dc.date.accepted | 2019-07-31 | |
dc.contributor.author-college | Center for General Education | zh_TW |
dc.contributor.author-dept | Master Program in Statistics | zh_TW |
Appears in Collections: | Master Program in Statistics |
Files in This Item:
File | Size | Format | |
---|---|---|---|
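ntu-108-1.pdf Not currently authorized for public access | 1.2 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.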