Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/82115

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 林菀俞(WAN-YU LIN) | |
| dc.contributor.author | Wei-Lun Weng | en |
| dc.contributor.author | 翁偉倫 | zh_TW |
| dc.date.accessioned | 2022-11-25T05:36:10Z | - |
| dc.date.available | 2024-09-01 | |
| dc.date.copyright | 2021-11-11 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-10-22 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/82115 | - |
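| dc.description.abstract | DNA methylation is an important biomarker in epigenetics, and many studies have shown that it is strongly associated with human biological functions and conditions such as aging, cancer, allergy, and diabetes. However, methylation data collected with methylation arrays may contain many missing values, which complicates downstream analysis, so methylation studies must impute replacement values for the missing data. This study used three classes of imputation methods: unit imputation, K-nearest neighbors (KNN) imputation, and multiple imputation by chained equations (MICE). It considered many parameter combinations for these methods and incorporated the correlation between methylation sites into them. For real data, this study used genome-wide methylation array data from 2091 Taiwan Biobank participants, generated simulated datasets according to different missingness mechanisms, and identified the most suitable of the three classes of imputation methods for each simulated dataset. The results show that, under the missing completely at random (MCAR) and missing at random (MAR) mechanisms with a small proportion of missing values, KNN imputation that accounts for the correlation between methylation sites gives the best imputed values when root mean square error (RMSE) is the evaluation metric, whereas MICE gives the best imputed values when mean absolute error (MAE) is the evaluation metric. For datasets with a larger proportion of missing values, this study recommends KNN imputation, which gives the best imputation results regardless of the evaluation metric. The results indicate that the evaluation metric, the proportion of missing values, and the missingness mechanism all affect imputation quality, and that imputation methods incorporating the correlation between methylation sites can significantly reduce imputation error. | zh_TW |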
| dc.description.provenance | Made available in DSpace on 2022-11-25T05:36:10Z (GMT). No. of bitstreams: 1 U0001-2010202115270900.pdf: 2402441 bytes, checksum: a4294ea0c9842a7f67c755bed9342c44 (MD5) Previous issue date: 2021 | en |
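| dc.description.tableofcontents | Acknowledgements; Chinese Abstract; Abstract; Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Materials (Taiwan Biobank; Methylation quantification measures; Missing data mechanisms; Data processing); Chapter 3 Methods (Unit imputation; K-nearest neighbors imputation; Multiple imputation by chained equations); Chapter 4 Simulation and Results (Evaluation metrics; Data simulation; Imputation results under the missing completely at random mechanism); Chapter 5 Conclusions and Discussion; Appendix; References | |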
| dc.language.iso | zh-TW | |
| dc.subject | Gene methylation | zh_TW |
| dc.subject | Taiwan Biobank | zh_TW |
| dc.subject | Data imputation | zh_TW |
| dc.subject | K-nearest neighbors imputation | zh_TW |
| dc.subject | Multiple imputation by chained equations | zh_TW |
| dc.subject | Multiple Imputation by Chained Equations | en |
| dc.subject | Imputation | en |
| dc.subject | DNA methylation | en |
| dc.subject | Taiwan Biobank | en |
| dc.subject | K Nearest Neighbors impute algorithm | en |
| dc.title | A comparative study of imputation methods for DNA methylation | zh_TW |
| dc.title | A comparative study on DNA methylation imputation methods | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master's | |
| dc.contributor.advisor-orcid | 林菀俞(0000-0002-3385-4702) | |
| dc.contributor.oralexamcommittee | 蕭朱杏(Hsin-Tsai Liu),洪弘(Chih-Yang Tseng) | |
| dc.subject.keyword | Data imputation, Gene methylation, Taiwan Biobank, K-nearest neighbors imputation, Multiple imputation by chained equations | zh_TW |
| dc.subject.keyword | Imputation, DNA methylation, Taiwan Biobank, K Nearest Neighbors impute algorithm, Multiple Imputation by Chained Equations | en |
| dc.relation.page | 33 | |
| dc.identifier.doi | 10.6342/NTU202103931 | |
| dc.rights.note | Authorization granted (campus-only access) | |
| dc.date.accepted | 2021-10-22 | |
| dc.contributor.author-college | College of Public Health | zh_TW |
| dc.contributor.author-dept | Institute of Epidemiology and Preventive Medicine | zh_TW |
| dc.date.embargo-lift | 2024-09-01 | - |
| Appears in collections: | Institute of Epidemiology and Preventive Medicine | |

Files in this item:

| File | Size | Format | |
|---|---|---|---|
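| U0001-2010202115270900.pdf Access restricted to NTU campus IP addresses (from off campus, please use the VPN remote-access service) | 2.35 MB | Adobe PDF | |

Unless their copyright terms are specifically stated, all items in this system are protected by copyright, with all rights reserved.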
