Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88142
Title: | 以機器學習方法預測人類粒線體核糖體DNA變異之致病性 Using machine learning methods to predict the pathogenicity of human mitochondrial ribosomal DNA mutations |
Author: | 鄭佳芳 Chia-Fang Cheng |
Advisor: | 賴飛羆 Fei-Pei Lai |
Keywords: | Human mitochondrial ribosomal DNA, machine learning, XGB (extreme gradient boosting), feature integration, pathogenicity predictor |
Publication year: | 2023 |
Degree: | Master's |
Abstract: | This study presents a comprehensive analysis for predicting the pathogenicity of human mitochondrial ribosomal DNA (mt-rDNA) variants. We introduce a novel approach based on an extreme gradient boosting (XGB) model with feature integration. It combines multiple factors: homogeneity, heterogeneity, allele frequency, heteroplasmy level, the rate of benign or pathogenic change caused by a variant, the variability and complexity of nucleotide mutations as measured by nucleotide mutation entropy, and changes in sequence information caused by nucleotide mutations (such as structural changes and the presence of keto-amino bases). We also use SHAP (SHapley Additive exPlanations) to identify which features matter most to the model's pathogenicity predictions. No prediction method specifically targeting human mt-rDNA variants has been published to date; ours is the first, and it achieves an F1 score of 0.9886 on the evaluation dataset. By applying machine learning and accounting for the unique characteristics of mt-rDNA, our approach provides a valuable tool for accurately predicting the pathogenicity of mt-rDNA variants. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88142 |
DOI: | 10.6342/NTU202301584 |
Full-text authorization: | Not authorized |
Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
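
Two quantities named in the abstract, an entropy over nucleotides and the F1 score, can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not define "nucleotide mutation entropy", so the sketch assumes plain Shannon entropy over the bases observed at one position, and the function names `shannon_entropy` and `f1_score` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(bases: str) -> float:
    """Shannon entropy (in bits) of the nucleotide distribution in `bases`.

    Assumption: stands in for the abstract's "nucleotide mutation entropy",
    whose exact definition is not given in this record.
    """
    counts = Counter(bases.upper())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

A fully conserved position (for example `"AAAA"`) has zero entropy under this definition, while a position where all four bases are equally frequent reaches the maximum of 2 bits.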
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf (not currently authorized for public access) | 3.95 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.