Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/67491
Title: | Incorporating structure-based functional site prediction in predicting deleterious protein coding region variation in human genome |
Author: | Pei-Hsuan Chen (陳佩萱) |
Advisor: | An-Suei Yang (楊安綏) |
Keywords: | annotating variation, amino acid substitution, protein variation, deleterious variation, machine learning, support vector machine |
Year of publication: | 2017 |
Degree: | Master's |
Abstract: | As high-throughput techniques advance and sequencing projects generate massive amounts of sequence variation data, applying computational methods to annotate these variations has become an issue of growing concern. Existing methods exploit sequence-based or structure-based information to interpret the effects of variations, and most of them relate these effects to the functional disruption of a protein or to disease pathogenicity. Here we present a method that integrates sequence-based information with ISMBLab functional site prediction to identify deleterious amino acid substitutions that disrupt protein function. We collected 8,884 protein variants from the VIPUR training set, each with clear experimental evidence on whether it disrupts protein function, to train an SVM classifier. The results show that our classifier generalizes to other organisms, with better areas under the ROC and PR curves. In a comparison with other methods on the human variant test set, our classifier achieves a Matthews correlation coefficient of 0.405. In summary, we provide a method that incorporates structure-based functional site prediction to predict the effects of amino acid substitutions on protein function, and we demonstrate that features derived from ISMBLab functional site prediction are useful for predicting the effects of protein variations. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/67491 |
DOI: | 10.6342/NTU201702281 |
Full-text license: | Paid license |
Appears in collections: | Genome and Systems Biology Degree Program |
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-106-1.pdf (not currently authorized for public access) | 2.51 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated.