Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65830
Title: | 區域結構碼於改良蛋白質摺疊辨識法之結構比對研究-以貝氏推論架構為基礎 (Bayesian-Inferred Local Structural Alphabets for Improving Protein Structure Alignment of Fold Recognition) |
Author: | Kenneth Hung (洪基展) |
Advisor: | Chung-Ming Chen (陳中明) |
Keywords: | structure prediction, fold recognition, structural alphabet, Bayes' theorem, structure alignment, local structure, geometric invariant |
Publication year: | 2012 |
Degree: | Doctoral |
Abstract: | Fold recognition is one of the more widely used approaches to protein structure prediction. It builds models quickly from existing three-dimensional structural information, and the quality of the resulting model depends on how well the target protein is aligned to template structures whose structure is known. The crucial step in template-based protein structure prediction is to recognize, through alignment, proteins that have similar tertiary structures, because this step governs the predictive power of the whole method. The alignment step of fold recognition typically exploits specific structural features to select the optimal alignment. Alignment becomes considerably harder when the target and template share less than 30% sequence identity yet adopt similar tertiary structures. In that case, properly combining the physicochemical properties of amino acids with sequence and structural information to define how local segments fold, and then assembling those segments into a complete protein structure, can improve prediction quality.
This study proposes a Bayesian framework for predicting structural alphabets. Fragments of a target protein of unknown structure are encoded so that three-dimensional structural information is converted into a one-dimensional structural alphabet sequence, and performance is then evaluated through structure alignment of the target and template sequences. We call the method MIRAGE-Bayesian alignment. The structural alphabet-based alignment incorporates local structural features to improve both alignment quality and computational efficiency. The spatial features used, namely the dihedral torsion angles of the protein backbone, geometric invariants, and distances between Cα atoms, are closely tied to molecular conformation and essentially determine the backbone conformation of proteins. The results show that the proposed algorithm improves both the sequence-structure alignment quality and the computational efficiency of fold recognition.
Three test sets were used to assess the algorithm: Fischer's test set of 68 protein pairs (sequence identity ranging from 8% to 31%) and two larger benchmarks of 600 and 17,119 non-homologous protein pairs (sequence identity below 30%). On these sets, the alignment quality of the proposed algorithm exceeds that of four widely used structure alignment algorithms (CE, SSM, TM-align, and Fr-TM-align), with notable gains. We therefore infer that the proposed algorithm can identify distantly related proteins, and we believe it can further contribute to the elucidation and annotation of protein function. |
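The idea of aligning proteins as one-dimensional structural alphabet sequences can be illustrated with a minimal sketch. It is not the MIRAGE-Bayesian method described in the abstract: it is a plain Needleman-Wunsch global alignment, swapped in for illustration. The letters (`A` to `D`), the function name `align_structural_alphabets`, and the `match`, `mismatch`, and `gap` scores are all hypothetical placeholders, since the record does not give the thesis's alphabet or scoring scheme.

```python
def align_structural_alphabets(a, b, match=2, mismatch=-1, gap=-2):
    """Globally align two structural-alphabet strings (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not the thesis's scheme.
    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical structural-alphabet strings for a target and a template.
total, target_row, template_row = align_structural_alphabets("ABCD", "ABD")
print(total)
print(target_row)
print(template_row)
```

In a real pipeline the identity-based `match`/`mismatch` scores would presumably be replaced by a substitution matrix over structural letters, and the target's letters would come from a predictor rather than from known coordinates; both of those pieces are outside what this record specifies.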
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65830 |
Full-text license: | Paid license |
Appears in collections: | Graduate Institute of Biomedical Engineering |
Files in this item:
File | Size | Format | |
---|---|---|---|
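ntu-101-1.pdf (not currently authorized for public access) | 16.57 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.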