Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90064
Title: | Protein Thermostability Prediction by Graph Neural Network with Dynamics-Informed Graph Representation of Proteins (以蛋白質動力輔助圖神經網路預測蛋白質熱穩定性) |
Author: | Yen-Lin Chen (陳諺霖) |
Advisor: | Shu-Wei Chang (張書瑋) |
Keywords: | proteins, thermostability, melting temperature, normal mode analysis, graph neural networks, deep learning, regression activation map |
Publication Year: | 2023 |
Degree: | Master's |
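Abstract: | Protein thermostability is the ability of a protein to maintain its functional conformation at extreme temperatures, and it can be quantified by the two-state transition between the folded state and the unfolded state. Most proteins are folded at the temperature of their native environment; as the system heats up, the number of unfolded proteins increases. In this dynamic process, the temperature at which the two conformations are present in equal numbers is the protein's melting temperature, an important indicator of protein thermostability. Because industrial applications often require enzymes to operate in non-native, high-temperature environments, the melting temperature is one of the key indicators of a protein's industrial applicability and a major consideration when designing or screening industrial enzymes.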
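Previous studies have demonstrated the feasibility of encoding the structural and dynamic features of proteins as graphs and using graph neural networks (GNN) to predict protein function. Building on this, the present study connects protein thermostability, protein function, and protein structural dynamics, and shows how structural and dynamic information can be used to predict protein melting temperature. To remain applicable to proteins whose experimental structures have not yet been solved, this study uses AlphaFold predictions as the protein structures. From each structure, a torsional network model (TNM) is built, and its normal modes are obtained from this mechanical model for the subsequent calculation of protein dynamic coupling. Finally, protein structure is represented by a contact graph and a predicted aligned error (PAE) graph, while protein dynamics are represented by a co-directionality graph, a coordination graph, and a deformation graph. The results show that, with proteins encoded as graphs in this way and the melting temperature predicted by a graph neural network, the predictions compared with experimental measurements have a mean absolute error of 3.291 °C, a root-mean-square error of 4.286 °C, and an R^2 of up to 0.805. This study also uses feature-visualization techniques from image recognition to examine in reverse which residues in a protein have greater influence on the model's predictions, and to identify how important each type of graph in the data is to the model's predictions. This information again points to the importance of protein dynamics for thermostability and suggests potentially key regions for improving thermostability, which can serve as a reference for improving the industrial performance of proteins and as guidance for future research.
Protein thermostability, the resistance or preservation of protein functions under extreme temperatures, plays a vital role in numerous biotechnological applications. Since designed proteins, such as industrial enzymes and biocatalysts, are often subjected to temperatures that differ significantly from the cellular environment, thermostability has always been a critical consideration when designing proteins or searching for proteins suitable for a specific task. Commonly simplified as a two-state transition, protein thermostability is primarily characterized by the melting temperature, at which the folded and unfolded states are equally favorable. This work focuses on predicting the melting temperature of proteins. As protein dynamics is essential to understanding protein function, an effective data representation that includes the dynamics should benefit melting temperature prediction. In this work, a graph-based (as in graph theory) representation of proteins that encompasses the protein sequence, structure, and dynamics is presented. A graph neural network architecture that uses message-passing layers was designed to accommodate multiple types of connections.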
Protein structures were computed by AlphaFold, and the dynamics were computed based on the torsional network model (TNM) for training. Hence, the learned features and parameters can be readily applied to protein sequences without known experimental structures, which serves the goal of aiding predictions for designed proteins. Critical regions that strongly influence the thermostability of proteins are identified by computing a graph regression activation map (RAM), which is based on the partial derivative of the predicted value with respect to the convolutional feature map. The method provides an efficient approach to assessing the thermostability of new protein sequences. Further, it provides insights into the inner workings of proteins by identifying residues critical to thermostability. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90064 |
DOI: | 10.6342/NTU202303848 |
Full-Text Authorization: | Authorized (open access worldwide) |
Appears in Collections: | Department of Civil Engineering |
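
The two-state picture used in the abstract to define the melting temperature can be illustrated with a short sketch. It is not taken from the thesis: it assumes a constant unfolding enthalpy (zero heat-capacity change), so that the unfolding free energy is ΔG(T) = ΔH_m (1 − T/T_m), and the function names and the example values (T_m = 330 K, ΔH_m = 400 kJ/mol) are hypothetical. Under these assumptions the folded fraction is exactly one half at T = T_m, which is the definition quoted above.

```python
import math

# Gas constant in kJ/(mol*K).
R = 8.314462618e-3


def unfolding_free_energy(temp_k, tm_k, dh_m):
    """Unfolding free energy in kJ/mol.

    Assumption (not from the thesis): constant unfolding enthalpy dh_m,
    i.e. zero heat-capacity change, so dG(T) = dh_m * (1 - T / Tm).
    """
    return dh_m * (1.0 - temp_k / tm_k)


def fraction_folded(temp_k, tm_k, dh_m):
    """Equilibrium fraction of folded protein in a two-state model."""
    dg = unfolding_free_energy(temp_k, tm_k, dh_m)
    # Equilibrium constant K = [unfolded] / [folded].
    k_unfold = math.exp(-dg / (R * temp_k))
    return 1.0 / (1.0 + k_unfold)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    tm_k, dh_m = 330.0, 400.0
    for temp_k in (300.0, 320.0, 330.0, 340.0, 360.0):
        print(f"T = {temp_k:5.1f} K  folded fraction = "
              f"{fraction_folded(temp_k, tm_k, dh_m):.4f}")
```

A larger ΔH_m makes the transition around T_m sharper in this model; the melting temperature itself is the single number that the thesis's graph neural network is trained to predict.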
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf | 1.8 MB | Adobe PDF | View/Open |
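Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.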