Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88002
Title: | 蛋白質結構預測模型之比較分析及量子啟發式計算於天然物結構解析問題之探索 Comparative Analysis of Protein Modelling Methods and Exploration of Quantum-Inspired Computing for Natural Product Structure Elucidation |
Author: | 李謙 Chien Lee |
Advisor: | 曾宇鳳 Yufeng Jane Tseng |
Keywords: | Protein, Structure Prediction, Protein Modeling, AlphaFold, RoseTTAFold, Modeller, G-Protein Coupled Receptors, Quantum Computing, Digital Annealing, Natural Product, Structure Elucidation |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | This thesis consists of two independent chapters, each on a separate topic.
The first chapter evaluates neural network-based protein modeling methods. In recent years, such methods, notably AlphaFold and RoseTTAFold, have substantially improved overall prediction accuracy. However, the performance of these non-homology-based methods on specific protein families has not been thoroughly examined. G-protein-coupled receptors (GPCRs) are of particular interest because of their involvement in numerous biological pathways. This study directly compares AlphaFold and RoseTTAFold with Modeller, the most widely used template-based software, on GPCRs. We collected experimentally determined structures of 73 GPCRs from the Protein Data Bank and used the official AlphaFold repository and the RoseTTAFold web service, both with default settings, to predict five structures for each protein sequence.
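Chapter 1 scores each predicted model by its root-mean-square deviation (RMSD) from the experimental structure. The sketch below only illustrates the metric itself and is not the thesis's evaluation pipeline: it assumes the two structures have already been superposed and that atoms are paired one to one (for example, matching Cα atoms). The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points.

    Assumption: the structures are already superposed and the i-th
    point in coords_a corresponds to the i-th point in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy data (made up): every predicted atom is displaced by 1 Å along z.
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
toy_rmsd = rmsd(experimental, predicted)
```

In practice the superposition step matters as much as the formula: an RMSD computed without first aligning the two structures mostly reflects their arbitrary placement in space rather than modeling error.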
The predicted models were aligned with the experimentally solved structures and evaluated by root-mean-square deviation (RMSD). Considering only each program's top-scored structure, Modeller had the smallest average RMSD, 2.17 Å, likely because its templates already included many known structures close to the targets. However, where no good templates were available to Modeller, the neural network-based methods outperformed it: with the top-scored model, AlphaFold and RoseTTAFold generated better models in 21 and 15 of the 73 cases, respectively. The larger RMSD values of the neural network-based methods were primarily due to loop regions (segments without regular secondary structure) that differed from the crystal structures. These findings suggest that neural network-based protein modeling methods have great potential in cases where high-quality templates are unavailable and template-based methods may not be suitable.
The second chapter explores the potential of quantum and quantum-inspired computing for natural product structure elucidation. Determining the structures of natural products is critical in medicinal chemistry and chemical biology because it can inspire the development of new drugs. However, the identification process is often time-consuming and limited by the scarcity of data on known natural products. In this second study, we present a computer-aided structure elucidation (CASE) algorithm that uses quantum-inspired digital annealing technology. The computational problem formulation and a dynamic programming-based algorithm were provided in previous work. We extended that work by transforming the problem formulation into a quadratic unconstrained binary optimization (QUBO) model that can be input to the FUJITSU Digital Annealer (DA), which is designed to solve combinatorial optimization problems efficiently.
The results show that the DA found all the correct structures of the natural products of Ophiopogon japonicus. Although the digital annealing routine needs more time to reach correct solutions for more complicated structures, the DA demonstrated the potential of solving combinatorial problems with a novel computing framework. Our work represents a promising step toward using quantum computing to solve one of the major problems in computational chemistry. |
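Chapter 2 recasts structure elucidation as a QUBO problem: minimize the sum of Q_ij · x_i · x_j over binary variables x_i, with constraints folded into the objective as penalty terms. The sketch below shows that general idea on a toy problem and is not the thesis's formulation. It encodes "pick exactly one option at minimum cost" by adding the penalty P · (Σ x_i − 1)², and it uses exhaustive search as a stand-in for the Digital Annealer. All names, costs, and the penalty weight are hypothetical.

```python
from itertools import product


def one_hot_qubo(costs, penalty):
    """Upper-triangular QUBO coefficients for choosing exactly one option.

    Expanding cost·x + penalty·(sum(x) - 1)^2 with x_i^2 = x_i gives
    diagonal terms (cost_i - penalty), off-diagonal terms 2·penalty,
    and a constant offset of +penalty that is dropped here.
    """
    n = len(costs)
    q = {}
    for i in range(n):
        q[(i, i)] = costs[i] - penalty
        for j in range(i + 1, n):
            q[(i, j)] = 2.0 * penalty
    return q


def qubo_energy(q, x):
    """Evaluate the QUBO objective for a binary assignment x."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in q.items())


def brute_force_minimum(q, n):
    """Enumerate all 2^n assignments; only feasible for very small n."""
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(q, x))


costs = [3.0, 1.0, 2.0]  # made-up costs for three candidate options
penalty = 10.0           # must outweigh any cost saved by breaking the constraint
q = one_hot_qubo(costs, penalty)
best = brute_force_minimum(q, len(costs))
```

Exhaustive search grows as 2^n, which is why an annealing-style heuristic becomes attractive for the larger QUBO instances that complicated structures would produce.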
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88002 |
DOI: | 10.6342/NTU202300843 |
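Full-Text Authorization: | Authorized (campus access only) |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |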
---|---|---|---|
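ntu-111-2.pdf Access restricted to NTU campus IP addresses (off campus, please use the VPN remote access service) | 2.81 MB | Adobe PDF | View/Open |
Unless their copyright terms are specifically indicated otherwise, all items in this system are protected by copyright, with all rights reserved.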