A Chemical Substituents-Core Combinatorial Optimization and hERG Toxicity Prediction in Drug Design

Bo-Han Su; 蘇柏翰

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/64697

Title:	A Chemical Substituents-Core Combinatorial Optimization and hERG Toxicity Prediction in Drug Design (新藥設計及化學結構組合最佳化與心毒性模擬)
Author:	Bo-Han Su 蘇柏翰
Advisor:	曾宇鳳
Keywords:	hERG, Binary QSAR model, SVM, Template-based, de novo design, Type II kinase inhibitors, NP-complete, Dynamic programming, Natural products
Year of publication:	2012
Degree:	Doctoral
Abstract:	To accelerate the traditional drug design process, this thesis presents three studies, one for each of the stages in which computer-aided drug discovery contributes most: lead discovery, lead optimization, and absorption, distribution, metabolism, excretion, and toxicity (ADMET) testing. In the ADMET stage, predictive toxicology (T) modeling was performed to develop a series of in silico classification models for toxicity endpoints of the human ether-à-go-go-related gene (hERG) potassium channel. Two strategies, binary quantitative structure-activity relationship (QSAR) modeling and support vector machines (SVM), were applied to three typical yet difficult types of hERG toxicity data.
The three training datasets comprise a traditional dose-response dataset collected from the literature, a high-throughput screening (HTS) assay with a highly imbalanced active-inactive data distribution, and a notoriously difficult hERG endpoint dataset that other researchers have claimed to be impossible to model. For the traditional dose-response data, a binary QSAR model transformed from a continuous partial least squares (PLS) QSAR model was developed to enhance the predictive reliability for hERG blockage. In the second hERG study, SVM was combined with training-set compound pruning strategies from the proposed protocol to reduce the imbalance between active and inactive compounds that is typical of HTS assays. In the last hERG study, the two preceding strategies were applied to construct, analyze, and discuss QSAR models for the third hERG dataset, as a demonstration of a standard procedure for establishing a good QSAR model on a large, chemically diverse, and highly imbalanced dataset (the notoriously difficult endpoints). The accuracies of the three QSAR models are all markedly better than those of previously published classification models. In addition, the key descriptors, which can be portrayed as 3D molecular features that increase or decrease hERG potency, were identified and discussed. The models can guide chemical structure modification during drug discovery so as to avoid synthesizing cardiotoxic compounds that act on the hERG channel protein. In lead optimization, a novel template-based de novo design algorithm and protocol for type II kinase inhibitors is reported. Known potent inhibitors were used as templates to design new inhibitors that preserve the original binding interactions. First, Sorafenib (Nexavar®) and Nilotinib (Tasigna®) were selected as the template compounds. The proposed five-step protocol can reassemble each drug from a large fragment library.
Our procedure demonstrates that the selected template compounds can be successfully reassembled while the key ligand-receptor interactions are preserved. Furthermore, to demonstrate that the algorithm can construct more potent compounds, the de novo optimization was initiated from a template compound with less-than-optimal activity, chosen from a series of aminoisoquinoline derivatives, in order to identify the more active structures in that series. In lead discovery, to identify the chemical structures of natural products effectively, we studied an NP-complete problem that we call the Chemical Substituents-Core Combinatorial Problem. We developed a classical dynamic programming algorithm and then designed an iterative dynamic programming algorithm to speed it up, so that it runs in polynomial time in the average case. Four natural products whose structures had been experimentally measured and determined were used to verify the time performance of the algorithm and to confirm that it finds the optimal structure. The execution time of the algorithm is far shorter than that of the brute-force algorithm and of the previously published branch-and-bound strategy. Furthermore, it finds the true optimal solution within an acceptable time for all four cases, whereas the previous branch-and-bound strategy may search at length and still fail to find the optimum.
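As an illustration of the idea of turning a continuous QSAR model into a binary one, the following minimal Python sketch thresholds continuous potency predictions into active/inactive labels and scores them with balanced accuracy, a metric that remains informative when actives and inactives are highly imbalanced. It is not the thesis's implementation: the cutoff value, the function names, and the example numbers are all assumptions made for illustration, and the record does not state which threshold or metrics were actually used.

```python
from typing import List, Tuple

# Hypothetical cutoff: pIC50 >= 5.0 (IC50 <= 10 uM) is labeled a hERG blocker.
# This value is an assumption; the record does not give the real threshold.
PIC50_CUTOFF = 5.0


def to_binary(pic50_values: List[float], cutoff: float = PIC50_CUTOFF) -> List[int]:
    """Map continuous pIC50 values to 1 (active) or 0 (inactive)."""
    return [1 if value >= cutoff else 0 for value in pic50_values]


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (true positives, false positives, true negatives, false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def balanced_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Mean of sensitivity and specificity; robust to class imbalance."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Made-up example values: measured pIC50 and continuous model predictions.
measured = [6.2, 4.1, 5.5, 3.9, 4.8]
predicted = [5.8, 4.5, 4.9, 4.2, 5.1]

labels_true = to_binary(measured)
labels_pred = to_binary(predicted)
print(confusion_counts(labels_true, labels_pred))
print(balanced_accuracy(labels_true, labels_pred))
```

In a real workflow the continuous predictions would come from a fitted regression model such as PLS, and the cutoff would be chosen from the assay's definition of an active compound.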
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/64697
Full-text license:	Paid license
Appears in department:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-101-1.pdf (not currently authorized for public access)	3.11 MB	Adobe PDF
