Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96777

| Title: | Interpretable graph neural networks for predicting P-glycoprotein substrates |
| Author: | 許洸誠 Kuang-Cheng Hsu |
| Advisor: | 曾宇鳳 Yufeng Jane Tseng |
| Keywords: | P-glycoprotein, Computer-aided Drug Design, Graph Neural Network, Explainable AI, Attention Mechanism, Integrated Gradient, Deep Learning |
| Year of publication: | 2024 |
| Degree: | Master's |
| Abstract: | P-glycoprotein (P-gp), a member of the ABC transporter family, is located on cell membranes; it binds to various foreign substances and actively transports them out of cells, thereby reducing their absorption. Substances that interact with P-gp and are subsequently expelled are termed P-gp substrates. Because P-gp is distributed widely throughout the body, including in the blood-brain barrier, determining whether a drug is a P-gp substrate and understanding its ability to cross the blood-brain barrier are crucial steps in drug development. In our study, we developed a protocol that uses graph neural network (GNN) models to predict P-gp substrates accurately while maintaining interpretability. Specifically, we explored three GNN architectures: the graph convolutional neural network (GCN), AttentiveFP, and an ensemble model based on AttentiveFP. We focused on predicting whether a given drug molecule is a P-gp substrate. We curated a dataset of 1995 drug molecules, comprising 1202 P-gp substrates and 793 P-gp non-substrates, which was split into a training set and a testing set at a 9:1 ratio. Our approach outperformed traditional machine learning models, achieving an area under the receiver operating characteristic curve (ROC-AUC) of 0.848 and an accuracy of 0.815. Using the integrated gradient method, we identified 6 critical substructures associated with P-gp substrates, consistent with the findings of previous studies. Beyond its prediction performance, the protocol provides transparent insights for drug developers, making it a valuable tool for drug development and optimization beyond P-gp substrate prediction. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96777 |
| DOI: | 10.6342/NTU202500022 |
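| Full-text authorization: | Not authorized |
| Electronic full-text release date: | N/A |
| Appears in department: | Department of Computer Science and Information Engineering |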
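
The abstract in this record attributes the identified substructures to the integrated gradient method. As a rough illustration of that technique only, the sketch below computes integrated gradients for a toy logistic scorer, which stands in for the thesis's GNN models. The function names, weights, and all-zeros baseline are hypothetical choices made for this example and are not taken from the thesis.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy differentiable scorer (assumption): logistic regression over features x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    """Analytic gradient of the model output with respect to each input feature."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    Gradients are averaged along the straight path from `baseline` to `x`,
    then scaled by the per-feature difference (x_i - baseline_i).
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        grad = model_grad(point, w, b)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b0) * t / steps for xi, b0, t in zip(x, baseline, total)]

# Hypothetical feature vector, weights, and baseline for illustration.
x = [1.0, 0.0, 2.0]
baseline = [0.0, 0.0, 0.0]
w = [0.8, -1.5, 0.3]
b = -0.2

attributions = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions should sum to model(x) - model(baseline).
gap = sum(attributions) - (model(x, w, b) - model(baseline, w, b))
print(attributions, gap)
```

In a molecular setting the features would be atom or bond descriptors and the attributions would be aggregated per atom to highlight substructures; how the thesis performs that aggregation is not stated in this record.
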
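Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-1.pdf (not authorized for public access) | 2.03 MB | Adobe PDF | |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.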
