Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86958
Full metadata record
DC field: value [language]
dc.contributor.advisor: 李奕霈 [zh_TW]
dc.contributor.advisor: Yi-Pei Li [en]
dc.contributor.author: 楊筑宜 [zh_TW]
dc.contributor.author: Chu-I Yang [en]
dc.date.accessioned: 2023-05-02T17:04:27Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-05-02
dc.date.issued: 2022
dc.date.submitted: 2023-01-09
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86958
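dc.description.abstract: Machine learning for chemical property prediction has attracted considerable attention in recent years. However, model performance depends heavily on the size and quality of the training set. Earlier work in chemistry has used uncertainty quantification to define the confidence of predicted molecular properties, and has shown that additionally training a model on the data it considers unfamiliar lets its predictive performance improve autonomously. Understanding the uncertainty that a model predicts, however, remains unexplored. This study develops an atom-based model architecture that, in molecular property prediction tasks, distributes both the predicted molecular property and the uncertainty of that prediction over the atoms. The model separately quantifies two kinds of uncertainty: aleatoric uncertainty, which describes error arising from the quality of the training set itself, and epistemic uncertainty, which describes the model's uncertainty about a prediction owing to the limits of its predictive ability. This architecture gives the model explainability. Inspecting the predicted atomic properties within a molecule reveals the model's prediction behavior; the two atomic uncertainties attribute a poor prediction to particular types of atoms or functional groups in the molecule, and indicate whether the difficulty in predicting that molecular structure stems from noise in the training data or from a lack of training data for related molecular structures. In addition, this study proposes a post-processing method that calibrates the aleatoric uncertainty predicted under the Deep Ensembles approach, so that the aleatoric uncertainty performs better. In summary, this study proposes a calibration method that improves uncertainty quality and a model architecture that quantifies atomic properties and atomic uncertainties, making it possible to understand the model's prediction behavior and the reasons for its poor predictions. [zh_TW]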
dc.description.abstract: Recent advances in machine learning have opened the door to rapid prediction of molecular properties of interest. However, the performance of data-driven methods on molecular property prediction is not always satisfactory because of the limited size and quality of chemical datasets. Therefore, uncertainty quantification for data-driven approaches has recently gained attention in the chemistry community. Previous studies have successfully quantified uncertainty for molecular property predictions and have shown that providing new training data for uncertain samples can improve model performance. However, rationalizing the uncertainty value predicted by the model remains a challenging task. In this work, we design an atom-based framework that attributes the uncertainty to the chemical structures present in a molecule by quantifying both atomic uncertainties and atomic contributions to the molecular property. Moreover, this method can separate aleatoric and epistemic uncertainties, which capture noise inherent in the data and uncertainty in the model predictions, respectively. In the atom-based framework, both aleatoric and epistemic uncertainties become more explainable on the basis of the substructures in a molecule. Furthermore, we propose a post-hoc method for recalibrating the aleatoric uncertainty of the ensemble method for better confidence interval quantification. Overall, this method not only improves uncertainty calibration but also provides a framework for better assessing whether and why a prediction can be considered unreliable. [en]
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-05-02T17:04:27Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2023-05-02T17:04:27Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents vi
Figure Index viii
Table Index xi
1. Introduction 1
2. Methods 6
2.1 Approximate Aleatoric Uncertainty 6
2.2 Approximate Epistemic Uncertainty 8
2.3 Post-hoc calibration after Deep Ensembles 10
2.4 Molecule- and Atom-based Uncertainty Models 13
2.4.1 Message Passing Phase 13
2.4.2 Readout Phase 14
2.5 Evaluation Metrics 18
2.6 Benchmark Data and Model 21
3. Results 23
3.1 Post-hoc calibration for Deep Ensembles 23
3.2 Prediction Accuracy and Uncertainty Performance 28
3.3 Atomic Uncertainty 30
3.3.1 Atomic Aleatoric Uncertainty 33
3.3.2 Atomic Epistemic Uncertainty 36
4. Conclusions and Future Work 40
5. Declarations 42
6. References 43
7. Supporting Information 48
7.1 Availability of materials 48
7.2 Correlation coefficient matrix 48
7.3 Confidence-based Calibration Curve 51
7.4 Error-based Calibration Curve 55
7.5 Initial Featurization 59
dc.language.iso: en
dc.subject: uncertainty quantification [zh_TW]
dc.subject: explainable AI [zh_TW]
dc.subject: ensemble learning [zh_TW]
dc.subject: probability calibration [zh_TW]
dc.subject: molecular property prediction [zh_TW]
dc.subject: aleatoric uncertainty [zh_TW]
dc.subject: epistemic uncertainty [zh_TW]
dc.subject: probability calibration [en]
dc.subject: uncertainty quantification [en]
dc.subject: aleatoric uncertainty [en]
dc.subject: epistemic uncertainty [en]
dc.subject: molecular property prediction [en]
dc.subject: explainable AI [en]
dc.subject: Deep Ensembles [en]
dc.title: Atomic Uncertainty Quantification and Calibration for Deep Learning-Based Molecular Property Prediction [zh_TW]
dc.title: Calibrated Atomic Uncertainty Quantifications for Deep Learning-Based Molecular Property Prediction [en]
dc.type: Thesis
dc.date.schoolyear: 111-1
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 徐振哲; 林立強 [zh_TW]
dc.contributor.oralexamcommittee: Cheng-Che Hsu; Li-Chiang Lin [en]
dc.subject.keyword: uncertainty quantification, aleatoric uncertainty, epistemic uncertainty, molecular property prediction, explainable AI, ensemble learning, probability calibration [zh_TW]
dc.subject.keyword: uncertainty quantification, aleatoric uncertainty, epistemic uncertainty, molecular property prediction, explainable AI, Deep Ensembles, probability calibration [en]
dc.relation.page: 61
dc.identifier.doi: 10.6342/NTU202300051
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2023-01-11
dc.contributor.author-college: College of Engineering
dc.contributor.author-dept: Department of Chemical Engineering
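
The abstract in this record describes separating aleatoric and epistemic uncertainty under Deep Ensembles. As a minimal sketch, assuming the commonly used ensemble decomposition (each member predicts a mean and a variance; aleatoric uncertainty is taken as the average predicted variance and epistemic uncertainty as the variance of the member means), the split for a single molecule might look like the following. The function name `decompose_uncertainty` and the sample numbers are hypothetical and are not taken from the thesis, which additionally distributes both quantities over atoms and recalibrates the aleatoric term.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Combine ensemble members' (mean, variance) outputs for one molecule.

    Assumed decomposition (illustrative, not the thesis code):
      aleatoric = average of the members' predicted variances
      epistemic = population variance of the members' predicted means
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    prediction = fmean(means)      # ensemble-averaged property prediction
    aleatoric = fmean(variances)   # data noise as estimated by the members
    epistemic = pvariance(means)   # disagreement between the members
    return prediction, aleatoric, epistemic, aleatoric + epistemic


# Hypothetical outputs from a four-member ensemble for one molecule.
member_means = [1.0, 1.2, 0.8, 1.0]
member_variances = [0.04, 0.05, 0.03, 0.04]

pred, alea, epis, total = decompose_uncertainty(member_means, member_variances)
print(f"prediction={pred:.3f} aleatoric={alea:.3f} epistemic={epis:.3f} total={total:.3f}")
```

Under this assumed decomposition, a large epistemic term with a small aleatoric term suggests the members disagree because similar structures are missing from the training data, whereas a large aleatoric term points to noisy labels.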
Appears in Collections: Department of Chemical Engineering

Files in This Item:
File: ntu-111-1.pdf
Size: 20.81 MB
Format: Adobe PDF