Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97537

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李奕霈 | zh_TW |
| dc.contributor.advisor | Yi-Pei Li | en |
| dc.contributor.author | 陳隆奕 | zh_TW |
| dc.contributor.author | Lung-Yi Chen | en |
| dc.date.accessioned | 2025-07-02T16:21:26Z | - |
| dc.date.available | 2025-07-03 | - |
| dc.date.copyright | 2025-07-02 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-06-17 | - |
| dc.identifier.citation | [1] Shu Jiang, Zhuosheng Zhang, Hai Zhao, Jiangtong Li, Yang Yang, Bao-Liang Lu, and Ning Xia. When smiles smiles, practicality judgment and yield prediction of chemical reaction via deep chemical language processing. IEEE Access, 9:85071–85083, 2021.
[2] Daniel Probst, Philippe Schwaller, and Jean-Louis Reymond. Reaction classification and yield prediction using the differential reaction fingerprint drfp. Digital discovery, 1(2):91–97, 2022. [3] Mandana Saebi, Bozhao Nan, John E Herr, Jessica Wahlers, Zhichun Guo, Andrzej M Zurański, Thierry Kogej, Per-Ola Norrby, Abigail G Doyle, and Nitesh V Chawla. On the use of real-world datasets for reaction yield prediction. Chemical Science, 14(19):4997–5005, 2023. [4] Philippe Schwaller, Alain C Vaucher, Teodoro Laino, and Jean-Louis Reymond. Prediction of chemical reaction yields using deep learning. Machine learning: science and technology, 2(1):015016, 2021. [5] Connor W Coley, Regina Barzilay, Tommi S Jaakkola, William H Green, and Klavs F Jensen. Prediction of organic reaction outcomes using machine learning. ACS central science, 3(5):434–443, 2017. [6] Connor W Coley, Wengong Jin, Luke Rogers, Timothy F Jamison, Tommi S Jaakkola, William H Green, Regina Barzilay, and Klavs F Jensen. A graph-convolutional neural network model for the prediction of chemical reactivity. Chemical science, 10(2):370–377, 2019. [7] Kien Do, Truyen Tran, and Svetha Venkatesh. Graph transformation policy network for chemical reaction prediction. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pages 750–760, 2019. [8] David Fooshee, Aaron Mood, Eugene Gutman, Mohammadamin Tavakoli, Gregor Urban, Frances Liu, Nancy Huynh, David Van Vranken, and Pierre Baldi. Deep learning for chemical reaction prediction. Molecular Systems Design & Engineering, 3(3):442–452, 2018. [9] Philippe Schwaller, Teodoro Laino, Théophile Gaudin, Peter Bolgar, Christopher A Hunter, Costas Bekas, and Alpha A Lee. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS central science, 5(9):1572–1583, 2019. [10] Shuan Chen and Yousung Jung. 
A generalized-template-based graph neural network for accurate organic reactivity prediction. Nature Machine Intelligence, 4(9):772–780, 2022. [11] Connor W Coley, William H Green, and Klavs F Jensen. Machine learning in computer-aided synthesis planning. Accounts of chemical research, 51(5):1281–1289, 2018. [12] Connor W Coley, Luke Rogers, William H Green, and Klavs F Jensen. Computer-assisted retrosynthesis based on molecular similarity. ACS central science, 3(12):1237–1245, 2017. [13] Jingxin Dong, Mingyi Zhao, Yuansheng Liu, Yansen Su, and Xiangxiang Zeng. Deep learning in retrosynthesis planning: datasets, models and tools. Briefings in Bioinformatics, 23(1):bbab391, 2022. [14] John S Schreck, Connor W Coley, and Kyle JM Bishop. Learning retrosynthetic planning through simulated experience. ACS central science, 5(6):970–981, 2019. [15] Zhengkai Tu and Connor W Coley. Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction. Journal of chemical information and modeling, 62(15):3503–3513, 2022. [16] Weihe Zhong, Ziduo Yang, and Calvin Yu-Chian Chen. Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing. Nature Communications, 14(1):3009, 2023. [17] Tiantao Liu, Zheng Cao, Yuansheng Huang, Yue Wan, Jian Wu, Chang-Yu Hsieh, Tingjun Hou, and Yu Kang. Syncluster: Reaction type clustering and recommendation framework for synthesis planning. JACS Au, 3(12):3446–3461, 2023. [18] Marwin HS Segler, Mike Preuss, and Mark P Waller. Planning chemical syntheses with deep neural networks and symbolic ai. Nature, 555(7698):604–610, 2018. [19] Venkat Venkatasubramanian and Vipul Mann. Artificial intelligence in reaction prediction and chemical synthesis. Current Opinion in Chemical Engineering, 36:100749, 2022. [20] Lin Yao, Wentao Guo, Zhen Wang, Shang Xiang, Wentan Liu, and Guolin Ke. 
Node-aligned graph-to-graph: Elevating template-free deep learning approaches in single-step retrosynthesis. JACS Au, 2024. [21] Kevin Zhang, Vipul Mann, and Venkat Venkatasubramanian. G-matt: Single-step retrosynthesis prediction using molecular grammar tree transformer. AIChE Journal, 70(1):e18244, 2024. [22] Zipeng Zhong, Jie Song, Zunlei Feng, Tiantao Liu, Lingxiang Jia, Shaolun Yao, Tingjun Hou, and Mingli Song. Recent advances in deep learning for retrosynthesis. Wiley Interdisciplinary Reviews: Computational Molecular Science, 14(1):e1694, 2024. [23] Shuan Chen and Yousung Jung. Deep retrosynthetic reaction prediction using local reactivity and global attention. JACS Au, 1(10):1612–1620, 2021. [24] Lung-Yi Chen and Yi-Pei Li. Enhancing chemical synthesis: a two-stage deep neural network for predicting feasible reaction conditions. Journal of Cheminformatics, 16(1):1–14, 2024. [25] Hanyu Gao, Thomas J Struble, Connor W Coley, Yuran Wang, William H Green, and Klavs F Jensen. Using machine learning to predict suitable conditions for organic reactions. ACS central science, 4(11):1465–1476, 2018. [26] Youngchun Kwon, Sun Kim, Youn-Suk Choi, and Seokho Kang. Generative modeling to predict multiple suitable conditions for chemical reactions. Journal of Chemical Information and Modeling, 62(23):5952–5960, 2022. [27] Michael R Maser, Alexander Y Cui, Serim Ryou, Travis J DeLano, Yisong Yue, and Sarah E Reisman. Multilabel classification models for the prediction of cross-coupling reaction conditions. Journal of Chemical Information and Modeling, 61(1):156–166, 2021. [28] Derek T Ahneman, Jesús G Estrada, Shishi Lin, Spencer D Dreher, and Abigail G Doyle. Predicting reaction performance in c–n cross-coupling using machine learning. Science, 360(6385):186–190, 2018. [29] Yurui Chen and Louxin Zhang. How much can deep learning improve prediction of the responses to drugs in cancer cell lines? Briefings in bioinformatics, 23(1):bbab378, 2022. 
[30] Baiqing Li, Shimin Su, Chan Zhu, Jie Lin, Xinyue Hu, Lebin Su, Zhunzhun Yu, Kuangbiao Liao, and Hongming Chen. A deep learning framework for accurate reaction prediction and its application on high-throughput experimentation data. Journal of Cheminformatics, 15(1):1–12, 2023. [31] Jane Panteleev, Hua Gao, and Lei Jia. Recent applications of machine learning in medicinal chemistry. Bioorganic & medicinal chemistry letters, 28(17):2807–2815, 2018. [32] Lung-Yi Chen and Yi-Pei Li. Machine Learning Applications in Chemical Kinetics and Thermochemistry, pages 203–226. Springer International Publishing, 2023. [33] Daniel Lowe. Chemical reactions from us patents (1976-sep2016), 2017. [34] Steven M Kearnes, Michael R Maser, Michael Wleklinski, Anton Kast, Abigail G Doyle, Spencer D Dreher, Joel M Hawkins, Klavs F Jensen, and Connor W Coley. The open reaction database. Journal of the American Chemical Society, 143(45):18820–18826, 2021. [35] Nextmove software pistachio, 2023. [36] Reaxys, 2023. [37] Cas, scifinder-n, 2023. [38] Dana L Roth. Spresiweb 2.1, a selective chemical synthesis and reaction database, 2005. [39] Felix Strieth‐Kalthoff, Frederik Sandfort, Marius Kühnemund, Felix R Schäfer, Herbert Kuchen, and Frank Glorius. Machine learning for chemical reactivity: The importance of failed experiments. Angewandte Chemie International Edition, 61(29):e202204647, 2022. [40] Jules Schleinitz, Maxime Langevin, Yanis Smail, Benjamin Wehnert, Laurence Grimaud, and Rodolphe Vuilleumier. Machine learning yield prediction from nicolit, a small-size literature data set of nickel catalyzed c–o couplings. Journal of the American Chemical Society, 144(32):14722–14730, 2022. [41] Babak Mahjour, Jillian Hoffstadt, and Tim Cernak. Designing chemical reaction arrays using phactor and chatgpt. Organic Process Research & Development, 27(8):1510–1516, 2023. 
[42] Jason Y Wang, Jason M Stevens, Stavros K Kariofillis, Mai-Jan Tom, Dung L Golden, Jun Li, Jose E Tabora, Marvin Parasram, Benjamin J Shields, David N Primer, et al. Identifying general reaction conditions by bandit optimization. Nature, 626(8001):1025–1033, 2024. [43] Jason M Stevens, Jun Li, Eric M Simmons, Steven R Wisniewski, Stacey DiSomma, Kenneth J Fraunhoffer, Peng Geng, Bo Hao, and Erika W Jackson. Advancing base metal catalysis through data science: insight and predictive models for ni-catalyzed borylation through supervised machine learning. Organometallics, 41(14):1847–1864, 2022. [44] Alexander Buitrago Santanilla, Erik L Regalado, Tony Pereira, Michael Shevlin, Kevin Bateman, Louis-Charles Campeau, Jonathan Schneeweis, Simon Berritt, Zhi-Cai Shi, and Philippe Nantermet. Nanomole-scale high-throughput chemistry for the synthesis of complex molecules. Science, 347(6217):49–53, 2015. [45] Martin Fitzner, Georg Wuitschik, Raffael Koller, Jean-Michel Adam, and Torsten Schindler. Machine learning c–n couplings: Obstacles for a general-purpose reaction yield prediction. ACS omega, 8(3):3017–3025, 2023. [46] Yougen Xu, Feixiao Ren, Lebin Su, Zhaoping Xiong, Xinwei Zhu, Xinyuan Lin, Nan Qiao, Hao Tian, Changen Tian, and Kuangbiao Liao. Hte and machine learning-assisted development of iridium (i)-catalyzed selective o–h bond insertion reactions toward carboxymethyl ketones. Organic Chemistry Frontiers, 10(5):1153–1159, 2023. [47] Damith Perera, Joseph W Tucker, Shalini Brahmbhatt, Christopher J Helal, Ashley Chong, William Farrell, Paul Richardson, and Neal W Sach. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow. Science, 359(6374):429–434, 2018. [48] Samuel H Newman-Stonebraker, Sleight R Smith, Julia E Borowski, Ellyn Peters, Tobias Gensch, Heather C Johnson, Matthew S Sigman, and Abigail G Doyle. Univariate classification of phosphine ligation state and reactivity in cross-coupling catalysis. 
Science, 374(6565):301–308, 2021. [49] Brandon J Reizman, Yi-Ming Wang, Stephen L Buchwald, and Klavs F Jensen. Suzuki–miyaura cross-coupling optimization enabled by automated feedback. Reaction chemistry & engineering, 1(6):658–666, 2016. [50] Eric S Isbrandt, Devon E Chapple, Nguyen Thien Phuc Tu, Victoria Dimakos, Anne Marie M Beardall, Paul D Boyle, Christopher N Rowley, Johanna M Blacquiere, and Stephen G Newman. Controlling reactivity and selectivity in the mizoroki–heck reaction: High throughput evaluation of 1,5-diaza-3,7-diphosphacyclooctane ligands. Journal of the American Chemical Society, 146(8):5650–5660, 2024. [51] Nicholas H Angello, Vandana Rathore, Wiktor Beker, Agnieszka Wołos, Edward R Jira, Rafał Roszak, Tony C Wu, Charles M Schroeder, Alán Aspuru-Guzik, and Bartosz A Grzybowski. Closed-loop optimization of general reaction conditions for heteroaryl suzuki-miyaura coupling. Science, 378(6618):399–405, 2022. [52] Peter S Kutchukian, James F Dropinski, Kevin D Dykstra, Bing Li, Daniel A DiRocco, Eric C Streckfuss, Louis-Charles Campeau, Tim Cernak, Petr Vachal, and Ian W Davies. Chemistry informer libraries: a chemoinformatics enabled approach to evaluate and advance synthetic methods. Chemical Science, 7(4):2604–2613, 2016. [53] Matthew K Nielsen, Derek T Ahneman, Orestes Riera, and Abigail G Doyle. Deoxyfluorination with sulfonyl fluorides: navigating reaction space with machine learning. Journal of the American Chemical Society, 140(15):5004–5008, 2018. [54] Alexander Stadler and C Oliver Kappe. Automated library generation using sequential microwave-assisted chemistry. application toward the biginelli multicomponent condensation. Journal of combinatorial chemistry, 3(6):624–630, 2001. [55] Benjamin J Shields, Jason Stevens, Jun Li, Marvin Parasram, Farhan Damani, Jesus I Martinez Alvarado, Jacob M Janey, Ryan P Adams, and Abigail G Doyle. Bayesian reaction optimization as a tool for chemical synthesis. Nature, 590(7844):89–96, 2021. 
[56] Antimo Gioiello, Emiliano Rosatelli, Michela Teofrasti, Paolo Filipponi, and Roberto Pellicciari. Building a sulfonamide library by eco-friendly flow synthesis. ACS Combinatorial Science, 15(5):235–239, 2013. [57] Travis J DeLano and Sarah E Reisman. Enantioselective electroreductive coupling of alkenyl and benzyl halides via nickel catalysis. ACS catalysis, 9(8):6751–6754, 2019. [58] Zhiwei Zuo, Derek T Ahneman, Lingling Chu, Jack A Terrett, Abigail G Doyle, and David WC MacMillan. Merging photoredox with nickel catalysis: Coupling of α-carboxyl sp3-carbons with aryl halides. Science, 345(6195):437–440, 2014. [59] Rocío Mercado, Steven M Kearnes, and Connor W Coley. Data sharing in chemistry: lessons learned and a case for mandating structured reaction data. Journal of Chemical Information and Modeling, 63(14):4253–4265, 2023. [60] Priyanka Raghavan, Brittany C Haas, Madeline E Ruos, Jules Schleinitz, Maxime Langevin, Yanis Smail, Benjamin Wehnert, Laurence Grimaud, and Rodolphe Vuilleumier. Dataset design for building models of chemical reactivity. ACS Central Science, 9(12):2196–2204, 2023. [61] Timur R Gimadiev, Arkadii Lin, Valentina A Afonina, Dinar Batyrshin, Ramil I Nugmanov, Tagir Akhmetshin, Pavel Sidorov, Natalia Duybankova, Jonas Verhoeven, and Joerg Wegner. Reaction data curation i: chemical structures and transformations standardization. Molecular Informatics, 40(12):2100119, 2021. [62] William Lingran Chen, David Z Chen, and Keith T Taylor. Automatic reaction mapping and reaction center detection. Wiley Interdisciplinary Reviews: Computational Molecular Science, 3(6):560–593, 2013. [63] Arkadii Lin, Natalia Dyubankova, Timur I Madzhidov, Ramil I Nugmanov, Jonas Verhoeven, Timur R Gimadiev, Valentina A Afonina, Zarina Ibragimova, Assima Rakhimbekova, and Pavel Sidorov. Atom‐to‐atom mapping: a benchmarking study of popular mapping algorithms and consensus strategies. Molecular Informatics, 41(4):2100138, 2022. 
[64] Ramil Nugmanov, Natalia Dyubankova, Andrey Gedich, and Joerg Kurt Wegner. Bidirectional graphormer for reactivity understanding: neural network trained to reaction atom-to-atom mapping task. Journal of Chemical Information and Modeling, 62(14):3307–3315, 2022. [65] Philippe Schwaller, Benjamin Hoover, Jean-Louis Reymond, Hendrik Strobelt, and Teodoro Laino. Extraction of organic chemistry grammar from unsupervised learning of chemical reactions. Science Advances, 7(15):eabe4166, 2021. [66] Ramil I Nugmanov, Ravil N Mukhametgaleev, Tagir Akhmetshin, Timur R Gimadiev, Valentina A Afonina, Timur I Madzhidov, and Alexandre Varnek. Cgrtools: Python library for molecule, reaction, and condensed graph of reaction processing. Journal of chemical information and modeling, 59(6):2516–2521, 2019. [67] Alain C Vaucher, Philippe Schwaller, and Teodoro Laino. Completion of partial reaction equations. Chemrxiv, 2020. [68] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. arXiv preprint arXiv:1706.03762, 2017. [69] Alessandra Toniato, Philippe Schwaller, Antonio Cardinale, Joppe Geluykens, and Teodoro Laino. Unassisted noise reduction of chemical reaction datasets. Nature Machine Intelligence, 3(6):485–494, 2021. [70] Ian J Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, and Yoshua Bengio. An empirical investigation of catastrophic forgetting in gradient-based neural networks. arXiv preprint arXiv:1312.6211, 2013. [71] Uschi Dolfus, Hans Briem, and Matthias Rarey. Visualizing generic reaction patterns. Journal of Chemical Information and Modeling, 62(19):4680–4689, 2022. [72] Lung-Yi Chen. Autotemplate. [73] Rdkit: Open-source cheminformatics software. [74] David Fooshee, Alessio Andronico, and Pierre Baldi. Reactionmap: An efficient atom-mapping algorithm for chemical reactions. Journal of chemical information and modeling, 53(11):2812–2819, 2013. 
[75] Wojciech Jaworski, Sara Szymkuć, Barbara Mikulak-Klucznik, Krzysztof Piecuch, Tomasz Klucznik, Michał Kaźmierowski, Jan Rydzewski, Anna Gambin, and Bartosz A Grzybowski. Automatic mapping of atoms across both simple and complex chemical reactions. Nature communications, 10(1):1434, 2019. [76] Connor W Coley, William H Green, and Klavs F Jensen. Rdchiral: An rdkit wrapper for handling stereochemistry in retrosynthetic template extraction and application. Journal of chemical information and modeling, 59(6):2529–2537, 2019. [77] Daylight smarts documentation. [78] Edsger W Dijkstra. A note on two problems in connexion with graphs. In Edsger Wybe Dijkstra: His Life, Work, and Legacy, pages 287–290. 2022. [79] Babak A Mahjour and Connor W Coley. Rdcanon: A python package for canonicalizing the order of tokens in smarts queries. Journal of Chemical Information and Modeling, 2024. [80] Clemens Jochum, Johann Gasteiger, and Ivar Ugi. The principle of minimum chemical distance (pmcd). Angewandte Chemie International Edition in English, 19(7):495–505, 1980. [81] Shuan Chen, Sunggi An, Ramil Babazade, and Yousung Jung. Precise atom-to-atom mapping for organic reactions via human-in-the-loop machine learning. Nature Communications, 15(1):2250, 2024. [82] Kaspar Riesen, Xiaoyi Jiang, and Horst Bunke. Exact and inexact graph matching: Methodology and applications. Managing and mining graph data, pages 217–247, 2010. [83] Christopher D McNitt and Vladimir V Popik. Photochemical generation of oxadibenzocyclooctyne (odibo) for metal-free click ligations. Organic & biomolecular chemistry, 10(41):8200–8202, 2012. [84] Anthony Cook, A Peter Johnson, James Law, Mahdi Mirzazadeh, Orr Ravitz, and Aniko Simon. Computer-aided synthesis design: 40 years on. Wiley Interdisciplinary Reviews: Computational Molecular Science, 2(1):79–107, 2012. [85] Fan Feng, Luhua Lai, and Jianfeng Pei. Computational chemical synthesis analysis and pathway design. Frontiers in chemistry, 6:199, 2018. 
[86] Martha M Flores-Leonar, Luis M Mejía-Mendoza, Andrés Aguilar-Granda, Benjamin Sanchez-Lengeling, Hermann Tribukait, Carlos Amador-Bedolla, and Alán Aspuru-Guzik. Materials acceleration platforms: On the way to autonomous experimentation. Current Opinion in Green and Sustainable Chemistry, 25:100370, 2020. [87] Connor W Coley, Luke Rogers, William H Green, and Klavs F Jensen. Computer-assisted retrosynthesis based on molecular similarity. ACS central science, 3(12):1237–1245, 2017. [88] Bowen Liu, Bharath Ramsundar, Prasad Kawthekar, Jade Shi, Joseph Gomes, Quang Luu Nguyen, Stephen Ho, Jack Sloane, Paul Wender, and Vijay Pande. Retrosynthetic reaction prediction using neural sequence-to-sequence models. ACS central science, 3(10):1103–1113, 2017. [89] Tanja Gaich and Phil S Baran. Aiming for the ideal synthesis. The Journal of organic chemistry, 75(14):4657–4673, 2010. [90] Li He, Yilin Fan, Jérôme Bellettre, Jun Yue, and Lingai Luo. A review on catalytic methane combustion at low temperatures: Catalysts, mechanisms, reaction conditions and reactor designs. Renewable and Sustainable Energy Reviews, 119:109589, 2020. [91] Mikhail Andronov, Varvara Voinarovska, Natalia Andronova, Michael Wand, Djork-Arné Clevert, and Jürgen Schmidhuber. Reagent prediction with a molecular transformer improves reaction data quality. Chemical Science, 14(12):3235–3246, 2023. [92] Yulong Gu, Zhuoye Ding, Shuaiqiang Wang, Lixin Zou, Yiding Liu, and Dawei Yin. Deep multifaceted transformers for multi-objective ranking in large-scale e-commerce recommender systems. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pages 2493–2500, 2020. [93] Paul Covington, Jay Adams, and Emre Sargin. Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM conference on recommender systems, pages 191–198, 2016. 
[94] Zhe Zhao, Lichan Hong, Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, and Ed Chi. Recommending what video to watch next: a multitask ranking system. In Proceedings of the 13th ACM Conference on Recommender Systems, pages 43–51, 2019. [95] Maarten R Dobbelaere, István Lengyel, Christian V Stevens, and Kevin M Van Geem. Rxn-insight: fast chemical reaction analysis using bond-electron matrices. Journal of Cheminformatics, 16(1):37, 2024. [96] Qian-Nan Hu, Zhe Deng, Huanan Hu, Dong-Sheng Cao, and Yi-Zeng Liang. Rxnfinder: biochemical reaction search engines using molecular structures, molecular fragments and reaction similarity. Bioinformatics, 27(17):2465–2467, 2011. [97] Min-Ling Zhang and Zhi-Hua Zhou. A review on multi-label learning algorithms. IEEE transactions on knowledge and data engineering, 26(8):1819–1837, 2013. [98] Daniel M Lowe, Peter T Corbett, Peter Murray-Rust, and Robert C Glen. Chemical name to structure: Opsin, an open source solution, 2011. [99] Yanli Wang, Jewen Xiao, Tugba O Suzek, Jian Zhang, Jiyao Wang, and Stephen H Bryant. Pubchem: a public information system for analyzing bioactivities of small molecules. Nucleic acids research, 37:W623–W633, 2009. [100] Harry E Pence and Antony Williams. Chemspider: an online chemical information resource, 2010. [101] Nadine Schneider, Daniel M Lowe, Roger A Sayle, and Gregory A Landrum. Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity. Journal of chemical information and modeling, 55(1):39–53, 2015. [102] David Rogers and Mathew Hahn. Extended-connectivity fingerprints. Journal of chemical information and modeling, 50(5):742–754, 2010. [103] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017. 
[104] Alex Kendall, Yarin Gal, and Roberto Cipolla. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7482–7491, 2018. [105] Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the 24th international conference on Machine learning, pages 129–136, 2007. [106] Carl Poelking, Gianni Chessari, Christopher W Murray, Richard J Hall, Lucy Colwell, and Marcel Verdonk. Meaningful machine learning models and machine-learned pharmacophores from fragment screening campaigns. arXiv preprint arXiv:2204.06348, 2022. [107] Michael P Maloney, Connor W Coley, Samuel Genheden, Nessa Carson, Paul Helquist, Per-Ola Norrby, and Olaf Wiest. Negative data in data sets for machine learning training, 2023. [108] Austin Tripp, Krzysztof Maziarz, Sarah Lewis, Marwin Segler, and José Miguel Hernández-Lobato. Retro-fallback: retrosynthetic planning in an uncertain world. arXiv preprint arXiv:2310.09270, 2023. [109] Chong Chen, Weizhi Ma, Min Zhang, Chenyang Wang, Yiqun Liu, and Shaoping Ma. Revisiting negative sampling vs. non-sampling in implicit recommendation. ACM Transactions on Information Systems (TOIS), 2022. [110] Jingtao Ding, Yuhan Quan, Xiangnan He, Yong Li, and Depeng Jin. Reinforced negative sampling for recommendation with exposure data. In IJCAI, pages 2230–2236. Macao, 2019. [111] Hong-Jian Xue, Xinyu Dai, Jianbing Zhang, Shujian Huang, and Jiajun Chen. Deep matrix factorization models for recommender systems. In IJCAI, volume 17, pages 3203–3209. Melbourne, Australia, 2017. [112] Ji Yang, Xinyang Yi, Derek Zhiyuan Cheng, Lichan Hong, Yang Li, Simon Xiaoming Wang, Taibai Xu, and Ed H Chi. Mixed negative sampling for learning two-tower neural networks in recommendations. In Companion Proceedings of the Web Conference 2020, pages 441–447, 2020. 
[113] Thibault Formal, Carlos Lassance, Benjamin Piwowarski, and Stéphane Clinchant. From distillation to hard negative sampling: Making sparse neural ir models more effective. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2353–2359, 2022. [114] Afrina Tabassum, Muntasir Wahed, Hoda Eldardiry, and Ismini Lourentzou. Hard negative sampling strategies for contrastive representation learning. arXiv preprint arXiv:2206.01197, 2022. [115] Gaurav Bhargava, Mohinder P Mahajan, and Takao Saito. Regio-and chemoselective unprecedented imino-diels-alder reactions of 1-substituted unactivated dienes with n-aryl imines-part ii. Synlett, 2008(07):983–986, 2008. [116] Frédéric Pin, Sébastien Comesse, Bernard Garrigues, Štefan Marchalín, and Adam Daïch. Intermolecular and intramolecular α-amidoalkylation reactions using bismuth triflate as the catalyst. The Journal of organic chemistry, 72(4):1181–1191, 2007. [117] Liliana Boiaryna, Mohamed Kamal El Mkaddem, Catherine Taillier, Vincent Dalla, and Mohamed Othman. Dual hard/soft gold catalysis: Intermolecular friedel–crafts‐type α‐amidoalkylation/alkyne hydroarylation sequences by n‐acyliminium ion chemistry. Chemistry–A European Journal, 18(44):14192–14200, 2012. [118] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models. ACM computing surveys (CSUR), 51(5):1–42, 2018. [119] Feiyu Xu, Hans Uszkoreit, Yangzhou Du, Wei Fan, Dongyan Zhao, and Jun Zhu. Explainable ai: A brief survey on history, research areas, approaches and challenges. In CCF international conference on natural language processing and Chinese computing, pages 563–574. Springer, 2019. [120] Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008. [121] Toshifumi Mori and Shigeki Kato. 
Grignard reagents in solution: Theoretical study of the equilibria and the reaction with a carbonyl compound in diethyl ether solvent. The Journal of Physical Chemistry A, 113(21):6158–6165, 2009. [122] Shicheng Shi and Michal Szostak. Efficient synthesis of diaryl ketones by nickel-catalyzed negishi cross‐coupling of amides by carbon–nitrogen bond cleavage at room temperature accelerated by a solvent effect. Chemistry–A European Journal, 22(30):10420–10424, 2016. [123] Sidney W Benson and Jerry H Buss. Additivity rules for the estimation of molecular properties. thermodynamic properties. The Journal of Chemical Physics, 29(3):546–572, 1958. [124] Steven M Bachrach. The group equivalent reaction: An improved method for determining ring strain energy. Journal of Chemical Education, 67(11):907, 1990. [125] Todor Dudev and Carmay Lim. Ring strain energies from ab initio calculations. Journal of the American Chemical Society, 120(18):4450–4458, 1998. [126] Galen T Craven, Nicholas Lubbers, Kipton Barros, and Sergei Tretiak. Machine learning approaches for structural and thermodynamic properties of a lennard-jones fluid. The Journal of Chemical Physics, 153(10), 2020. [127] Shotaro Shiba Funai and Dimitrios Giataganas. Thermodynamics and feature extraction by machine learning. Physical Review Research, 2(3):033415, 2020. [128] Larry A Curtiss, Paul C Redfern, and Krishnan Raghavachari. Gaussian-4 theory. The Journal of chemical physics, 126(8), 2007. [129] Qiyuan Zhao, Nicolae C Iovanac, and Brett M Savoie. Transferable ring corrections for predicting enthalpy of formation of cyclic compounds. Journal of Chemical Information and Modeling, 61(6):2798–2805, 2021. [130] Qiyuan Zhao and Brett M Savoie. Self-consistent component increment theory for predicting enthalpy of formation. Journal of Chemical Information and Modeling, 60(4):2199–2207, 2020. [131] Kehang Han, Adeel Jamal, Colin A Grambow, Zachary J Buras, and William H Green. 
An extended group additivity method for polycyclic thermochemistry estimation. International Journal of Chemical Kinetics, 50(4):294–303, 2018. [132] Eric W Lemmon, Marcia L Huber, Mark O McLinden, et al. Nist standard reference database 23. Reference fluid thermodynamic and transport properties (REFPROP), version, 9, 2010. [133] Matthias Rupp, Alexandre Tkatchenko, Klaus-Robert Müller, and O Anatole Von Lilienfeld. Fast and accurate modeling of molecular atomization energies with machine learning. Physical review letters, 108(5):058301, 2012. [134] Jörg Behler. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. The Journal of chemical physics, 134(7), 2011. [135] Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, and Patrick Riley. Molecular graph convolutions: moving beyond fingerprints. Journal of computer-aided molecular design, 30:595–608, 2016. [136] Hanjun Dai, Bo Dai, and Le Song. Discriminative embeddings of latent variable models for structured data. In International conference on machine learning, pages 2702–2711. PMLR, 2016. [137] Kevin Yang, Kyle Swanson, Wengong Jin, Connor Coley, Philipp Eiden, Hua Gao, Angel Guzman-Perez, Timothy Hopper, Brian Kelley, Miriam Mathea, et al. Analyzing learned molecular representations for property prediction. Journal of chemical information and modeling, 59(8):3370–3388, 2019. [138] Jörg Behler and Michele Parrinello. Generalized neural-network representation of high-dimensional potential-energy surfaces. Physical review letters, 98(14):146401, 2007. [139] Logan Ward, Naveen Dandu, Ben Blaiszik, Badri Narayanan, Rajeev S Assary, Paul C Redfern, Ian Foster, and Larry A Curtiss. Graph-based approaches for predicting solvation energy in multiple solvents: open datasets and machine learning models. The Journal of Physical Chemistry A, 125(27):5990–5998, 2021. [140] Colin A Grambow, Yi-Pei Li, and William H Green. 
Accurate thermochemistry with small data sets: A bond additivity correction and transfer learning approach. The Journal of Physical Chemistry A, 123(27):5826–5835, 2019. [141] Raghunathan Ramakrishnan, Pavlo O Dral, Matthias Rupp, and O Anatole Von Lilienfeld. Quantum chemistry structures and properties of 134 kilo molecules. Scientific data, 1(1):1–7, 2014. [142] Aron J Cohen, Paula Mori-Sánchez, and Weitao Yang. Challenges for density functional theory. Chemical reviews, 112(1):289–320, 2012. [143] Narbe Mardirossian and Martin Head-Gordon. ωb97x-v: A 10-parameter, range-separated hybrid, generalized gradient approximation density functional with nonlocal correlation, designed by a survival-of-the-fittest strategy. Physical Chemistry Chemical Physics, 16(21):9904–9924, 2014. [144] Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008. [145] Martin Wattenberg, Fernanda Viégas, and Ian Johnson. How to use t-sne effectively. Distill, 1(10):e2, 2016. [146] Mengjie Liu, Alon Grinberg Dana, Matthew S Johnson, Mark J Goldman, Agnes Jocher, A Mark Payne, Colin A Grambow, Kehang Han, Nathan W Yee, Emily J Mazeau, et al. Reaction mechanism generator v3.0: advances in automatic mechanism generation. Journal of Chemical Information and Modeling, 61(6):2686–2696, 2021. [147] Kristof T Schütt, Huziel E Sauceda, P-J Kindermans, Alexandre Tkatchenko, and K-R Müller. Schnet–a deep learning architecture for molecules and materials. The Journal of Chemical Physics, 148(24), 2018. [148] Kristof T Schütt, Farhad Arbabzadah, Stefan Chmiela, Klaus R Müller, and Alexandre Tkatchenko. Quantum-chemical insights from deep tensor neural networks. Nature communications, 8(1):13890, 2017. [149] Christian Devereux, Justin S Smith, Kate K Huddleston, Kipton Barros, Roman Zubatyuk, Olexandr Isayev, and Adrian E Roitberg. Extending the applicability of the ani deep learning molecular potential to sulfur and halogens. 
Journal of Chemical Theory and Computation, 16(7):4192–4202, 2020. [150] Justin S Smith, Olexandr Isayev, and Adrian E Roitberg. Ani-1: an extensible neural network potential with dft accuracy at force field computational cost. Chemical science, 8(4):3192–3203, 2017. [151] Justin S Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, and Adrian E Roitberg. Less is more: Sampling chemical space with active learning. The Journal of chemical physics, 148(24), 2018. [152] Peikun Zheng, Wudi Yang, Wei Wu, Olexandr Isayev, and Pavlo O Dral. Toward chemical accuracy in predicting enthalpies of formation with general-purpose data-driven methods. The Journal of Physical Chemistry Letters, 13(15):3479–3491, 2022. [153] Connie W Gao, Joshua W Allen, William H Green, and Richard H West. Reaction mechanism generator: Automatic construction of chemical kinetic mechanisms. Computer Physics Communications, 203:212–225, 2016. [154] Xiangyun Lei and Andrew J Medford. A universal framework for featurization of atomistic systems. The Journal of Physical Chemistry Letters, 13(34):7911–7919, 2022. [155] Raymond C Fort and Paul von R Schleyer. Adamantane: consequences of the diamondoid structure. Chemical Reviews, 64(3):277–300, 1964. [156] Yu Cheng, Yongshun Gong, Yuansheng Liu, Bosheng Song, and Quan Zou. Molecular design in drug discovery: a comprehensive review of deep generative models. Briefings in bioinformatics, 22(6):bbab344, 2021. [157] Jicheng Yi, Guangye Zhang, Han Yu, and He Yan. Advantages, challenges and molecular design of different material types used in organic solar cells. Nature Reviews Materials, 9(1):46–62, 2024. [158] Dylan M Anstine and Olexandr Isayev. Generative models as an emerging paradigm in the chemical sciences. Journal of the American Chemical Society, 145(16):8736–8750, 2023. [159] Qiyuan Yu, Qiuzhen Lin, Junkai Ji, Wei Zhou, Shan He, Zexuan Zhu, and Kay Chen Tan. A survey on evolutionary computation based drug discovery. 
IEEE Transactions on Evolutionary Computation, 2024. [160] Xiangxiang Zeng, Fei Wang, Yuan Luo, Seung-gu Kang, Jian Tang, Felice C Lightstone, Evandro F Fang, Wendy Cornell, Ruth Nussinov, and Feixiong Cheng. Deep generative molecular design reshapes drug discovery. Cell Reports Medicine, 3(12), 2022. [161] Nicola De Cao and Thomas Kipf. Molgan: An implicit generative model for small molecular graphs. arXiv preprint arXiv:1805.11973, 2018. [162] Rafael Gómez-Bombarelli, Jennifer N Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D Hirzel, Ryan P Adams, and Alán Aspuru-Guzik. Automatic chemical design using a data-driven continuous representation of molecules. ACS central science, 4(2):268–276, 2018. [163] Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, and Jian Tang. Graphaf: a flow-based autoregressive model for molecular graph generation. arXiv preprint arXiv:2001.09382, 2020. [164] Debjyoti Bhattacharya, Harrison J Cassady, Michael A Hickner, and Wesley F Reinhart. Large language models as molecular design engines. Journal of Chemical Information and Modeling, 64(18):7086–7096, 2024. [165] Hannes H Loeffler, Jiazhen He, Alessandro Tibo, Jon Paul Janet, Alexey Voronov, Lewis H Mervin, and Ola Engkvist. Reinvent 4: Modern ai–driven generative molecule design. Journal of Cheminformatics, 16(1):20, 2024. [166] David Mendez, Anna Gaulton, A Patrícia Bento, Jon Chambers, Marleen De Veij, Eloy Félix, María Paula Magariños, Juan F Mosquera, Prudence Mutowo, Michał Nowotka, et al. Chembl: towards direct deposition of bioassay data. Nucleic acids research, 47(D1):D930–D940, 2019. [167] Zongli Jiang, Zhenbo Wang, Jinli Zhang, Man Wub, Chen Li, and Yoshihiro Yamanishi. Mode collapse alleviation of reinforcement learning-based gans in drug design. In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 3045–3052. IEEE, 2023. 
[168] K Mukesh, Srisurya Ippatapu Venkata, Spandana Chereddy, E Anbazhagan, and IR Oviya. A variational autoencoder-general adversarial networks (vae-gan) based model for ligand designing. In International Conference on Innovative Computing and Communications: Proceedings of ICICC 2022, Volume 1, pages 761–768. Springer, 2022. [169] Benjamin I Tingle, Khanh G Tang, Mar Castanon, John J Gutierrez, Munkhzul Khurelbaatar, Chinzorig Dandarchuluun, Yurii S Moroz, and John J Irwin. Zinc-22: a free multi-billion-scale database of tangible compounds for ligand discovery. Journal of chemical information and modeling, 63(4):1166–1176, 2023. [170] Jerret Ross, Brian Belgodere, Samuel C Hoffman, Vijil Chenthamarakshan, Youssef Mroueh, and Payel Das. Gp-molformer: A foundation model for molecular generation. arXiv preprint arXiv:2405.04912, 2024. [171] Jie Yue, Bingxin Peng, Yu Chen, Jieyu Jin, Xinda Zhao, Chao Shen, Xiangyang Ji, Chang-Yu Hsieh, Jianfei Song, Tingjun Hou, et al. Unlocking comprehensive molecular design across all scenarios with large language model and unordered chemical language. Chemical Science, 15(34):13727–13740, 2024. [172] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd. Quantum machine learning. Nature, 549(7671):195–202, 2017. [173] Raffaele Santagati, Alan Aspuru-Guzik, Ryan Babbush, Matthias Degroote, Leticia Gonzalez, Elica Kyoseva, Nikolaj Moll, Markus Oppel, Robert M Parrish, Nicholas C Rubin, et al. Drug design on quantum computers. Nature Physics, 20(4):549–557, 2024. [174] Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, Joseph C Bardin, Rami Barends, Rupak Biswas, Sergio Boixo, Fernando GSL Brandao, David A Buell, et al. Quantum supremacy using a programmable superconducting processor. Nature, 574(7779):505–510, 2019. [175] Su Yeon Chang, Supanut Thanasilp, Bertrand Le Saux, Sofia Vallecorsa, and Michele Grossi. Latent style-based quantum gan for high-quality image generation. 
arXiv preprint arXiv:2406.02668, 2024. [176] He-Liang Huang, Yuxuan Du, Ming Gong, Youwei Zhao, Yulin Wu, Chaoyue Wang, Shaowei Li, Futian Liang, Jin Lin, Yu Xu, et al. Experimental quantum generative adversarial networks for image generation. Physical Review Applied, 16(2):024051, 2021. [177] Daniel Silver, Tirthak Patel, William Cutler, Aditya Ranjan, Harshitta Gandhi, and Devesh Tiwari. Mosaiq: Quantum generative adversarial networks for image generation on nisq computers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7030–7039, 2023. [178] Matvei Anoshin, Asel Sagingalieva, Christopher Mansell, Dmitry Zhiganov, Vishal Shete, Markus Pflitsch, and Alexey Melnikov. Hybrid quantum cycle generative adversarial network for small molecule generation. IEEE Transactions on Quantum Engineering, 2024. [179] Po-Yu Kao, Ya-Chu Yang, Wei-Yin Chiang, Jen-Yueh Hsiao, Yudong Cao, Alex Aliper, Feng Ren, Alán Aspuru-Guzik, Alex Zhavoronkov, Min-Hsiu Hsieh, et al. Exploring the advantages of quantum generative adversarial networks in generative chemistry. Journal of Chemical Information and Modeling, 63(11):3307–3318, 2023. [180] Junde Li, Rasit O Topaloglu, and Swaroop Ghosh. Quantum generative models for small molecule drug discovery. IEEE transactions on quantum engineering, 2:1–8, 2021. [181] Jinli Zhang, Zhenbo Wang, Zongli Jiang, Man Wu, Chen Li, and Yoshihiro Yamanishi. Quantitative evaluation of molecular generation performance of graph-based gans. Software Quality Journal, pages 1–29, 2024. [182] Kun Fang, Munan Zhang, Ruqi Shi, and Yinan Li. Dynamic quantum circuit compilation. arXiv preprint arXiv:2310.11021, 2023. [183] Peter I Frazier. A tutorial on bayesian optimization. arXiv preprint arXiv:1807.02811, 2018. [184] Vladimir Prelog and Günter Helmchen. Basic principles of the cip-system and proposals for a revision. Angewandte Chemie International Edition in English, 21(8):567–583, 1982. 
[185] Samuel Yen-Chi Chen, Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Hsi-Sheng Goan. Variational quantum circuits for deep reinforcement learning. IEEE access, 8:141007–141024, 2020. [186] Carl Edward Rasmussen. Gaussian processes in machine learning. In Summer school on machine learning, pages 63–71. Springer, 2003. [187] Ax. adaptive experimentation platform, 2025. [188] Han Zheng, Christopher Kang, Gokul Subramanian Ravi, Hanrui Wang, Kanav Setia, Frederic T Chong, and Junyu Liu. Sncqa: A hardware-efficient equivariant quantum convolutional circuit architecture. In 2023 IEEE International Conference on Quantum Computing and Engineering (QCE), volume 1, pages 236–245. IEEE, 2023. [189] Anbang Wu, Gushu Li, Yuke Wang, Boyuan Feng, Yufei Ding, and Yuan Xie. Towards efficient ansatz architecture for variational quantum algorithms. arXiv preprint arXiv:2111.13730, 2021. [190] Fei Hua, Yuwei Jin, Yanhao Chen, Suhas Vittal, Kevin Krsulich, Lev S Bishop, John Lapeyre, Ali Javadi-Abhari, and Eddy Z Zhang. Exploiting qubit reuse through mid-circuit measurement and reset. arXiv preprint arXiv:2211.01925, 2022. [191] Kazuma Kaitoh and Yoshihiro Yamanishi. Scaffold-retained structure generator to exhaustively create molecules in an arbitrary chemical space. Journal of Chemical Information and Modeling, 62(9):2212–2225, 2022. [192] Maxime Langevin, Hervé Minoux, Maximilien Levesque, and Marc Bianciotto. Scaffold-constrained molecular generation. Journal of Chemical Information and Modeling, 60(12):5637–5646, 2020. [193] Chen Li and Yoshihiro Yamanishi. Spotgan: A reverse-transformer gan generates scaffold-constrained molecules with property optimization. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 323–338. Springer, 2023. [194] Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature machine intelligence, 1(5):206–215, 2019. 
[195] Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, et al. Molecular sets (moses): a benchmarking platform for molecular generation models. Frontiers in pharmacology, 11:565644, 2020. [196] Mario Krenn, Florian Häse, AkshatKumar Nigam, Pascal Friederich, and Alan Aspuru-Guzik. Self-referencing embedded strings (selfies): A 100% robust molecular string representation. Machine Learning: Science and Technology, 1(4):045024, 2020. [197] Chengxi Zang and Fei Wang. Moflow: an invertible flow model for generating molecular graphs. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, pages 617–626, 2020. [198] Maria Schuld, Alex Bocharov, Krysta M Svore, and Nathan Wiebe. Circuit-centric quantum classifiers. Physical Review A, 101(3):032308, 2020. [199] Ibm quantum platform, 2025. [200] John Preskill. Quantum computing in the nisq era and beyond. Quantum, 2:79, 2018. [201] Marcello Benedetti, Erika Lloyd, Stefan Sack, and Mattia Fiorentini. Parameterized quantum circuits as machine learning models. Quantum Science and Technology, 4(4):043001, 2019. [202] Marco Cerezo, Guillaume Verdon, Hsin-Yuan Huang, Lukasz Cincio, and Patrick J Coles. Challenges and opportunities in quantum machine learning. Nature Computational Science, 2(9):567–576, 2022. [203] Dongdong Chen, Yu Cheng, Lele Shi, Xueting Gao, Yuhang Huang, and Zhenting Du. Design, synthesis, and antimicrobial activity of amide derivatives containing cyclopropane. Molecules, 29(17):4124, 2024. [204] Shikha Kumari, Angelica V Carmona, Amit K Tiwari, and Paul C Trippier. Amide bond bioisosteres: Strategies, synthesis, and successes. Journal of medicinal chemistry, 63(21):12290–12358, 2020. [205] Arun K Ghosh and Margherita Brindisi. Urea derivatives in modern drug discovery and medicinal chemistry. 
Journal of medicinal chemistry, 63(6):2751–2788, 2019. [206] Bowen Li, Zhen Wang, Ziqi Liu, Yanxin Tao, Chulin Sha, Min He, and Xiaolin Li. Drugmetric: quantitative drug-likeness scoring based on chemical space distance. Briefings in Bioinformatics, 25(4):bbae321, 2024. [207] Seokho Kang and Kyunghyun Cho. Conditional molecular design with deep generative models. Journal of chemical information and modeling, 59(1):43–52, 2018. [208] Minjian Yang, Hanyu Sun, Xue Liu, Xi Xue, Yafeng Deng, and Xiaojian Wang. Cmgn: a conditional molecular generation net to design target-specific molecules with desired properties. Briefings in bioinformatics, 24(4):bbad185, 2023. [209] Myeonghun Lee and Kyoungmin Min. Mgcvae: multi-objective inverse design via molecular graph conditional variational autoencoder. Journal of chemical information and modeling, 62(12):2943–2950, 2022. [210] Robert Pollice, Gabriel dos Passos Gomes, Matteo Aldeghi, Riley J Hickman, Mario Krenn, Cyrille Lavigne, Michael Lindner-D'Addario, AkshatKumar Nigam, Cher Tian Ser, Zhenpeng Yao, et al. Data-driven strategies for accelerated materials design. Accounts of Chemical Research, 54(4):849–860, 2021. [211] Benjamin Sanchez-Lengeling and Alán Aspuru-Guzik. Inverse molecular design using machine learning: Generative models for matter engineering. Science, 361(6400):360–365, 2018. [212] Hendrik NJ Schifferstein and Lisa Wastiels. Sensing materials: Exploring the building blocks for experiential design. In Materials experience, pages 15–26. Elsevier, 2014. [213] Amy C Anderson. The process of structure-based drug design. Chemistry & biology, 10(9):787–797, 2003. [214] Xiaolong Zhang, Si-Xuan Guo, Karl A Gandionco, Alan M Bond, and Jie Zhang. Electrocatalytic carbon dioxide reduction: From fundamental principles to catalyst design. Materials Today Advances, 7:100074, 2020. [215] Jens Kehlet Nørskov, Thomas Bligaard, Jan Rossmeisl, and Claus Hviid Christensen. Towards the computational design of solid catalysts. 
Nature chemistry, 1(1):37–46, 2009. [216] Yuqiu Chen, Georgios M Kontogeorgis, and John M Woodley. Group contribution based estimation method for properties of ionic liquids. Industrial & Engineering Chemistry Research, 58(10):4277–4292, 2019. [217] Rafiqul Gani, Bjarne Nielsen, and Aage Fredenslund. A group contribution approach to computer-aided molecular design. AIChE Journal, 37(9):1318–1332, 1991. [218] Heiko Struebing, Zara Ganase, Panagiotis G Karamertzanis, Eirini Siougkrou, Peter Haycock, Patrick M Piccione, Alan Armstrong, Amparo Galindo, and Claire S Adjiman. Computer-aided molecular design of solvents for accelerated reaction kinetics. Nature chemistry, 5(11):952–957, 2013. [219] Abdulelah S Alshehri, Rafiqul Gani, and Fengqi You. Deep learning and knowledge-based methods for computer-aided molecular design—toward a unified approach: State-of-the-art and future directions. Computers & Chemical Engineering, 141:107005, 2020. [220] Oliver Wieder, Stefan Kohlbacher, Mélaine Kuenemann, Arthur Garon, Pierre Ducrot, Thomas Seidel, and Thierry Langer. A compact review of molecular property prediction with graph neural networks. Drug Discovery Today: Technologies, 37:1–12, 2020. [221] Vikas Garg. Generative ai for graph-based drug design: Recent advances and the way forward. Current Opinion in Structural Biology, 84:102769, 2024. [222] Jiacheng Xiong, Zhaoping Xiong, Kaixian Chen, Hualiang Jiang, and Mingyue Zheng. Graph neural networks for automated de novo drug design. Drug discovery today, 26(6):1382–1393, 2021. [223] Gabriele Corso, Hannes Stark, Stefanie Jegelka, Tommi Jaakkola, and Regina Barzilay. Graph neural networks. Nature Reviews Methods Primers, 4(1):17, 2024. [224] Austin H Cheng, Andy Cai, Santiago Miret, Gustavo Malkomes, Mariano Phielipp, and Alán Aspuru-Guzik. Group selfies: a robust fragment-based molecular string representation. Digital Discovery, 2(3):748–758, 2023. [225] Wengong Jin, Regina Barzilay, and Tommi Jaakkola. 
Junction tree variational autoencoder for molecular graph generation. In International conference on machine learning, pages 2323–2332. PMLR, 2018. [226] Karl Grantham, Muhetaer Mukaidaisi, Hsu Kiang Ooi, Mohammad Sajjad Ghaemi, Alain Tchagang, and Yifeng Li. Deep evolutionary learning for molecular design. IEEE Computational Intelligence Magazine, 17(2):14–28, 2022. [227] Yuki Matsukiyo, Chikashige Yamanaka, and Yoshihiro Yamanishi. De novo generation of chemical structures of inhibitor and activator candidates for therapeutic target proteins by a transformer-based variational autoencoder and bayesian optimization. Journal of Chemical Information and Modeling, 64(7):2345–2355, 2023. [228] Hiroaki Iwata, Taichi Nakai, Takuto Koyama, Shigeyuki Matsumoto, Ryosuke Kojima, and Yasushi Okuno. Vgae-mcts: A new molecular generative model combining the variational graph auto-encoder and monte carlo tree search. Journal of Chemical Information and Modeling, 63(23):7392–7400, 2023. [229] Jeff Guo and Philippe Schwaller. Augmented memory: Sample-efficient generative molecular design with reinforcement learning. Jacs Au, 2024. [230] Jan H Jensen. A graph-based genetic algorithm and generative model/monte carlo tree search for the exploration of chemical space. Chemical science, 10(12):3567–3572, 2019. [231] Jonas Verhellen. Graph-based molecular pareto optimisation. Chemical Science, 13(25):7526–7535, 2022. [232] Domenico Alberga, Giuseppe Lamanna, Giovanni Graziano, Pietro Delre, Maria Cristina Lomuscio, Nicola Corriero, Alessia Ligresti, Dritan Siliqi, Michele Saviano, Marialessandra Contino, et al. Dela-drugself: Empowering multiobjective de novo design through selfies molecular representation. Computers in Biology and Medicine, 175:108486, 2024. [233] Naruki Yoshikawa, Kei Terayama, Masato Sumita, Teruki Homma, Kenta Oono, and Koji Tsuda. Population-based de novo molecule generation, using grammatical evolution. Chemistry Letters, 47(11):1431–1434, 2018. 
[234] Ana L Teixeira and Andre O Falcao. Structural similarity based kriging for quantitative structure activity and property relationship modeling. Journal of chemical information and modeling, 54(7):1833–1849, 2014. [235] Mengrui Jiang, Giulia Pedrielli, and Szu Hui Ng. Gaussian processes for high-dimensional, large data sets: A review. In 2022 Winter Simulation Conference (WSC), pages 49–60. IEEE, 2022. [236] Volker L Deringer, Albert P Bartók, Noam Bernstein, David M Wilkins, Michele Ceriotti, and Gábor Csányi. Gaussian process regression for materials and molecules. Chemical Reviews, 121(16):10073–10141, 2021. [237] Henry B Moss and Ryan-Rhys Griffiths. Gaussian process molecule property prediction with flowmo. arXiv preprint arXiv:2010.01118, 2020. [238] Ksenia Korovina, Sailun Xu, Kirthevasan Kandasamy, Willie Neiswanger, Barnabas Poczos, Jeff Schneider, and Eric Xing. Chembo: Bayesian optimization of small organic molecules with synthesizable recommendations. In International Conference on Artificial Intelligence and Statistics, pages 3393–3403. PMLR, 2020. [239] Christopher KI Williams and Carl Edward Rasmussen. Gaussian processes for machine learning, volume 2. MIT press Cambridge, MA, 2006. [240] Jenna C Fromer, David E Graff, and Connor W Coley. Pareto optimization to accelerate multi-objective virtual screening. Digital Discovery, 3(3):467–481, 2024. [241] Derek van Tilborg and Francesca Grisoni. Traversing chemical space with active deep learning for low-data drug discovery. Nature Computational Science, 4(10):786–796, 2024. [242] Gabriele Scalia, Colin A Grambow, Barbara Pernici, Yi-Pei Li, and William H Green. Evaluating scalable uncertainty estimation methods for deep learning-based molecular property prediction. Journal of chemical information and modeling, 60(6):2697–2717, 2020. [243] Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, U Rajendra Acharya, et al. 
A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information fusion, 76:243–297, 2021. [244] Lior Hirschfeld, Kyle Swanson, Kevin Yang, Regina Barzilay, and Connor W Coley. Uncertainty quantification using neural networks for molecular property prediction. Journal of Chemical Information and Modeling, 60(8):3770–3780, 2020. [245] Chu-I Yang and Yi-Pei Li. Explainable uncertainty quantifications for deep learning-based molecular property prediction. Journal of Cheminformatics, 15(1):13, 2023. [246] Daniel C Elton, Zois Boukouvalas, Mark D Fuge, and Peter W Chung. Deep learning for molecular design—a review of the state of the art. Molecular Systems Design & Engineering, 4(4):828–849, 2019. [247] Nathan Brown, Marco Fiscato, Marwin HS Segler, and Alain C Vaucher. Guacamol: benchmarking models for de novo molecular design. Journal of chemical information and modeling, 59(3):1096–1108, 2019. [248] AkshatKumar Nigam, Robert Pollice, Gary Tom, Kjell Jorner, John Willes, Luca Thiede, Anshul Kundaje, and Alán Aspuru-Guzik. Tartarus: A benchmarking platform for realistic and practical inverse molecular design. Advances in Neural Information Processing Systems, 36:3263–3306, 2023. [249] Esther Heid, Kevin P Greenman, Yunsie Chung, Shih-Cheng Li, David E Graff, Florence H Vermeire, Haoyang Wu, William H Green, and Charles J McGill. Chemprop: a machine learning package for chemical property prediction. Journal of Chemical Information and Modeling, 64(1):9–17, 2023. [250] Seongok Ryu, Yongchan Kwon, and Woo Youn Kim. A bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification. Chemical science, 10(36):8438–8446, 2019. [251] Megan A Lim, Song Yang, Huanghao Mai, and Alan C Cheng. Exploring deep learning of quantum chemical properties for absorption, distribution, metabolism, and excretion predictions. Journal of Chemical Information and Modeling, 62(24):6336–6341, 2022. 
[252] Gary Liu, Denise B Catacutan, Khushi Rathod, Kyle Swanson, Wengong Jin, Jody C Mohammed, Anush Chiappino-Pepe, Saad A Syed, Meghan Fragis, Kenneth Rachwalski, et al. Deep learning-guided discovery of an antibiotic targeting acinetobacter baumannii. Nature Chemical Biology, 19(11):1342–1350, 2023. [253] Yao Zhang et al. Bayesian semi-supervised learning for uncertainty-calibrated prediction of molecular properties and active learning. Chemical science, 10(35):8154–8163, 2019. [254] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning, pages 1050–1059. PMLR, 2016. [255] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in neural information processing systems, 30, 2017. [256] David A Nix and Andreas S Weigend. Estimating the mean and variance of the target probability distribution. In Proceedings of 1994 ieee international conference on neural networks (ICNN’94), volume 1, pages 55–60. IEEE, 1994. [257] Ava P Soleimany, Alexander Amini, Samuel Goldman, Daniela Rus, Sangeeta N Bhatia, and Connor W Coley. Evidential deep learning for guided molecular property prediction and discovery. ACS central science, 7(8):1356–1367, 2021. [258] Murat Sensoy, Lance Kaplan, and Melih Kandemir. Evidential deep learning to quantify classification uncertainty. Advances in neural information processing systems, 31, 2018. [259] Eyke Hüllermeier and Willem Waegeman. Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods. Machine learning, 110(3):457–506, 2021. [260] Annu Lambora, Kunal Gupta, and Kriti Chopra. Genetic algorithm-a literature review. In 2019 international conference on machine learning, big data, cloud and parallel computing (COMITCon), pages 380–384. IEEE, 2019. 
[261] AkshatKumar Nigam, Robert Pollice, and Alán Aspuru-Guzik. Parallel tempered genetic algorithm guided by deep neural networks for inverse molecular design. Digital Discovery, 1(4):390–404, 2022. [262] David Weininger. Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules. Journal of chemical information and computer sciences, 28(1):31–36, 1988. [263] Mario Krenn, Qianxiang Ai, Senja Barthel, Nessa Carson, Angelo Frei, Nathan C Frey, Pascal Friederich, Théophile Gaudin, Alberto Alexander Gayle, Kevin Maik Jablonka, et al. Selfies and the future of molecular string representations. Patterns, 3(10), 2022. [264] Stefan Grimme. Exploration of chemical compound, conformer, and reaction space with meta-dynamics simulations based on tight-binding quantum chemical calculations. Journal of chemical theory and computation, 15(5):2847–2862, 2019. [265] Christoph Bannwarth, Sebastian Ehlert, and Stefan Grimme. Gfn2-xtb—an accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions. Journal of chemical theory and computation, 15(3):1652–1671, 2019. [266] Stefan Grimme, Christoph Bannwarth, and Philip Shushkov. A robust and accurate tight-binding quantum chemical method for structures, vibrational frequencies, and noncovalent interactions of large molecular systems parametrized for all spd-block elements (z= 1–86). Journal of chemical theory and computation, 13(5):1989–2009, 2017. [267] Axel D Becke. Density-functional exchange-energy approximation with correct asymptotic behavior. Physical review A, 38(6):3098, 1988. [268] Amr Alhossary, Stephanus Daniel Handoko, Yuguang Mu, and Chee-Keong Kwoh. Fast, accurate, and reliable molecular docking with quickvina 2. Bioinformatics, 31(13):2214–2216, 2015. [269] Oleg Trott and Arthur J Olson. 
Autodock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry, 31(2):455–461, 2010. [270] Thomas A Halgren. Merck molecular force field. i. basis, form, scope, parameterization, and performance of mmff94. Journal of computational chemistry, 17(5-6):490–519, 1996. [271] Frank Jensen. Locating minima on seams of intersecting potential energy surfaces. an application to transition structure modeling. Journal of the American Chemical Society, 114(5):1596–1603, 1992. [272] Ronald Percy Bell. The theory of reactions involving proton transfers. Proceedings of the Royal Society of London. Series A-Mathematical and Physical Sciences, 154(882):414–429, 1936. [273] MG Evans and M Polanyi. Further considerations on the thermodynamics of chemical equilibria and reaction rates. Transactions of the Faraday Society, 32:1333–1360, 1936. [274] Jordan E Crivelli-Decker, Zane Beckwith, Gary Tom, Ly Le, Sheenam Khuttan, Romelia Salomon-Ferrer, Jackson Beall, Rafael Gómez-Bombarelli, and Andrea Bortolato. Machine learning guided aqfep: a fast and efficient absolute free energy perturbation solution for virtual screening. Journal of Chemical Theory and Computation, 20(16):7188–7198, 2024. [275] Xinyao Xu, Wenlin Zhao, Liquan Wang, Jiaping Lin, and Lei Du. Efficient exploration of compositional space for high-performance copolymers via bayesian optimization. Chemical Science, 14(37):10203–10211, 2023. [276] Rinta Kawagoe, Tatsuhito Ando, Nobuyuki N Matsuzawa, Hiroyuki Maeshima, and Hiromasa Kaneko. Exploring molecular descriptors and acquisition functions in bayesian optimization for designing molecules with low hole reorganization energy. ACS omega, 9(49):48844–48854, 2024. [277] Chuan-Nan Li, Han-Pu Liang, Xie Zhang, Zijing Lin, and Su-Huai Wei. Graph deep learning accelerated efficient crystal structure search and feature extraction. npj Computational Materials, 9(1):176, 2023. 
[278] Lavi S Bigman and Yaakov Levy. Proteins: molecules defined by their trade-offs. Current opinion in structural biology, 60:50–56, 2020. [279] Yi-Hong Chen, Li-Yen Lin, Chih-Wei Lu, Francis Lin, Zheng-Yu Huang, Hao-Wu Lin, Po-Han Wang, Yi-Hung Liu, Ken-Tsung Wong, Jianguo Wen, et al. Vacuum-deposited small-molecule organic solar cells with high power conversion efficiencies by judicious molecular design and device optimization. Journal of the American Chemical Society, 134(33):13616–13623, 2012. [280] Qi Zhang, Abhishek Khetan, Elif Sorkun, and Süleyman Er. Discovery of azaaromatic anolytes for aqueous redox flow batteries via high-throughput screening. Journal of Materials Chemistry A, 10(41):22214–22227, 2022. [281] Kaisa Miettinen. Nonlinear multiobjective optimization, volume 12. Springer Science & Business Media, 1999. [282] Fredrik K Gustafsson, Martin Danelljan, and Thomas B Schon. Evaluating scalable bayesian deep learning methods for robust computer vision. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pages 318–319, 2020. [283] Esther Heid, Charles J McGill, Florence H Vermeire, and William H Green. Characterizing uncertainty in machine learning for chemistry. Journal of Chemical Information and Modeling, 63(13):4012–4029, 2023. [284] Max-Heinrich Laves, Sontje Ihler, Jacob F Fast, Lüder A Kahrs, and Tobias Ortmaier. Recalibration of aleatoric and epistemic regression uncertainty in medical imaging. arXiv preprint arXiv:2104.12376, 2021. [285] Shengli Jiang, Shiyi Qin, Reid C Van Lehn, Prasanna Balaprakash, and Victor M Zavala. Uncertainty quantification for molecular property predictions with graph neural architecture search. Digital Discovery, 3(8):1534–1553, 2024. [286] Kexin Huang, Ying Jin, Emmanuel Candes, and Jure Leskovec. Uncertainty quantification over graph with conformalized graph neural networks. Advances in Neural Information Processing Systems, 36, 2024. 
[287] Yaniv Romano, Evan Patterson, and Emmanuel Candes. Conformalized quantile regression. Advances in neural information processing systems, 32, 2019. [288] Tatsuhito Ando, Naoto Shimizu, Norihisa Yamamoto, Nobuyuki N Matsuzawa, Hiroyuki Maeshima, and Hiromasa Kaneko. Design of molecules with low hole and electron reorganization energy using dft calculations and bayesian optimization. The Journal of Physical Chemistry A, 126(36):6336–6347, 2022. [289] Riley J Hickman, Matteo Aldeghi, Florian Häse, and Alán Aspuru-Guzik. Bayesian optimization with known experimental and design constraints for chemistry applications. Digital Discovery, 1(5):732–744, 2022. [290] Anirudh MK Nambiar, Christopher P Breen, Travis Hart, Timothy Kulesza, Timothy F Jamison, and Klavs F Jensen. Bayesian optimization of computer-proposed multistep synthetic routes on an automated robotic flow platform. ACS Central Science, 8(6):825–836, 2022. [291] Benjamin J Shields, Jason Stevens, Jun Li, Marvin Parasram, Farhan Damani, Jesus I Martinez Alvarado, Jacob M Janey, Ryan P Adams, and Abigail G Doyle. Bayesian reaction optimization as a tool for chemical synthesis. Nature, 590(7844):89–96, 2021. [292] Lung-Yi Chen, Ting-Wei Hsu, Tsai-Chen Hsiung, and Yi-Pei Li. Deep learning-based increment theory for formation enthalpy predictions. The Journal of Physical Chemistry A, 126(41):7548–7556, 2022. [293] David E Graff, Eugene I Shakhnovich, and Connor W Coley. Accelerating high-throughput virtual screening through molecular pool-based active learning. Chemical science, 12(22):7866–7881, 2021. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97537 | - |
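| dc.description.abstract | With the rapid development of artificial intelligence, chemical synthesis and molecular design are undergoing unprecedented change. This dissertation examines the integrated application of artificial intelligence, machine learning, and quantum computing in computational chemistry, and proposes systematic solutions to core challenges in key areas such as drug discovery, materials science, catalyst design, and green chemistry.
As the cornerstone of the overall research framework, this work first develops AutoTemplate, a systematic data-cleaning package that addresses problems common in chemical reaction databases, including mislabeled reaction mechanisms, missing reactants, and erroneous records. The package not only corrects data-quality defects but also provides a reliable data foundation for all subsequent machine learning models. Data retention rates range from 62.3% to 98.2% depending on reaction type, which supports the accuracy and reliability of downstream chemical modeling. Building on this high-quality data, the work then proposes a two-stage deep learning framework for reaction condition design. The method first uses a multi-task neural network to generate plausible reagents and solvents, then applies ranking optimization techniques to order complete reaction-condition combinations by expected yield. In tests covering ten important reaction types, including Buchwald-Hartwig coupling and Suzuki reactions, the model reaches a Top-10 accuracy of 73%. Because it can quickly propose and screen candidate condition combinations, it is expected to lower experimental trial-and-error costs in the pharmaceutical industry and in fine chemical production, and to speed up laboratory process development. To support rapid screening in molecular design, this work develops a graph neural network-based additive theory for predicting molecular thermochemical properties, in particular the enthalpy of formation. Using the atomic fingerprint features proposed here together with a transfer learning strategy that combines CCSD(T)-F12a and NIST experimental data, the model achieves a mean absolute error of 2.04 kcal/mol for the enthalpy of formation, substantially better than traditional group contribution methods. The model offers an accurate property-evaluation tool for energy materials design, screening of new drug candidates, and development of environmentally friendly solvents, and it can supplement expensive and time-consuming quantum mechanical calculations. For generative models in molecular design, this work proposes the Quantum Molecular Generator (QMG), which uses dynamic quantum circuits and chemistry-inspired variational quantum circuit designs to explore the chemical space of molecules with nine heavy atoms. The system uses only 134 trainable parameters yet generates chemically valid and diverse molecular structures, with the generative-model metrics Validity and Uniqueness both exceeding 90%. The technique is particularly suited to exploring chemical space that is hard to reach with conventional methods; it potentially opens new possibilities for anticancer drug design, discovery of new catalysts, and development of carbon-capture materials, and it illustrates the potential of quantum computing in molecular design. Integrating all of the above, the dissertation finally develops an uncertainty-aware molecular optimization strategy that combines graph neural network prediction, genetic algorithm search, and generative modeling, and introduces the probabilistic improvement optimization (PIO) method. In a comprehensive evaluation on 19 molecular design tasks from the GuacaMol and Tartarus platforms, the integrated approach outperforms conventional optimization methods on 8 of 10 single-objective tasks and on 5 of 6 multi-objective tasks, markedly improving the robustness of both single-objective and multi-objective molecular design. The results apply to a range of scenarios, including organic light-emitting materials, protein ligands, reaction substrate design, and drug design. Overall, the dissertation establishes a complete technical chain running from the data foundation through reaction condition prediction, molecular property evaluation, and quantum generative exploration to uncertainty-aware optimization, forming a mutually supporting and progressively layered body of research. These results demonstrate the application potential of artificial intelligence, machine learning, and quantum computing in chemical research: they provide powerful tools for new drug development in the pharmaceutical industry, functional materials design in materials science, and sustainable process optimization in green chemistry, and they outline a clear path for computational chemistry toward automation and intelligent workflows. The integrated application of these technologies will continue to push chemical science toward greater efficiency, precision, and sustainability. | zh_TW |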
| dc.description.abstract | With the rapid advancement of artificial intelligence technologies, the fields of chemical synthesis and molecular design are experiencing unprecedented transformation. This dissertation investigates the integrated application of artificial intelligence, machine learning, and quantum computing in computational chemistry, proposing systematic solutions to core challenges in key areas including drug discovery, materials science, catalyst design, and green chemistry.
As the cornerstone of the research framework, this study first developed AutoTemplate, a systematic data-cleaning suite that addresses common issues in chemical reaction databases such as incorrect reaction mechanism annotations, missing reactants, and erroneous records. The suite not only rectifies data quality defects but also establishes a reliable data foundation for subsequent machine learning model development. Data retention rates across different reaction types range from 62.3% to 98.2%, ensuring the accuracy and reliability of downstream chemical application modeling.
Building upon this high-quality data foundation, this research further proposes a two-stage deep learning framework for reaction condition design. The approach first generates potentially viable reagents and solvents through multi-task neural networks, then employs ranking optimization techniques to prioritize complete reaction condition combinations by expected yield. In tests on ten important reaction types, including Buchwald-Hartwig coupling and Suzuki reactions, the model achieved 73% Top-10 accuracy. It can rapidly propose and screen candidate reaction condition combinations, promising to reduce experimental trial-and-error costs in the pharmaceutical industry and fine chemical production while accelerating laboratory reaction process development.
To support rapid screening in molecular design, this study developed a graph neural network-based additive theory for predicting molecular thermochemical properties, particularly enthalpies of formation. Through the proposed atomic fingerprint feature technique and a transfer learning strategy combining CCSD(T)-F12a and NIST experimental data, the model achieved a mean absolute error of 2.04 kcal/mol in enthalpy of formation prediction, substantially surpassing the predictive accuracy of traditional group contribution methods. This approach provides an accurate property assessment tool for energy materials design, new drug candidate screening, and environmentally friendly solvent development, while complementing expensive and time-consuming quantum mechanical calculations.
In the generative modeling component of molecular design, this research proposes the Quantum Molecular Generator (QMG), which uses dynamic quantum circuits and a chemistry-inspired variational quantum circuit design to explore the chemical space of nine heavy atoms. The system generates chemically valid and diverse molecular structures using only 134 trainable parameters, with the Validity and Uniqueness evaluation metrics both exceeding 90%. This technology is particularly suitable for exploring chemical spaces that are difficult to access through traditional methods, potentially opening new possibilities for anticancer drug design, novel catalyst discovery, and carbon capture material development, and demonstrating the potential of quantum computing in molecular design.
Integrating the aforementioned results, this dissertation ultimately develops an uncertainty-aware molecular optimization strategy that combines graph neural network predictive capabilities, genetic algorithm search strategies, and generative model approaches, introducing the Probability of Improvement Optimization (PIO) method. In comprehensive testing across 19 molecular design tasks covering the GuacaMol and Tartarus molecular design platforms, this integrated approach outperformed traditional optimization methods in 8 out of 10 single-objective optimizations and 5 out of 6 multi-objective optimizations, significantly enhancing the robustness of both single-objective and multi-objective molecular design. These results can be applied to various scenarios, including organic light-emitting materials, protein ligands, reaction substrate design, and drug design.
Overall, this dissertation establishes a complete technological chain spanning data foundation, reaction condition prediction, molecular property assessment, quantum generative exploration, and uncertainty-aware optimization, forming a mutually supportive and progressively advancing research system. These results demonstrate the application potential of artificial intelligence, machine learning, and quantum computing in chemical research, not only providing powerful tools for new drug development in the pharmaceutical industry, functional materials design in materials science, and sustainable process optimization in green chemistry, but also outlining a clear blueprint for the development of computational chemistry toward automation and intelligence. The integrated application of these technologies will continue to drive chemical science toward greater efficiency, accuracy, and sustainability. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-07-02T16:21:26Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-07-02T16:21:26Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
Acknowledgements iii
摘要 vii
摘要 xi
Contents xv
List of Figures xxi
List of Tables xxxv
Chapter 1 Introduction 1
1.1 Goals 1
1.2 Thesis Organization 1
Chapter 2 Data Curation for Machine Learning in Chemical Synthesis 7
2.1 Introduction 7
2.2 Related works 12
2.3 Methods 13
2.3.1 Generic template extraction 14
2.3.1.1 Reaction data collection 14
2.3.1.2 Atom-to-atom mapping 15
2.3.1.3 Generic template definition and extraction 16
2.3.1.4 Template canonicalization 18
2.3.1.5 Removal of rare templates 20
2.3.2 Template-guided curation 21
2.3.2.1 Template application procedure 21
2.3.2.2 Append atomic chirality and bond stereochemistry 21
2.3.2.3 Reaction balance with addition of by-products 22
2.4 Results and Discussion 23
2.4.1 Analysis of overall results 23
2.4.2 Examples of correcting atom-mapping errors 25
2.4.3 Examples of addressing missing reactant errors 27
2.4.4 Examples of identifying and resolving erroneous reactions 29
2.5 Conclusions 32
Chapter 3 Machine Learning for Reaction Condition Prediction 35
3.1 Introduction 35
3.2 Related works 37
3.2.1 Reaction Representations for Modeling Reactions 38
3.3 Methods 40
3.3.1 Data preprocessing 40
3.3.2 Model Setup 42
3.3.3 Data Augmentation via Hard Negative Sampling 45
3.3.4 Evaluation metrics 47
3.4 Results and discussion 49
3.4.1 Threshold optimization for candidate generation model 49
3.4.2 Performance of the two-stage model 51
3.4.3 Evaluation of Recommended Reaction Conditions 53
3.4.4 Unsupervised Reaction Classification via Learned Representations 56
3.5 Conclusions 58
Chapter 4 Graph Neural Networks for Molecular Property Prediction 61
4.1 Introduction 61
4.2 Methods 64
4.2.1 Model Architecture 64
4.2.2 Molecular Fingerprint Representations 64
4.2.3 Atomic Fingerprint Representations 65
4.2.4 Hybrid Fingerprint Representations 66
4.2.4.1 Hybrid Fingerprint (D0) 66
4.2.4.2 Hybrid Fingerprint (D1) 67
4.2.5 Summary of Model Variants 67
4.3 Data Preparation and Transfer Learning 68
4.4 Results and discussions 70
4.4.1 Comparison of atomic, molecular, and hybrid fingerprint models 70
4.4.2 Comparison to BGIT and TCIT methods 74
4.4.3 Visualization of Atomic Contributions 78
4.5 Conclusions 80
Chapter 5 Quantum Computing-Based Molecular Generator 83
5.1 Introduction 83
5.2 Methods 86
5.2.1 Post-processing of 2D molecular graphs with stereochemical information 86
5.2.2 Bayesian optimization of QMG parameters 87
5.3 Results and discussion 88
5.3.1 Design and implementation of the quantum dynamic circuit ansatz 90
5.3.2 Scaffold-retained generation for molecular design 93
5.3.3 Benchmarking molecular generation performance among classical, hybrid, and quantum models 94
5.3.4 Scaffold-retained generation of molecules with enhanced quantitative estimate of drug-likeness (QED) scores 97
5.4 Molecular generation conditionally on molecular properties 101
5.5 Conclusions 104
Chapter 6 Uncertainty-Aware Optimization for Efficient Molecular Design 107
6.1 Introduction 107
6.2 Methods 112
6.2.1 Surrogate models and uncertainty quantification 112
6.2.2 Genetic algorithm for molecular optimization 116
6.2.3 Computational details 116
6.3 Results and discussions 118
6.3.1 Molecular Design Benchmarks 118
6.3.2 Uncertainty-aware and uncertainty-agnostic fitness functions 120
6.3.3 Surrogate model and UQ performance 124
6.3.4 Optimization results of single-objective task 126
6.4 Optimization Results of Multi-Objective Tasks 134
6.5 Conclusions 139
Chapter 7 Conclusions 143
7.1 Scientific contributions 143
7.2 Current Limitations and Challenges 145
7.2.1 Data Quality and Availability Challenges 146
7.2.2 Scalability and Generalization Limitations 146
7.2.3 Quantum Computing Implementation Challenges 147
7.2.4 Uncertainty Quantification and Model Reliability 148
7.3 Future Research Directions and Opportunities 149
7.3.1 Integration of Multi-Scale Modeling Approaches 149
7.3.2 Advanced Quantum Computing Applications 149
7.3.3 Automated Experimental Integration and Closed-Loop Discovery 150
7.3.4 Interpretable AI and Chemical Understanding 151
7.3.5 Sustainable Chemistry and Green AI 151
7.3.6 Cross-Disciplinary Integration and Emerging Applications 152
7.4 Final Reflections 152
References 155
Appendix A: Data curation for machine learning in chemical synthesis 197
A.1 Data Availability 197
A.2 Analysis of generic and aromaticity-informed templates 198
A.3 The impact of dataset size on data curation outcomes 202
A.4 Recovery of by-products 202
A.5 Challenges of competing templates in restoring missing reactants 204
Appendix B: Machine Learning for Reaction Condition Prediction 207
B.1 Data preparation and code availability 207
B.2 Analysis of reagent and solvent distributions 207
B.3 Examination of reaction temperatures and yields 208
B.4 Hyperparameter tuning 211
Appendix C: Graph Neural Networks for Molecular Property Prediction 213
C.1 Feature selection 213
C.2 Directed message passing neural network (D-MPNN) 213
C.3 Hyperparameter settings 215
C.4 Extrapolation to larger molecules 218
Appendix D: Quantum-based molecular generator (QMG) 219
D.1 Pseudocode for Constructing Quantum-based Molecular Generator (QMG) 219
D.2 Post-processing of Atomic Chirality 221
D.3 Post-processing of Bond Stereoisomerism 222
Appendix E: Uncertainty-Aware Optimization for Efficient Molecular Design 227
E.1 Assessing aleatoric uncertainty in tartarus datasets 227
E.2 Penalty settings in molecular design 228
E.3 Residual error correlation in multi-objective tasks 230 | - |
| dc.language.iso | en | - |
| dc.subject | 化學反應預測 | zh_TW |
| dc.subject | 量子計算 | zh_TW |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | 分子設計 | zh_TW |
| dc.subject | 反應數據前處理 | zh_TW |
| dc.subject | Reaction data preprocessing | en |
| dc.subject | Quantum computing | en |
| dc.subject | Reaction condition prediction | en |
| dc.subject | Machine learning | en |
| dc.subject | Molecular design | en |
| dc.title | 電腦輔助分子設計與反應條件優化:基於人工智慧的化學合成策略 | zh_TW |
| dc.title | Computer-Aided Molecular Design and Reaction Condition Optimization: Artificial Intelligence-Based Chemical Synthesis Strategies | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Doctoral | - |
| dc.contributor.oralexamcommittee | 林祥泰;林立強;楊延齡;陳南佑;李泰岳 | zh_TW |
| dc.contributor.oralexamcommittee | Shiang-Tai Lin;Li-Chiang Lin;Yan-Ling Yang;Nan-You Chen;Tai-Yue Li | en |
| dc.subject.keyword | 機器學習,化學反應預測,分子設計,反應數據前處理,量子計算 | zh_TW |
| dc.subject.keyword | Machine learning, Reaction condition prediction, Molecular design, Reaction data preprocessing, Quantum computing | en |
| dc.relation.page | 231 | - |
| dc.identifier.doi | 10.6342/NTU202500739 | - |
| dc.rights.note | Authorization granted (open access worldwide) | - |
| dc.date.accepted | 2025-06-17 | - |
| dc.contributor.author-college | College of Engineering | - |
| dc.contributor.author-dept | Department of Chemical Engineering | - |
| dc.date.embargo-lift | 2025-07-03 | - |
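| Appears in Collections: | Department of Chemical Engineering | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 31.85 MB | Adobe PDF | View/Open |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.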
