Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97537
Title: Computer-Aided Molecular Design and Reaction Condition Optimization: Artificial Intelligence-Based Chemical Synthesis Strategies (電腦輔助分子設計與反應條件優化:基於人工智慧的化學合成策略)
Author: Lung-Yi Chen (陳隆奕)
Advisor: Yi-Pei Li (李奕霈)
Keywords: Machine learning, Reaction condition prediction, Molecular design, Reaction data preprocessing, Quantum computing
Publication year: 2025
Degree: Doctoral
Abstract:
With the rapid advancement of artificial intelligence technologies, the fields of chemical synthesis and molecular design are experiencing unprecedented transformation. This dissertation thoroughly investigates the integrated applications of artificial intelligence, machine learning, and quantum computing in computational chemistry, proposing systematic solutions to core challenges in key areas including drug discovery, materials science, catalyst design, and green chemistry.
As the cornerstone of the entire research framework, this study first developed the AutoTemplate systematic data cleaning suite, effectively addressing common issues in chemical reaction databases such as incorrect reaction mechanism annotations, missing reactants, and erroneous records. This suite not only rectifies data quality defects but also establishes a reliable data foundation for subsequent machine learning model development. Data retention rates across different reaction types range from 62.3% to 98.2%, ensuring the accuracy and reliability of downstream chemical application modeling.
Building upon this high-quality data foundation, this research further proposes an innovative two-stage deep learning framework for reaction condition design. The approach first generates potentially viable reagents and solvents with a multi-task neural network, then uses ranking optimization to order complete reaction condition combinations by expected yield. In tests on ten important reaction types, including Buchwald-Hartwig coupling and Suzuki reactions, the model achieved a Top-10 accuracy of 73%. Because it can rapidly propose and screen candidate condition combinations, it promises to reduce experimental trial-and-error costs in the pharmaceutical industry and in fine chemical production while accelerating laboratory reaction process development.
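The generate-then-rank idea and the Top-k metric described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the `Condition` encoding, the function names, and the toy yields are assumptions made for illustration only, and the exact matching rule behind the reported Top-10 accuracy may differ.

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical encoding: one condition is a set of reagent/solvent names.
Condition = FrozenSet[str]


def rank_conditions(
    candidates: Sequence[Condition],
    predicted_yield: Callable[[Condition], float],
) -> List[Condition]:
    """Stage 2 of the sketch: order candidates by predicted yield, best first."""
    return sorted(candidates, key=predicted_yield, reverse=True)


def top_k_accuracy(
    ranked_lists: Sequence[Sequence[Condition]],
    recorded: Sequence[Condition],
    k: int = 10,
) -> float:
    """Fraction of reactions whose recorded condition is among the top-k candidates."""
    if len(ranked_lists) != len(recorded):
        raise ValueError("one recorded condition is required per ranked list")
    if not ranked_lists:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, recorded) if truth in ranked[:k]
    )
    return hits / len(ranked_lists)


# Toy stand-in for stage 1 (candidate generation); the yields are made up.
candidates = [
    frozenset({"Pd(OAc)2", "toluene"}),
    frozenset({"Pd(PPh3)4", "dioxane"}),
    frozenset({"CuI", "DMF"}),
]
toy_yield = {candidates[0]: 0.4, candidates[1]: 0.9, candidates[2]: 0.1}

ranked = rank_conditions(candidates, toy_yield.__getitem__)
accuracy = top_k_accuracy([ranked, ranked], [candidates[0], candidates[2]], k=2)
```

In this toy case the first recorded condition falls inside the top two candidates and the second does not, so `accuracy` is 0.5.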
To support rapid screening requirements in molecular design, this study developed a graph neural network-based additive theory specifically for predicting molecular thermochemical properties, particularly enthalpies of formation. Through the proposed atomic fingerprint feature technique and a transfer learning strategy that combines CCSD(T)-F12a and NIST experimental data, the model achieved a mean absolute error of 2.04 kcal/mol in enthalpy of formation prediction, substantially surpassing the predictive accuracy of traditional group contribution methods. This approach provides an accurate property assessment tool for energy materials design, new drug candidate screening, and environmentally friendly solvent development, and it can complement expensive and time-consuming quantum mechanical calculations.
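As a rough illustration of the additive idea and of the error metric quoted above, the following Python sketch sums per-atom contributions into a molecular value and computes a mean absolute error. The contribution values are arbitrary toy numbers, not real thermochemistry, and in the dissertation the per-atom terms would come from a trained graph neural network rather than being supplied by hand.

```python
from typing import List, Sequence


def additive_prediction(atom_contributions: Sequence[float]) -> float:
    """Molecular property as the sum of per-atom contributions (additivity assumption)."""
    return sum(atom_contributions)


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average of |predicted - reference| over all molecules."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy per-atom contributions for two hypothetical molecules (arbitrary units).
toy_contributions: List[List[float]] = [[-10.0, -5.0, -2.5], [3.0, 1.5]]
toy_reference = [-18.0, 6.0]

predictions = [additive_prediction(c) for c in toy_contributions]
mae = mean_absolute_error(predictions, toy_reference)
```

Here the predictions are -17.5 and 4.5, so the absolute errors are 0.5 and 1.5 and `mae` is 1.0.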
In the generative modeling component of molecular design, this research proposes the Quantum Molecular Generator (QMG), utilizing dynamic quantum circuits and chemistry-inspired variational quantum circuit design to explore the chemical space of nine heavy atoms. This system generates chemically valid and diverse molecular structures using only 134 trainable parameters. The generative model evaluation metrics of Validity and Uniqueness both exceed 90%. This technology is particularly suitable for exploring chemical spaces difficult to access through traditional methods, potentially opening new possibilities for anticancer drug design, novel catalyst discovery, and carbon capture material development, demonstrating the potential of quantum computing in molecular design.
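The Validity and Uniqueness metrics mentioned above are commonly computed along the lines of the Python sketch below. The definitions used here (validity as valid samples over all samples, uniqueness as distinct valid samples over valid samples) are one common convention and are assumed rather than taken from the dissertation. The parenthesis-balance validator is a deliberately crude, hypothetical stand-in; a real pipeline would parse and canonicalize each structure with a cheminformatics toolkit.

```python
from typing import Callable, Sequence, Tuple


def validity_and_uniqueness(
    samples: Sequence[str],
    is_valid: Callable[[str], bool],
) -> Tuple[float, float]:
    """Return (validity, uniqueness) for a batch of generated molecule strings."""
    valid = [s for s in samples if is_valid(s)]
    validity = len(valid) / len(samples) if samples else 0.0
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


def toy_is_valid(smiles: str) -> bool:
    """Hypothetical placeholder check: non-empty with balanced parentheses only."""
    return bool(smiles) and smiles.count("(") == smiles.count(")")


generated = ["CCO", "CCO", "C1CC1", "C(", "CN"]
validity, uniqueness = validity_and_uniqueness(generated, toy_is_valid)
```

With these five toy strings, four pass the placeholder check and three of those are distinct, so `validity` is 0.8 and `uniqueness` is 0.75.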
Integrating all of the aforementioned technological achievements, this dissertation ultimately develops an uncertainty-aware molecular optimization strategy that combines graph neural network predictive capabilities, genetic algorithm search strategies, and generative model approaches, introducing the Probability of Improvement Optimization (PIO) method. In comprehensive testing across 19 molecular design tasks from the GuacaMol and Tartarus molecular design platforms, this integrated approach outperformed traditional optimization methods in 8 out of 10 single-objective optimizations and 5 out of 6 multi-objective optimizations, significantly enhancing the robustness of both single-objective and multi-objective molecular design. These results can be applied to various scenarios, including organic light-emitting materials, protein ligands, reaction substrate design, and drug design.
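For readers unfamiliar with probability-of-improvement scoring, the Python sketch below shows the textbook form under a Gaussian predictive distribution: the probability that a candidate's property exceeds a threshold, given a predicted mean and standard deviation. The function names, the independence assumption used to combine objectives, and the toy numbers are all assumptions for illustration; the dissertation's PIO method may define and combine these quantities differently.

```python
import math
from typing import Iterable, Tuple


def probability_of_improvement(mean: float, std: float, threshold: float) -> float:
    """P(value > threshold) for a Gaussian prediction (maximization convention)."""
    if std <= 0.0:
        # No predictive uncertainty: the outcome is deterministic.
        return 1.0 if mean > threshold else 0.0
    z = (mean - threshold) / std
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def joint_probability(objectives: Iterable[Tuple[float, float, float]]) -> float:
    """Product of per-objective probabilities, assuming independent objectives."""
    p = 1.0
    for mean, std, threshold in objectives:
        p *= probability_of_improvement(mean, std, threshold)
    return p


# Toy predictions given as (mean, std, threshold) triples; values are made up.
pi_at_threshold = probability_of_improvement(1.0, 2.0, 1.0)
pi_one_sigma_above = probability_of_improvement(2.0, 1.0, 1.0)
pi_two_objectives = joint_probability([(1.0, 2.0, 1.0), (1.0, 2.0, 1.0)])
```

A candidate whose predicted mean sits exactly at the threshold scores 0.5, and one that sits one standard deviation above it scores about 0.84. Multiplying per-objective probabilities is a simple way to fold several objectives into one score, but it ignores correlations between the predicted properties.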
Overall, this dissertation establishes a complete technological chain spanning data foundation, reaction condition prediction, molecular property assessment, quantum generative exploration, and uncertainty-aware optimization, forming a mutually supportive and progressively advancing research system. These research achievements demonstrate the application potential of artificial intelligence, machine learning, and quantum computing in chemical research, not only providing powerful tools for new drug development in the pharmaceutical industry, functional materials design in materials science, and sustainable process optimization in green chemistry, but also outlining a clear blueprint for the development of computational chemistry toward automation and intelligence. The integrated application of these technologies will continue to drive chemical science toward more efficient, accurate, and sustainable directions.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97537
DOI: 10.6342/NTU202500739
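Full-text authorization: Authorized (open access worldwide)
Electronic full-text release date: 2025-07-03
Appears in collections: Department of Chemical Engineering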

Files in this item:
File            Size      Format
ntu-113-2.pdf   31.85 MB  Adobe PDF
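Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.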
