Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100987
Title: Few-Shot Prediction of Small Molecule Collision-Induced Dissociation Tandem Mass Spectra Using Hybrid Quantum Chemical Methods
Original title: 利用混合量子化學方法少樣本預測小分子碰撞誘導解離串聯質譜
Author: Wei-Kai Shao (邵偉凱)
Advisor: Cheng-Chih Hsu (徐丞志)
Keywords: Collision-induced dissociation (CID), Tandem mass spectrometry (MS/MS), Chemical structural identification (CSI), Quantum chemistry (QC), Few-shot prediction
Year of publication: 2025
Degree: Master's
Abstract: Collision-induced dissociation tandem mass spectrometry (CID-MS/MS) is an essential tool for the chemical structural identification (CSI) of unknown small molecules. However, up to 98% of spectra in untargeted studies remain unannotated. Machine learning (ML)-based CSI software is often limited by scarce training data and instrument-specific variability, which has led to a decade-long stagnation in Top-1 identification accuracy. One promising solution is to expand datasets by simulating spectra of chemically diverse and rare compounds.

Here, we present Hybrid QC, a physics-based hybrid quantum chemistry workflow for fully automated, end-to-end prediction of ESI-CID-MS/MS spectra. The protocol needs only a few-shot calibration to adapt to a specific instrument, so it can be applied flexibly across different experimental setups. Benchmarked on a curated test set of 40 molecules covering 95.4% of the chemical superclasses in the CASMI 2022 dataset, Hybrid QC achieved an average cosine similarity of 44.51% across three normalized collision energies, a 13.6% relative improvement over the state-of-the-art ML model CFM-ID 4.0. Hybrid QC performed particularly well on compounds with rigid frameworks, high heteroatom content, and strong intramolecular hydrogen bonding. It was also especially effective at high collision energies, where it benefits from accurately modeling complex fragmentation mechanisms.
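The benchmark above is reported as a cosine similarity between predicted and measured spectra. The record does not say how the thesis bins peaks or weights intensities, so the following is only a minimal sketch of one common form of the metric. The function names, the 0.01 m/z bin width, and the example peak lists are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (m/z, intensity) peaks that fall in the same m/z bin.

    bin_width is an assumed value, not one taken from the thesis.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine of the angle between two binned intensity vectors (0.0 if either is empty)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relative_improvement(new, baseline):
    """Relative (not absolute) gain of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Made-up peak lists, for illustration only.
predicted = [(91.05, 100.0), (119.05, 40.0)]
measured = [(91.05, 80.0), (65.04, 20.0)]
print(f"cosine similarity: {cosine_similarity(predicted, measured):.3f}")
```

A relative improvement compares the gain to the baseline score rather than subtracting percentage points. If the reported 13.6% follows the definition in `relative_improvement`, it describes the ratio between the two average similarities, not a 13.6-point difference.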

Integrating Hybrid QC into a SIRIUS-based re-ranking workflow showed that improved spectral prediction alone does not always raise Top-1 identification rates. Even so, Hybrid QC provides an interpretable foundation for effective candidate prioritization. More importantly, because it is physics-based, its spectral predictions are not constrained by the available experimental data, detection windows, or instrument variability. By overcoming the data-scarcity bottleneck faced by ML-based approaches, Hybrid QC offers a robust and scalable platform for expanding CID-MS/MS spectral databases and reliably identifying chemically diverse and rare compounds across complex omics research domains.
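The re-ranking observation can be illustrated with a toy example. This is not the thesis's workflow and it calls no SIRIUS API. The `Candidate` fields, the linear weighting, and every score below are hypothetical, chosen only to show how a candidate with the better predicted-spectrum match can still lose the Top-1 position.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                   # hypothetical candidate label
    base_score: float           # hypothetical upstream score, higher is better
    spectral_similarity: float  # predicted-vs-measured similarity in [0, 1]


def rerank(candidates, weight=0.5):
    """Sort candidates by a weighted mix of upstream score and spectral similarity.

    The linear mix and the default weight are assumptions for this sketch.
    """
    def combined(c):
        return (1.0 - weight) * c.base_score + weight * c.spectral_similarity

    return sorted(candidates, key=combined, reverse=True)


def top1_hit(ranked, true_name):
    """True when the first-ranked candidate is the correct structure."""
    return bool(ranked) and ranked[0].name == true_name


# Made-up scores: the true structure has the slightly better spectral match,
# but a structurally similar decoy has the higher upstream score.
candidates = [
    Candidate("true_structure", base_score=0.60, spectral_similarity=0.45),
    Candidate("similar_decoy", base_score=0.70, spectral_similarity=0.44),
]

print(top1_hit(rerank(candidates, weight=0.5), "true_structure"))
print(top1_hit(rerank(candidates, weight=1.0), "true_structure"))
```

In this toy case the small similarity margin only decides the ranking when the upstream score is ignored entirely (`weight=1.0`). When close decoys have nearly identical predicted spectra, a better spectral predictor has little room to change the top-ranked candidate.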
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100987
DOI: 10.6342/NTU202504503
Full-text authorization: Not authorized
Electronic full-text release date: N/A
Appears in collections: Department of Chemistry

Files in this item:
File: ntu-114-1.pdf (not authorized for public access)
Size: 7.09 MB
Format: Adobe PDF
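Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.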
