Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78764
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 李妮鍾(Ni-Chung Lee) | |
dc.contributor.author | Chin-Shiang Ma | en |
dc.contributor.author | 馬欽祥 | zh_TW |
dc.date.accessioned | 2021-07-11T15:17:42Z | - |
dc.date.available | 2021-08-28 | |
dc.date.copyright | 2019-08-28 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-07-19 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78764 | - |
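dc.description.abstract | Background
Organic acidemia is a group of inborn errors of metabolism that cause acute onset of illness. Patients usually develop metabolic acidosis, hyperammonemia, hyperlactatemia, or hypoglycemia shortly after birth, and severe cases can be fatal. After diagnosis, patients must maintain lifelong dietary control (a low-protein diet) and take carnitine to excrete organic acids. However, interpreting organic acidemia tests for diagnosis is extremely tedious, and laboratory operators and clinicians often spend a great deal of time confirming compounds and interpreting results. Although automated compound annotation systems exist, interpretation still relies on human effort. With the recent development of artificial intelligence, many medical tests can now use machine learning to assist clinical testing. We therefore hope to use machine learning to build an AI-assisted system that improves the interpretation of organic acidemia, reduces the burden on medical personnel, enhances the quality of care, and promotes the application of artificial intelligence in medicine.
Aim
To explore the feasibility of using machine learning to interpret organic acidemia tests and to promote the application of artificial intelligence in medicine.
Materials and methods
We first collected gas chromatography-mass spectrometry (GC-MS) chromatograms and relative compound quantification data from patients diagnosed with organic acidemia in clinical reports, and then carried out machine learning in two stages: disease confirmation and disease classification. First, convolutional neural networks, support vector machines, the random forest algorithm, and deep neural networks were trained and tested to recognize the presence of disease, using organic acid datasets from the urine test results of anonymized normal clinical samples and anonymized confirmed patients. We then trained and tested disease classification on the organic acid dataset from the urine test results of anonymized, clinically confirmed patients, and calculated the interpretation accuracy to explore the feasibility of machine learning models for interpreting organic acidemia.
Results
In distinguishing whether disease is present, all four machine learning methods reached an F1 score of 0.8 or higher, the best being the deep neural network with F1=0.84 and AUC=0.85. In classifying all 18 classes of organic acidemia (including the normal control group), the convolutional neural network achieved F1=0.68 and AUC=0.93. In classifying 13 classes (including the normal control group), after removing the four organic acidemias whose marker compounds are not in the dataset and mitochondrial disease, the best performer was the deep neural network with F1=0.90 and AUC=0.98, followed by random forest (F1=0.88, AUC=0.97) and the support vector machine (F1=0.85, AUC=0.95).
Conclusion
The trained artificial intelligence models demonstrate that machine learning can be applied to interpreting gas chromatograms and data for organic acidemia. In the future, an automated organic acidemia interpretation system can be built to speed up interpretation and diagnosis, gain time for early treatment, and potentially be applied to other common diseases. | zh_TW |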
dc.description.abstract | Background
Organic acidemias (OAs) are a group of inborn errors of metabolism that can cause acute metabolic decompensation. Patients with OAs usually present shortly after birth with life-threatening metabolic acidosis, hyperammonemia, lactic acidemia, and hypoglycemia. Once the diagnosis is established, patients are advised to follow a low-protein diet and take carnitine supplementation to reduce the organic acids in the body. However, compound identification and interpretation are very tedious, and technicians and physicians need to spend a lot of time on them. Although automated compound identification software is available, interpretation still requires a physician. With the development of artificial intelligence, many laboratory testing tasks can be simplified by applying machine learning. Therefore, we plan to develop an artificial intelligence system to assist in interpreting urine organic acid analysis results.
Aim
To set up a module that assists the interpretation of OAs via machine learning, and to facilitate the application of artificial intelligence in health care.
Materials and methods
We collected gas chromatography-mass spectrometry (GC-MS) chromatograms and compound response ratio data from clinical reports with a diagnosis of OA. The chromatograms and response ratio data from anonymized urine samples were then used to train convolutional neural network (CNN), support vector machine (SVM), random forest (RF), and deep neural network (DNN) models, first to discriminate disease from normal cases and then to classify the disease. Accuracy and prediction rate were calculated to evaluate the feasibility of machine learning in the interpretation of OAs.
Results
In disease discrimination, all four machine learning models reached an F1 score of 0.8 or more; the best was the deep neural network, with F1=0.84 and AUC=0.85.
In disease classification across all 18 classes (including the normal control group), the convolutional neural network achieved F1=0.68 and AUC=0.93. Across 13 classes (including the normal control group, and excluding the four OAs whose target compounds do not exist in the dataset and mitochondrial dysfunction and stress), the deep neural network achieved the highest performance (F1=0.90, AUC=0.98), followed by random forest (F1=0.88, AUC=0.97) and the support vector machine (F1=0.85, AUC=0.95).
Conclusion
The models we trained demonstrated the feasibility of machine learning for assisting the interpretation of OAs. An automated system can be built to facilitate the diagnosis of OAs and improve the management of these patients, and it may be applied to related diseases in the future. | en |
dc.description.provenance | Made available in DSpace on 2021-07-11T15:17:42Z (GMT). No. of bitstreams: 1 ntu-108-P06448011-1.pdf: 1702235 bytes, checksum: bcac49d3ae2218b9c394c3de1bfc5241 (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | CONTENTS
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
1.1 Organic acidemia
1.2 Difficulties of the interpretation for organic acidemia
1.3 Machine learning tools
1.3.1 Support vector machines (SVMs)
1.3.2 Random forest (RF)
1.3.3 Deep neural networks (DNNs)
1.3.4 Convolution neural networks (CNNs)
1.4 Application of deep learning in metabolomics
1.5 Aim
Chapter 2 Materials and Methods
2.1 GC-MS data
2.1.1 Data format
2.1.2 Data processing
2.2 Data preparation for machine learning
2.2.1 Input format for CNNs
2.2.2 Input format and transformation for SVM, RF and DNNs
2.3 In-house organic acidemia GC-MS raw data
2.3.1 GC-MS raw files in NTUH database
2.3.2 Data augmentation
2.3.3 Training and testing sets
2.4 Methods
2.4.1 Implementation for convolutional neural networks
2.4.2 Implementation for the support vector machine
2.4.3 Implementation for the random forest
2.4.4 Implementation for the deep neural networks
2.5 Working flow for different machine learning methods
2.6 Performance evaluation
Chapter 3 Results
3.1 Evaluation of the models on non-targeted compounds
3.1.1 Disease discrimination by CNNs
3.1.2 Disease categorization by CNNs
3.2 Evaluation of the models on targeted compounds
3.2.1 Disease discrimination by SVM/RF/DNNs
3.2.2 Disease categorization by SVM/RF/DNNs
3.2.3 5-fold cross-validation result
Chapter 4 Discussion
4.1 Proposed approach
4.2 Potential other machine learning methods
4.3 Limitations
4.4 Future works
4.5 Conclusion
Chapter 5 References | |
dc.language.iso | en | |
dc.title | 有機酸血症的人工智慧判讀輔助系統 (Artificial intelligence-assisted interpretation system for organic acidemia) | zh_TW |
dc.title | The artificial intelligence system to assist interpretation of urine organic acid analysis | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | Master's | |
dc.contributor.coadvisor | 陳珮珊(Pai-Shan Chen),賴飛羆(Feipei Lai) | |
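dc.subject.keyword | organic acidemia, machine learning, convolutional neural network, support vector machine, random forest, deep neural network, gas chromatography-mass spectrometry, artificial intelligence | zh_TW |
dc.subject.keyword | organic acidemia, machine learning, convolutional neural network, support vector machine, random forest, deep neural network, gas chromatography-mass spectrometry, artificial intelligence | en |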
dc.relation.page | 39 | |
dc.identifier.doi | 10.6342/NTU201901581 | |
dc.rights.note | Paid license | |
dc.date.accepted | 2019-07-19 | |
dc.contributor.author-college | College of Medicine | zh_TW |
dc.contributor.author-dept | Graduate Institute of Molecular Medicine | zh_TW |
Appears in Collections: | Graduate Institute of Molecular Medicine |
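The abstract in this record reports F1 scores for each model but does not say how the per-class scores were averaged. As a minimal sketch, assuming macro averaging (an assumption, not something the record confirms), an F1 score for a multi-class classifier can be computed as follows. The function names and the labels are hypothetical and are not data from the thesis.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gained a false positive
            fn[truth] += 1          # true class gained a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never appears
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["normal", "normal", "disease", "disease", "disease"]
y_pred = ["normal", "disease", "disease", "disease", "normal"]
print(f1_per_class(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every class equally, which matters when some disease classes have very few samples; a sample-weighted average would give a different number for the same predictions.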
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-108-P06448011-1.pdf Not currently authorized for public access | 1.66 MB | Adobe PDF |
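All items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.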