Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58106

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 許永真(Yung-Jen Hsu) | |
| dc.contributor.author | Po-Chun Chen | en |
| dc.contributor.author | 陳柏均 | zh_TW |
| dc.date.accessioned | 2021-06-16T08:06:04Z | - |
| dc.date.available | 2020-10-01 | |
| dc.date.copyright | 2020-07-23 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-07-16 | |
| dc.identifier.citation | [1] I. Beltagy, K. Lo, and A. Cohan. Scibert: A pretrained language model for scientific text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3606–3611, 2019. [2] O. Bodenreider. The unified medical language system (umls): integrating biomedical terminology. Nucleic acids research, 32(suppl 1):D267–D270, 2004. [3] A. Bravo, J. Piñero, N. Queralt-Rosinach, M. Rautschka, and L. I. Furlong. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC bioinformatics, 16(1):55, 2015. [4] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018. [5] J. Feldman, J. Davison, and A. M. Rush. Commonsense knowledge mining from pretrained models. arXiv preprint arXiv:1909.00505, 2019. [6] E. Fredkin. Trie memory. Commun. ACM, 3(9):490–499, Sept. 1960. [7] A. L. Hayes, S. Smith, D. S. Dhami, and S. Natarajan. Predicting drug-drug interactions from text using nlp and privileged information. [8] S. R. Heller, A. McNaught, I. Pletnev, S. Stein, and D. Tchekhovskoi. Inchi, the iupac international chemical identifier. Journal of cheminformatics, 7(1):23, 2015. [9] D. Hendrycks and K. Gimpel. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016. [10] K. Huang, J. Altosaar, and R. Ranganath. Clinicalbert: Modeling clinical notes and predicting hospital readmission. arXiv preprint arXiv:1904.05342, 2019. [11] Z. Jiang, F. F. Xu, J. Araki, and G. Neubig. How can we know what language models know? arXiv preprint arXiv:1911.12543, 2019. [12] M. Krenn, F. Häse, A. Nigam, P. Friederich, and A. Aspuru-Guzik. Selfies: a robust representation of semantically constrained graphs with an example application in chemistry. arXiv preprint arXiv:1905.13741, 2019. [13] J. Lazarou, B. H. Pomeranz, and P. N. Corey. Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies. Jama, 279(15):1200–1205, 1998. [14] I. Lee, J. Keum, and H. Nam. Deepconv-dti: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS computational biology, 15(6):e1007129, 2019. [15] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang. Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, 2020. [16] J. Li, Y. Sun, R. J. Johnson, D. Sciaky, C.-H. Wei, R. Leaman, A. P. Davis, C. J. Mattingly, T. C. Wiegers, and Z. Lu. Biocreative v cdr task corpus: a resource for chemical disease relation extraction. Database, 2016, 2016. [17] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692, 2019. [18] I. Loshchilov and F. Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017. [19] L. Maziarka, T. Danel, S. Mucha, K. Rataj, J. Tabor, and S. Jastrzebski. Molecule attention transformer. arXiv preprint arXiv:2002.08264, 2020. [20] B. McCann, J. Bradbury, C. Xiong, and R. Socher. Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems, pages 6294–6305, 2017. [21] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pages 3111–3119, 2013. [22] M. Neumann, D. King, I. Beltagy, and W. Ammar. Scispacy: Fast and robust models for biomedical natural language processing. 2019. [23] H. Öztürk, A. Özgür, and E. Ozkirimli. Deepdta: deep drug–target binding affinity prediction. Bioinformatics, 34(17):i821–i829, 2018. [24] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018. [25] F. Petroni, T. Rocktäschel, P. Lewis, A. Bakhtin, Y. Wu, A. H. Miller, and S. Riedel. Language models as knowledge bases? arXiv preprint arXiv:1909.01066, 2019. [26] N. Poerner, U. Waltinger, and H. Schütze. Bert is not a knowledge base (yet): Factual knowledge vs. name-based reasoning in unsupervised qa. ArXiv, abs/1911.03681, 2019. [27] M. Sahlgren. The distributional hypothesis. Italian Journal of Linguistics, 20:33–53, 2008. [28] P. Schwaller, T. Laino, T. Gaudin, P. Bolgar, C. A. Hunter, C. Bekas, and A. A. Lee. Molecular transformer: A model for uncertainty-calibrated chemical reaction prediction. ACS central science, 5(9):1572–1583, 2019. [29] R. Sennrich, B. Haddow, and A. Birch. Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 2015. [30] B. Shin, S. Park, K. Kang, and J. C. Ho. Self-attention based molecule representation for predicting drug target interaction. arXiv preprint arXiv:1908.06760, 2019. [31] A. Sinha, Z. Shen, Y. Song, H. Ma, D. Eide, B.-J. Hsu, and K. Wang. An overview of microsoft academic service (mas) and applications. In Proceedings of the 24th international conference on world wide web, pages 243–246, 2015. [32] M. Sun, S. Zhao, C. Gilvary, O. Elemento, J. Zhou, and F. Wang. Graph convolutional networks for computational drug development and discovery. Briefings in bioinformatics, 21(3):919–935, 2020. [33] L. K. Tam, X. Wang, and D. Xu. Transformer query-target knowledge discovery (tend): Drug discovery from cord-19. 2020. [34] W. L. Taylor. “cloze procedure”: A new tool for measuring readability. Journalism quarterly, 30(4):415–433, 1953. [35] V. Tshitoyan, J. Dagdelen, L. Weston, A. Dunn, Z. Rong, O. Kononova, K. A. Persson, G. Ceder, and A. Jain. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature, 571(7763):95–98, 2019. [36] E. M. Van Mulligen, A. Fourrier-Reglat, D. Gurwitz, M. Molokhia, A. Nieto, G. Trifiro, J. A. Kors, and L. I. Furlong. The eu-adr corpus: annotated drugs, diseases, targets, and their relationships. Journal of biomedical informatics, 45(5):879–884, 2012. [37] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. In Advances in neural information processing systems, pages 5998–6008, 2017. [38] L. L. Wang, O. Tafjord, S. Jain, A. Cohan, S. Skjonsberg, C. Schoenick, N. Botner, and W. Ammar. Extracting evidence of supplement-drug interactions from literature. arXiv preprint arXiv:1909.08135, 2019. [39] D. Weininger. Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules. Journal of chemical information and computer sciences, 28(1):31–36, 1988. [40] W. Xiong, J. Du, W. Y. Wang, and V. Stoyanov. Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model, 2019. [41] M. Zitnik, M. Agrawal, and J. Leskovec. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics, 34(13):i457–i466, 2018. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58106 | - |
| dc.description.abstract | This thesis investigates whether a language model can capture latent knowledge of drugs and predict plausible interactions between them. Recent research has shown that the latent knowledge contained in word embeddings can be used to discover new relations. This thesis trains a language model on biomedical literature to predict plausible drug information and proposes an improved pre-training objective to better capture drug knowledge. Adverse drug reactions cost up to $136 billion per year, and their major cause is the unexpected drug-drug interactions that arise from polypharmacy. Because it is impractical to run experiments on every combination of marketed drugs, predicting which drug pairs are likely to interact, or screening for the most plausible combinations, is essential. Current machine-learning approaches fall into two main categories: those that predict from drug chemical structure and those that predict from electronic health record reports. We aim to predict drug-drug interactions correctly before patients suffer adverse reactions: by learning from medical literature published before the end of 2015, we predict drug interactions discovered between 2016 and 2020. We turn each sentence describing an interaction between drugs into a cloze test by masking a drug it mentions; over 14,184 test sentences and 3,607 candidate drugs, our top-10 predictions achieve a hit rate of 39.8%. Our results may help improve the ability of current learning objectives to gather information and extend the applications of pre-trained models. | zh_TW |
| dc.description.abstract | The main purpose of this thesis is to investigate whether we can use language models to capture latent knowledge of drugs and predict plausible drug-drug interactions (DDIs). Recent research suggests that word embeddings capture latent knowledge that can be used to induce undiscovered relations. In this thesis, we equip a language model with biomedical knowledge to predict plausible relations and propose a modified pre-training schema to better capture latent knowledge of drugs. Adverse Drug Reactions (ADRs) cost over $136 billion a year, and DDIs due to polypharmacy have been their major cause. Since it is infeasible to run experiments on all pairs of approved drugs, predicting the most plausible interactions and evaluating their probabilities play an important role in DDIs research. To validate the utility of the pre-trained language model, we deploy a cloze test on sentences describing relations discovered after the training data's publication cutoff: given a sentence describing a relation between two drugs, we mask out one drug at a time and have the model predict the other. The experiments show 39.8% hits at 10 on 14,184 testing sentences over 3,607 candidate drugs (a minimal code sketch of this cloze test follows the metadata table below). Our results may help improve how current learning objectives leverage entity information and broaden the applications of pre-trained language models. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-16T08:06:04Z (GMT). No. of bitstreams: 1 U0001-1507202012290700.pdf: 963886 bytes, checksum: 8651290c4addfc511fee6f4b2a7b19b5 (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | Chapter 1 Introduction: 1.1 Motivation; 1.2 Problem Description; 1.3 Main Contributions. Chapter 2 Background: 2.1 Transformer (2.1.1 Multi-Head Attention; 2.1.2 Positional Embedding); 2.2 BERT Architecture; 2.3 Trie; 2.4 PubMed; 2.5 Drug Discovery Timeline. Chapter 3 Related Work: 3.1 Chemical Formula Prediction; 3.2 Link Prediction Interaction Graph; 3.3 Semantic Word Embeddings. Chapter 4 Methodology: 4.1 Learn from Medical Literature (4.1.1 Vocabulary; 4.1.2 Corpus); 4.2 Predicting DDIs with BERT (4.2.1 Cloze Test; 4.2.2 Dataset); 4.3 Choice of Data. Chapter 5 Experiments: 5.1 Pretraining Phase; 5.2 Cloze Test (5.2.1 Fine-tune Setting; 5.2.2 Baseline Models; 5.2.3 Evaluation Metrics); 5.3 Entity Recommendation; 5.4 Ablation Study (5.4.1 Customized Entity Embedding; 5.4.2 Enhance Entity Masking Probability). Chapter 6 Discussion: 6.1 Noisy Dataset; 6.2 How to Induce Knowledge; 6.3 Removal of Random Token Replacement. Chapter 7 Conclusion and Future Work. Bibliography. Appendix A: Result on COVID-19 Drug Repurposing. | |
| dc.language.iso | en | |
| dc.subject | Language Model | zh_TW |
| dc.subject | Knowledge Discovery | zh_TW |
| dc.subject | Neural Networks | zh_TW |
| dc.subject | Pre-training | zh_TW |
| dc.subject | Drug-Drug Interactions | zh_TW |
| dc.subject | Pre-training | en |
| dc.subject | knowledge discovery | en |
| dc.subject | Neural Networks | en |
| dc.subject | Language Model | en |
| dc.subject | Drug-Drug Interactions | en |
| dc.title | Learning with Language Models to Predict Plausible Drug Relations | zh_TW |
| dc.title | Predicting Plausible Relations of Drugs Using Machine Learning with Language Models | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | Master | |
| dc.contributor.oralexamcommittee | 陳蘊儂(Yun-Nung Chen),蔡宗翰(Tzong-Han Tsai),古倫維(Lun-Wei Ku),劉昭麟(Chao-Lin Liu) | |
| dc.subject.keyword | Language Model, Drug-Drug Interactions, Pre-training, Neural Networks, Knowledge Discovery | zh_TW |
| dc.subject.keyword | Language Model, Drug-Drug Interactions, Pre-training, Neural Networks, Knowledge Discovery | en |
| dc.relation.page | 36 | |
| dc.identifier.doi | 10.6342/NTU202001534 | |
| dc.rights.note | Paid authorization | |
| dc.date.accepted | 2020-07-16 | |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
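The cloze test described in the abstracts above is straightforward to illustrate with a masked language model. Below is a minimal sketch, not the thesis's code: the `bert-base-uncased` checkpoint, the three-drug candidate list, and the example sentence are stand-in assumptions, whereas the thesis pre-trains its own BERT-style model on pre-2016 medical literature with 3,607 drug names in the vocabulary and ranks all of them at the masked position.

```python
# Minimal sketch of the cloze-test evaluation (hits@k over a drug candidate
# list). Assumptions, not from the thesis: "bert-base-uncased" stands in for
# the thesis's biomedically pre-trained model, and a toy three-drug list
# stands in for the full 3,607-drug candidate vocabulary.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# The thesis adds drug names to the vocabulary as single tokens; we mimic that
# here. The added embeddings are randomly initialized, so without the thesis's
# biomedical pre-training the ranking below is illustrative only.
candidates = ["aspirin", "warfarin", "ibuprofen"]
tokenizer.add_tokens(candidates)
model.resize_token_embeddings(len(tokenizer))
candidate_ids = tokenizer.convert_tokens_to_ids(candidates)

def hits_at_k(sentence: str, answer: str, k: int = 10) -> bool:
    """Mask one drug mention, score only the candidate drugs at the masked
    position, and report whether the true drug ranks in the top k."""
    masked = sentence.replace(answer, tokenizer.mask_token, 1)
    inputs = tokenizer(masked, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    scores = logits[candidate_ids]  # restrict prediction to drug names
    top = torch.topk(scores, k=min(k, len(candidates))).indices.tolist()
    return answer in [candidates[i] for i in top]

# One drug in a DDI sentence is masked; the model must recover it.
print(hits_at_k("warfarin increases bleeding risk when combined with aspirin",
                answer="aspirin", k=2))
```

Restricting the logits to the candidate list is what turns open-vocabulary masked-token prediction into a drug-ranking task; averaging this boolean outcome over all 14,184 test sentences yields the 39.8% hits-at-10 figure reported in the abstract.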
| Appears in Collections: | Department of Computer Science and Information Engineering | |

Files in This Item:

| File | Size | Format | |
|---|---|---|---|
| U0001-1507202012290700.pdf (Restricted Access) | 941.29 kB | Adobe PDF | |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
