NTU Theses and Dissertations Repository, College of Management, Department of Information Management
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51709
Full metadata record
DC field: value (language)
dc.contributor.advisor: 魏志平 (Chih-Ping Wei)
dc.contributor.author: Chin-Hua Chang (en)
dc.contributor.author: 張晉華 (zh_TW)
dc.date.accessioned: 2021-06-15T13:45:42Z
dc.date.available: 2022-07-15
dc.date.copyright: 2020-08-21
dc.date.issued: 2020
dc.date.submitted: 2020-08-09
dc.identifier.citation:
Abacha, A. B. and Zweigenbaum, P. (2010). Automatic extraction of semantic relations between medical entities: Application to the treatment relation. In Semantic Mining in Biomedicine.
Abacha, A. B. and Zweigenbaum, P. (2011). Automatic extraction of semantic relations between medical entities: A rule based approach. Journal of Biomedical Semantics, 2(S5):S4.
Arnold, P. and Rahm, E. (2015). Semrep: A repository for semantic mapping. Datenbanksysteme für Business, Technologie und Web (BTW 2015).
Blaschke, C. and Valencia, A. (2001). Can bibliographic pointers for known biological data be found automatically? Protein interactions as a case study. Comparative and Functional Genomics, 2(4):196–206.
Blaschke, C. and Valencia, A. (2002). The frame-based module of the SUISEKI information extraction system. IEEE Intelligent Systems, 17(2):14–20.
Bodenreider, O. (2004). The unified medical language system (UMLS): Integrating biomedical terminology. Nucleic Acids Research, 32(suppl 1):D267–D270.
Bordes, A., Usunier, N., García-Durán, A., Weston, J., and Yakhnenko, O. (2013). Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems, pages 2787–2795.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734.
Craven, M. and Kumlien, J. (1999). Constructing biological knowledge bases by extracting information from text sources. In Proceedings of International Conference on Intelligent Systems for Molecular Biology (ISMB), pages 77–86.
Feng, X., Guo, J., Qin, B., Liu, T., and Liu, Y. (2017). Effective deep memory networks for distant supervised relation extraction. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pages 4002–4008.
Hendrickx, I., Kim, S. N., Kozareva, Z., Nakov, P., Ó Séaghdha, D., Padó, S., Pennacchiotti, M., Romano, L., and Szpakowicz, S. (2010). SemEval-2010 Task 8: Multi-way classification of semantic relations between pairs of nominals. In Proceedings of the 5th International Workshop on Semantic Evaluation, pages 33–38.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8):1735–1780.
Hoffmann, R., Zhang, C., Ling, X., Zettlemoyer, L., and Weld, D. S. (2011). Knowledge-based weak supervision for information extraction of overlapping relations. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, volume 1, pages 541–550.
Kilicoglu, H., Fiszman, M., Rodriguez, A., Shin, D., Ripple, A. M., and Rindflesch, T. C. (2008). Semantic MEDLINE: A web application for managing the results of PubMed searches. In Proceedings of the 3rd International Symposium on Semantic Mining in Biomedicine (SMBM), pages 69–76.
Kilicoglu, H., Shin, D., Fiszman, M., Rosemblat, G., and Rindflesch, T. C. (2012). SemMedDB: A PubMed-scale repository of biomedical semantic predications. Bioinformatics, 28(23):3158–3160.
Kingma, D. P. and Ba, J. L. (2015). Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR).
Kuo, Y.-T. (2019). Using piecewise convolutional neural networks for biomedical relation extraction. Unpublished Master Thesis, Department of Information Management, National Taiwan University, Taiwan, ROC.
Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C. H., and Kang, J. (2020). BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240.
Lin, T. Y., Goyal, P., Girshick, R., He, K., and Dollar, P. (2020). Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2):318–327.
Lin, Y., Liu, Z., Sun, M., Liu, Y., and Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph completion. In Twenty-ninth AAAI conference on artificial intelligence.
Lin, Y., Shen, S., Liu, Z., Luan, H., and Sun, M. (2016). Neural relation extraction with selective attention over instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pages 2124–2133.
Lindberg, D. A., Humphreys, B. L., and McCray, A. T. (1993). The unified medical language system. Methods of Information in Medicine, 32(4):281–291.
Liu, S., Chen, K., Chen, Q., and Tang, B. (2016a). Dependency-based convolutional neural network for drug-drug interaction extraction. In Proceedings of IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 1074–1080.
Liu, S., Tang, B., Chen, Q., and Wang, X. (2016b). Drug-drug interaction extraction via convolutional neural networks. Computational and Mathematical Methods in Medicine, 2016.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119.
Mintz, M., Bills, S., Snow, R., and Jurafsky, D. (2009). Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 1003–1011.
Quan, C., Hua, L., Sun, X., and Bai, W. (2016). Multichannel convolutional neural network for biological relation extraction. BioMed Research International, 2016.
Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving language understanding by generative pre-training.
Rehurek, R. and Sojka, P. (2010). Software framework for topic modelling with large corpora. Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45–50.
Riedel, S., Yao, L., and McCallum, A. (2010). Modeling relations and their mentions without labeled text. In Proceedings of Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 148–163.
Rindflesch, T. C. and Fiszman, M. (2003). The interaction of domain knowledge and linguistic structure in natural language processing: Interpreting hypernymic propositions in biomedical text. Journal of Biomedical Informatics, 36(6):462–477.
Shi, Y., Xiao, Y., and Niu, L. (2019). A brief survey of relation extraction based on distant supervision. In Proceedings of International Conference on Computational Science, pages 293–303.
Surdeanu, M., Tibshirani, J., Nallapati, R., and Manning, C. D. (2012). Multi-instance multi-label learning for relation extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 455–465.
Vashishth, S., Joshi, R., Prayaga, S. S., Bhattacharyya, C., and Talukdar, P. (2020). RESIDE: Improving distantly-supervised neural relation extraction using side information. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1257–1266.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems, pages 5999–6009.
Wang, Z., Zhang, J., Feng, J., and Chen, Z. (2014). Knowledge graph embedding by translating on hyperplanes. In Proceedings of the National Conference on Artificial Intelligence, volume 2, pages 1112–1119.
Zeiler, M. D. (2012). Adadelta: an adaptive learning rate method. arXiv preprint arXiv:1212.5701.
Zeng, D., Liu, K., Chen, Y., and Zhao, J. (2015). Distant supervision for relation extraction via Piecewise Convolutional Neural Networks. In Proceedings of 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1753–1762.
Zeng, D., Liu, K., Lai, S., Zhou, G., and Zhao, J. (2014). Relation classification via convolutional deep neural network. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), pages 2335–2344.
Zeng, X., Zeng, D., He, S., Liu, K., and Zhao, J. (2018). Extracting relational facts by an end-to-end neural model with copy mechanism. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, pages 506–510.
Zhao, Z., Yang, Z., Luo, L., Lin, H., and Wang, J. (2016). Drug drug interaction extraction from biomedical literature using syntax convolutional neural network. Bioinformatics, 32(22):3444–3453.
Zhou, P., Xu, J., Qi, Z., Bao, H., Chen, Z., and Xu, B. (2018). Distant supervision for relation extraction with hierarchical selective attention. Neural Networks, 108:240–247.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51709
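dc.description.abstract (translated from zh_TW): Relation extraction is a subtask of information extraction with many interesting downstream applications, such as knowledge base construction and new relation prediction. Many recent studies apply neural network techniques to relation extraction. For example, Zeng et al. (2015) proposed PCNN, a neural network architecture designed for relation extraction. Given a sentence in which two named entities are marked, PCNN outputs a probability distribution over all relation classes. Kuo (2019) extended PCNN into PCNN-GT, which adds the semantic types and semantic groups of the two given named entities as additional features and achieves better performance in the biomedical domain.
Although PCNN-GT reports good results in the biomedical domain, two aspects can still be improved. First, relation predicates and named entities should carry global knowledge that is independent of sentence-level semantics. This knowledge can be expressed as global features and considered together with the features originally extracted by PCNN-GT to support relation extraction. Second, samples in the "OTHERS" class do not share consistent semantics. Handling the "OTHERS" relation class better would help relation extraction tasks in more domains.
In this study, we focus on extracting global features by building a knowledge graph model, and we propose the KG-PCNN-GT model together with two structural modifications. Our experimental results show that global features improve relation extraction performance. We also observe that the two proposed modifications make the model predict relations more conservatively, improving precision and F1 score compared with KG-PCNN-GT.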
dc.description.abstract (en): Relation extraction is a subtask of information extraction that supports applications such as knowledge base construction and novel relation prediction. Many recent studies apply neural networks to relation extraction. For example, Zeng et al. (2015) proposed PCNN, a neural network architecture for relation extraction. Given a context and two entities, e1 and e2, it outputs a probability distribution over all possible relation classes. Kuo (2019) extended PCNN into PCNN-GT by adding the semantic types and semantic groups of the two given entities as additional input features, which yields better performance in the biomedical domain.
Although PCNN-GT performs well in the biomedical domain, it can still be improved in two areas. First, a set of relation predications from the focal domain contains global knowledge about entities and relations, and this knowledge can be used to extract global features that support relation extraction. These global features should be considered together with the features originally extracted by PCNN-GT. Second, the "OTHERS" class is not semantically coherent. Handling the OTHERS relation type better would make the proposed method applicable and beneficial to relation extraction in other domains.
In this research, we focus on extracting global features by constructing a knowledge graph embedding model, and we propose KG-PCNN-GT and two variants. Our evaluation results show that global features improve the effectiveness of relation extraction. We also observe that the two proposed variants lead the model to predict relations more conservatively, improving macro precision and macro F1 score over KG-PCNN-GT.
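As a rough illustration of the idea in the abstract, the sketch below shows piecewise max pooling in the spirit of PCNN, followed by a classifier that also sees a vector of global features. It is not taken from the thesis: the segment boundary convention, the use of plain concatenation to combine local and global features, the zero fill for empty segments, and every name and number (`piecewise_max_pool`, `classify`, the toy feature map, `global_features`, `weights`) are assumptions made for this example.

```python
import math


def piecewise_max_pool(feature_map, e1_pos, e2_pos):
    """Max-pool each filter over three segments split by the two entity positions.

    feature_map: list of per-token rows, each row holding one activation per filter.
    Assumed convention: segment 1 ends at e1, segment 2 ends at e2, segment 3 is the rest.
    """
    num_filters = len(feature_map[0])
    segments = [
        feature_map[: e1_pos + 1],
        feature_map[e1_pos + 1 : e2_pos + 1],
        feature_map[e2_pos + 1 :],
    ]
    pooled = []
    for f in range(num_filters):
        for seg in segments:
            # An empty segment contributes 0.0 (an assumption of this sketch).
            pooled.append(max((row[f] for row in seg), default=0.0))
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(local_features, global_features, weights, bias):
    """Concatenate local and global features, apply a linear layer, then softmax."""
    x = local_features + global_features  # concatenation is an assumed design choice
    scores = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return softmax(scores)


# Toy convolution output: 5 tokens, 2 filters; e1 at token 1, e2 at token 3.
feature_map = [
    [0.1, 0.5],
    [0.9, 0.2],
    [0.3, 0.7],
    [0.4, 0.1],
    [0.8, 0.6],
]
local = piecewise_max_pool(feature_map, e1_pos=1, e2_pos=3)

# Hypothetical global features, standing in for knowledge graph embedding output.
global_features = [0.2, -0.1]

# Hypothetical weights for 3 relation classes over 6 local + 2 global features.
weights = [[0.0] * 8, [0.1] * 8, [-0.1] * 8]
bias = [0.0, 0.0, 0.0]

probs = classify(local, global_features, weights, bias)
print(local)
print(probs)
```

Pooling per segment keeps one maximum per filter for the text before, between, and after the two entities, so the classifier retains coarse positional information that a single global max would discard.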
dc.description.provenance (en): Made available in DSpace on 2021-06-15T13:45:42Z (GMT). No. of bitstreams: 1
U0001-0908202014242600.pdf: 2157789 bytes, checksum: 795272c4d71c6ffb8e50f9d0f8838784 (MD5)
Previous issue date: 2020
dc.description.tableofcontents:
Acknowledgements
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.2 Research Motivation and Objective
Chapter 2 Literature Review
2.1 Biomedical Named Entities and Relations
2.2 Relation Extraction Methods
2.2.1 Feature-engineering-based methods
2.2.2 Deep-learning-based methods
2.3 Knowledge Graph Embedding
Chapter 3 Methodology
3.1 Knowledge Graph Embedding Model
3.1.1 Data Preprocessing
3.1.2 Model Construction
3.1.3 Training Strategy
3.2 Knowledge Embeddings Features
3.3 Model Architecture
3.4 Model Architecture Modification
3.4.1 KG(FT)-PCNN-GT
3.4.2 KG-PCNN-GT-2P
3.4.3 KG(FT)-PCNN-GT-2P
3.5 Training Strategy
3.6 Fine-tuning of Outcome Class Selection
Chapter 4 Empirical Evaluation
4.1 Data Collection
4.1.1 Dataset for Relation Extraction (RE Dataset)
4.1.2 Dataset for Knowledge Graph (KG Dataset)
4.2 Experimental Settings
4.3 Evaluation Procedure and Criteria
4.4 Evaluation Results
4.5 Additional Evaluation Experiments
4.5.1 Effects of Variants of Knowledge Graph (KG) Embedding Model
4.5.2 Effect of Knowledge Graph (KG) Fine-tuning Training Strategy of KG(FT)-PCNN-GT
Chapter 5 Conclusion
5.1 Contributions
5.2 Future Works
References
dc.language.iso: en
dc.title: 以知識圖譜模型協助生物醫學領域的關係萃取 (zh_TW)
dc.title: Biomedical Relation Extraction Supporting by Knowledge Graph Embedding Model (en)
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 吳家齊 (Chia-Chi Wu), 張詠淳 (Yung-Chun Chang)
dc.subject.keyword: 關係萃取, 生醫關係萃取, 關係分類, 深度學習, 分段卷積神經網絡, 知識圖譜 (zh_TW)
dc.subject.keyword: Relation extraction, Biomedical relation extraction, Relation classification, Deep learning, Piecewise convolutional neural network, Knowledge graph (en)
dc.relation.page: 55
dc.identifier.doi: 10.6342/NTU202002711
dc.rights.note: Paid authorization
dc.date.accepted: 2020-08-10
dc.contributor.author-college: College of Management
dc.contributor.author-dept: Graduate Institute of Information Management
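Appears in collections: Department of Information Management

Files in this item:
File: U0001-0908202014242600.pdf
Size: 2.11 MB
Format: Adobe PDF
Access: not currently authorized for public access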