NTU Theses and Dissertations Repository › College of Electrical Engineering and Computer Science › Department of Computer Science and Information Engineering
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/2409

Full metadata record (DC field · value · language):
dc.contributor.advisor: 許永真 (Jane Yung-jen Hsu)
dc.contributor.author: Peng-Hsuan Li (en)
dc.contributor.author: 李朋軒 (zh_TW)
dc.date.accessioned: 2021-05-13T06:39:53Z
dc.date.available: 2018-01-04
dc.date.available: 2021-05-13T06:39:53Z
dc.date.copyright: 2018-01-04
dc.date.issued: 2017
dc.date.submitted: 2017-12-14
dc.identifier.citation:
[1] D. Andor, C. Alberti, D. Weiss, A. Severyn, A. Presta, K. Ganchev, S. Petrov, and M. Collins. Globally Normalized Transition-Based Neural Networks. arXiv preprint arXiv:1603.06042, 2016.
[2] N. Chinchor and P. Robinson. MUC-7 Named Entity Task Definition. In Proceedings of the 7th Conference on Message Understanding, volume 29, 1997.
[3] J. P. Chiu and E. Nichols. Named Entity Recognition with Bidirectional LSTMCNNs. Transactions of the Association for Computational Linguistics, 4:357–370, 2016.
[4] M. Collins. Head-Driven Statistical Models for Natural Language Parsing. PhD thesis, University of Pennsylvania, 1999.
[5] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 12(Aug):2493–2537, 2011.
[6] G. Cybenko. Approximation by Superpositions of a Sigmoidal Function. Mathematics of Control, Signals, and Systems (MCSS), 2(4):303–314, 1989.
[7] J. Daiber, M. Jakob, C. Hokamp, and P. N. Mendes. Improving Efficiency and Accuracy in Multilingual Entity Extraction. In Proceedings of the 9th International Conference on Semantic Systems (I-Semantics), 2013.
[8] C. dos Santos and V. Guimaraes. Boosting Named Entity Recognition with Neural Character Embeddings. In Proceedings of the Fifth Named Entity Workshop, joint with 53rd ACL and the 7th IJCNLP, pages 25–33, 2015.
[9] G. Durrett and D. Klein. A Joint Model for Entity Analysis: Coreference, Typing, and Linking. Transactions of the Association for Computational Linguistics, 2:477–490, 2014.
[10] J. R. Finkel and C. D. Manning. Joint Parsing and Named Entity Recognition. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 326–334. Association for Computational Linguistics, 2009.
[11] X. Glorot and Y. Bengio. Understanding the Difficulty of Training Deep Feedforward Neural Networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
[12] R. Grishman and B. Sundheim. Message Understanding Conference-6: A Brief History. In Coling, volume 96, pages 466–471, 1996.
[13] E. Hovy, M. Marcus, M. Palmer, L. Ramshaw, and R. Weischedel. OntoNotes: The 90% Solution. In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, pages 57–60. Association for Computational Linguistics, 2006.
[14] O. Irsoy and C. Cardie. Bidirectional Recursive Neural Networks for Token-Level Labeling with Structure. In NIPS Deep Learning Workshop, 2013.
[15] Y. Kim, Y. Jernite, D. Sontag, and A. M. Rush. Character-Aware Neural Language Models. In Thirtieth AAAI Conference on Artificial Intelligence, 2016.
[16] D. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980, 2014.
[17] M. Krallinger, F. Leitner, O. Rabal, M. Vazquez, J. Oyarzabal, and A. Valencia. Overview of the Chemical Compound and Drug Name Recognition (CHEMDNER) Task. In BioCreative challenge evaluation workshop, volume 2, page 2, 2013.
[18] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, P. N. Mendes, S. Hellmann, M. Morsey, P. Van Kleef, S. Auer, et al. DBpedia–A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web, 6(2):167–195, 2015.
[19] D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A New Benchmark Collection for Text Categorization Research. Journal of Machine Learning Research, 5(Apr):361–397, 2004.
[20] G. Luo, X. Huang, C.-Y. Lin, and Z. Nie. Joint Named Entity Recognition and Disambiguation. In Proc. EMNLP, pages 879–888, 2015.
[21] E. Marsh and D. Perzanowski. MUC-7 Evaluation of IE Technology: Overview of Results. In Proceedings of the seventh message understanding conference (MUC-7), volume 20, 1998.
[22] A. Passos, V. Kumar, and A. McCallum. Lexicon Infused Phrase Embeddings for Named Entity Resolution. In Proceedings of the Eighteenth Conference on Computational Language Learning, pages 78–86, 2014.
[23] J. Pennington, R. Socher, and C. D. Manning. GloVe: Global Vectors for Word Representation. In EMNLP, volume 14, pages 1532–1543, 2014.
[24] S. Pradhan, A. Moschitti, N. Xue, H. T. Ng, A. Björkelund, O. Uryupina, Y. Zhang, and Z. Zhong. Towards Robust Linguistic Analysis using OntoNotes. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 143–152, 2013.
[25] L. Ratinov and D. Roth. Design Challenges and Misconceptions in Named Entity Recognition. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning, CoNLL '09, pages 147–155, Stroudsburg, PA, USA, 2009. Association for Computational Linguistics.
[26] R. Socher. Recursive Deep Learning for Natural Language Processing and Computer Vision. PhD thesis, Department of Computer Science, Stanford University, 2014.
[27] R. Socher, J. Bauer, C. D. Manning, and A. Y. Ng. Parsing with Compositional Vector Grammars. In Proceedings of the ACL conference, 2013.
[28] R. Socher, C. D. Manning, and A. Y. Ng. Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks. In Proceedings of the NIPS-2010 Deep Learning and Unsupervised Feature Learning Workshop, 2010.
[29] R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, and C. Potts. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1631–1642, 2013.
[30] K. S. Tai, R. Socher, and C. D. Manning. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 1556–1566, 2015.
[31] E. F. Tjong Kim Sang and F. De Meulder. Introduction to the CoNLL-2003 Shared Task: Language-independent Named Entity Recognition. In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4, pages 142–147. Association for Computational Linguistics, 2003.
[32] R. Weischedel, M. Palmer, M. Marcus, E. Hovy, S. Pradhan, L. Ramshaw, N. Xue, A. Taylor, J. Kaufman, M. Franchini, et al. Ontonotes Release 5.0. Linguistic Data Consortium, Philadelphia, PA, 2013.
[33] S. Žitnik and M. Bajec. Token- and Constituent-based Linear-chain CRF with SVM for Named Entity Recognition. In BioCreative Challenge Evaluation Workshop, volume 2, page 144, 2013.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/2409
dc.description.abstract: Named Entity Recognition (NER) is an important task that locates named entities in text; its output serves downstream tasks such as natural language understanding. The problem is often recast from predicting the text spans of named entities to sequentially predicting whether each token belongs to a named entity. With models such as CRFs and RNNs, these recast approaches have achieved strong results. However, every named entity should be a syntactic constituent, and sequential token prediction ignores this information.
In this thesis, we propose a constituency-oriented approach that fully exploits the linguistic structures in text. To leverage hierarchical phrase structures, we first generate parse trees and transform them into constituency graphs that minimize the inconsistencies between the parses and the named entities. We then use Bidirectional Recursive Neural Networks (BRNN) to propagate the relevant structural information to every constituent: a bottom-up traversal collects local information, and a top-down traversal collects global information. Experiments show that this approach is comparable to sequential token labeling and achieves a significant improvement on the OntoNotes 5.0 NER corpus, with an F1 score above 87%. (zh_TW)
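The parse-tree transformation described in the abstract starts from binarizing the parse. As an illustration only (the thesis's actual procedure also minimizes inconsistencies with entity spans; the tuple encoding of nodes and the starred intermediate labels here are assumptions made for the sketch), a plain left-branching binarization of an n-ary constituent might look like:

```python
# A parse node is (label, [children]); a leaf is a plain token string.
# Illustrative left-branching binarization: every n-ary constituent is
# rewritten into nested binary nodes, so each internal node has at most
# two children. Starred labels mark the intermediate nodes we introduce.
def binarize(node):
    if isinstance(node, str):          # leaf token: nothing to do
        return node
    label, children = node
    children = [binarize(c) for c in children]
    if len(children) <= 2:             # already unary or binary
        return (label, children)
    left = children[0]
    for right in children[1:-1]:       # fold the middle children leftward
        left = (label + "*", [left, right])
    return (label, [left, children[-1]])

tree = ("NP", ["the", "New", "York", "Times"])
btree = binarize(tree)
print(btree)
# → ('NP', [('NP*', [('NP*', ['the', 'New']), 'York']), 'Times'])
```

Note that this particular fold groups "the New" into an intermediate node even though "New York Times" is the natural entity span; handling exactly such mismatches between parse constituents and entity spans is what the constituency-graph generation step addresses.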
dc.description.abstract: Named Entity Recognition (NER) is an important task which locates proper names in text for downstream tasks, e.g. to facilitate natural language understanding. The problem is often cast from structured prediction of text chunks to sequential labeling of tokens. Such sequential approaches have achieved high performance with models like conditional random fields and recurrent neural networks. However, named entities should be linguistic constituents, and sequential token labeling neglects this information.
In this thesis, we propose a constituency-oriented approach which fully utilizes the linguistic structures in text. First, to leverage the prior knowledge of hierarchical phrase structures, we generate parses and alter them into constituency graphs that minimize inconsistencies between parses and named entities. Then, we use Bidirectional Recursive Neural Networks (BRNN) to propagate relevant structure information to each constituent: a bottom-up pass captures the local information, and a top-down pass captures the global information. Experiments show that this approach is comparable to sequential token labeling, and significant improvements can be seen on OntoNotes 5.0 NER, with F1 scores over 87%. (en)
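The two passes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's BRNN-CNN: the hidden size, the single-tanh compositions, the random parameters, and the toy tree are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8            # hidden size (illustrative)
N_CLASSES = 3    # e.g. PER, ORG, non-entity (illustrative)

class Node:
    """A constituent in a binarized parse; leaves have no children."""
    def __init__(self, feat, left=None, right=None):
        self.feat = feat                  # input feature vector
        self.left, self.right = left, right
        self.up = None                    # bottom-up (local) state
        self.down = None                  # top-down (global) state

# Illustrative random parameters: one composition matrix per direction.
W_up = rng.normal(scale=0.1, size=(D, 3 * D))
W_down = rng.normal(scale=0.1, size=(D, 2 * D))
W_out = rng.normal(scale=0.1, size=(N_CLASSES, 2 * D))

def bottom_up(node):
    """Compose the children's states with the node's own features."""
    if node.left is None:                 # leaf: children states are zero
        kids = np.zeros(2 * D)
    else:
        kids = np.concatenate([bottom_up(node.left), bottom_up(node.right)])
    node.up = np.tanh(W_up @ np.concatenate([node.feat, kids]))
    return node.up

def top_down(node, parent_state):
    """Propagate the parent's state downward through the tree."""
    node.down = np.tanh(W_down @ np.concatenate([node.up, parent_state]))
    if node.left is not None:
        top_down(node.left, node.down)
        top_down(node.right, node.down)

def classify(node, out):
    """Score every constituent from its concatenated up/down states."""
    logits = W_out @ np.concatenate([node.up, node.down])
    out.append(int(np.argmax(logits)))
    if node.left is not None:
        classify(node.left, out)
        classify(node.right, out)
    return out

# Toy binarized parse with 3 leaves (5 constituents in total).
l0, l1, l2 = (Node(rng.normal(size=D)) for _ in range(3))
root = Node(rng.normal(size=D), Node(rng.normal(size=D), l0, l1), l2)
bottom_up(root)
top_down(root, np.zeros(D))
labels = classify(root, [])
print(len(labels))  # one prediction per constituent → 5
```

Because every node ends up with both an up state and a down state, each constituent is classified with access to both its subtree (local) and the rest of the sentence (global), which is the property the thesis's approach relies on.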
dc.description.provenance: Made available in DSpace on 2021-05-13T06:39:53Z (GMT). No. of bitstreams: 1. ntu-106-R04922001-1.pdf: 1600228 bytes, checksum: ed7282cda37a2b2454fecd110101a22c (MD5). Previous issue date: 2017. (en)
dc.description.tableofcontents:
Acknowledgments
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Objective
  1.4 Outline of the Thesis
Chapter 2 Related Work
  2.1 NER
  2.2 Related Neural Models
  2.3 Constituents and NER
  2.4 Recurrent NN and NER
Chapter 3 Constituency-Oriented NER
  3.1 Problem Statement
  3.2 Proposed Solution
Chapter 4 Constituency Graph Generation
  4.1 Constituency Graph
  4.2 Consistency
  4.3 Parse Tree Binarization
    4.3.1 Observations of Type-1
    4.3.2 Binarizing Parse Trees
    4.3.3 An Example
  4.4 Pyramid Construction
    4.4.1 Observations of Type-2
    4.4.2 Naïve Pyramids
    4.4.3 The Pyramid Addition Method
  4.5 Dependency Transformation
Chapter 5 Constituent Classification
  5.1 Feature Extraction
    5.1.1 Word-Level Features
    5.1.2 Character-Level Features
    5.1.3 Parse Tag Features
    5.1.4 Lexicon Hit Features
  5.2 BRNN-CNN
    5.2.1 Input Layer
    5.2.2 Hidden Layers
    5.2.3 Output Layer
  5.3 Prediction Collection
Chapter 6 Evaluation
  6.1 Experimental Setup
    6.1.1 Parameter Initialization
    6.1.2 Parameter Optimization
    6.1.3 CoNLL-2003 Dataset
    6.1.4 OntoNotes 5.0 Dataset
    6.1.5 Hyper-Parameters
  6.2 Major Results
  6.3 Analysis
    6.3.1 Constituency-Oriented Approach
    6.3.2 Constituency Graph Generation
    6.3.3 Constituent Classification
Chapter 7 Conclusion
  7.1 Summary of Contributions
  7.2 Future Work
Bibliography
dc.language.iso: en
dc.subject: 實體辨識 (entity recognition) (zh_TW)
dc.subject: 資訊抽取 (information extraction) (zh_TW)
dc.subject: 類神經網路 (neural network) (zh_TW)
dc.subject: Information Extraction (en)
dc.subject: Neural Network (en)
dc.subject: Named Entity Recognition (en)
dc.title: 利用語法結構之雙向遞迴類神經網路於命名實體辨識之研究 (zh_TW)
dc.title: Leveraging Linguistic Structures for Named Entity Recognition with Bidirectional Recursive Neural Networks (en)
dc.type: Thesis
dc.date.schoolyear: 106-1
dc.description.degree: Master (碩士)
dc.contributor.oralexamcommittee: 馬偉雲 (Wei-Yun Ma), 張嘉惠 (Chia-Hui Chang), 陳宜欣 (Yi-Shin Chen), 陳縕儂 (Yun-Nung Chen)
dc.subject.keyword: 實體辨識, 類神經網路, 資訊抽取 (zh_TW)
dc.subject.keyword: Named Entity Recognition, Neural Network, Information Extraction (en)
dc.relation.page: 74
dc.identifier.doi: 10.6342/NTU201704465
dc.rights.note: Open access authorized (worldwide)
dc.date.accepted: 2017-12-14
dc.contributor.author-college: College of Electrical Engineering and Computer Science (電機資訊學院) (zh_TW)
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) (zh_TW)
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
ntu-106-1.pdf (1.56 MB, Adobe PDF)

