Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/2409
Title: | Leveraging Linguistic Structures for Named Entity Recognition with Bidirectional Recursive Neural Networks (利用語法結構之雙向遞迴類神經網路於命名實體辨識之研究) |
Author: | Peng-Hsuan Li (李朋軒) |
Advisor: | Jane Yung-jen Hsu (許永真) |
Keywords: | Named Entity Recognition, Neural Network, Information Extraction |
Publication year: | 2017 |
Degree: | Master's |
Abstract: | Named Entity Recognition (NER) is an important task that locates proper names in text for downstream tasks such as natural language understanding. The problem is often cast from structured prediction of text chunks to sequential labeling of tokens. Such sequential approaches have achieved high performance with models like conditional random fields (CRF) and recurrent neural networks (RNN). However, named entities should be linguistic constituents, and sequential token labeling neglects this information.
In this thesis, we propose a constituency-oriented approach that fully utilizes linguistic structures in text. First, to leverage the prior knowledge of hierarchical phrase structures, we generate parses and alter them into constituency graphs that minimize inconsistencies between parses and named entities. Then, we use Bidirectional Recursive Neural Networks (BRNN) to propagate relevant structure information to each constituent: a bottom-up pass captures local information, and a top-down pass captures global information. Experiments show that this approach is comparable to sequential token labeling, and significant improvements can be seen on OntoNotes 5.0 NER, with F1 scores over 87%. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/2409 |
DOI: | 10.6342/NTU201704465 |
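Full-text authorization: | Authorized (open access worldwide) |
Appears in collections: | Department of Computer Science and Information Engineering (資訊工程學系) |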
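
The two-pass propagation described in the abstract can be illustrated with a toy sketch. This is not the thesis's model: the `Node` class, the scalar `feature` values, the unweighted `tanh` updates, and the example tree in `build_example` are all assumptions made for illustration, standing in for learned embeddings and weight matrices. It only shows how a bottom-up pass gives each constituent a summary of its own subtree, and how a top-down pass then passes context from the rest of the tree back down.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A constituent in a hypothetical parse tree (illustrative only)."""
    label: str
    feature: float                      # toy scalar stand-in for an embedding
    children: List["Node"] = field(default_factory=list)
    up: float = 0.0                     # bottom-up state: local (subtree) information
    down: float = 0.0                   # top-down state: global (context) information


def bottom_up(node: Node) -> float:
    """Post-order pass: combine a node's feature with its children's states."""
    child_sum = sum(bottom_up(child) for child in node.children)
    node.up = math.tanh(node.feature + child_sum)
    return node.up


def top_down(node: Node, parent_down: float = 0.0) -> None:
    """Pre-order pass: combine a node's bottom-up state with its parent's context."""
    node.down = math.tanh(node.up + parent_down)
    for child in node.children:
        top_down(child, node.down)


def encode(root: Node) -> Node:
    """Run both passes so every constituent carries local and global states."""
    bottom_up(root)
    top_down(root)
    return root


def build_example(object_feature: float = 0.5) -> Node:
    """Build a made-up tree: S -> NP (V NP)."""
    subject = Node("NP", 1.0)
    verb = Node("V", 0.2)
    obj = Node("NP", object_feature)
    verb_phrase = Node("VP", 0.0, [verb, obj])
    return Node("S", 0.0, [subject, verb_phrase])


if __name__ == "__main__":
    tree = encode(build_example())
    subject = tree.children[0]
    print(f"{subject.label}: up={subject.up:.3f}, down={subject.down:.3f}")
```

In this sketch, changing the object noun phrase leaves the subject's `up` state untouched but changes its `down` state, which is the sense in which the top-down pass carries information from outside a constituent's own subtree.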
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-106-1.pdf | 1.56 MB | Adobe PDF | View/Open |
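Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.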