Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58361
Title: | 結合訓練策略以增強轉移式完整中文語篇剖析器 Incorporating Training Strategies for Enhancing a Complete Transition-based Chinese Discourse Parser |
Authors: | Shyh-Shiun Hung 洪士勛 |
Advisor: | 陳信希(Hsin-Hsi Chen) |
Keyword: | Natural Language Processing, Chinese Discourse Parsing, Shift-reduce Parser, Dynamic Oracle, Imbalanced Data, Unsupervised Data Augmentation |
Publication Year: | 2020 |
Degree: | Master's |
Abstract: | This work presents a standalone, complete Chinese discourse parser intended for practical applications. We improve Chinese discourse parsing on several fronts: we integrate a pre-trained language model as the sentence encoder, so the parser does not depend on other external resources, and we adopt novel training strategies to improve the shift-reduce (transition-based) parser. Specifically, we revise the dynamic-oracle training procedure for the shift-reduce parser, and we apply unsupervised data augmentation to strengthen rhetorical relation recognition, addressing the class imbalance and the large proportion of implicit relations in Chinese data. On the Chinese Discourse Treebank (CDTB) dataset, our parser outperforms all previous models. Experimental results demonstrate the effectiveness of these training strategies, and we analyze the reasons behind the results of each experiment. We release the source code and pre-trained models of the parser for the research community. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58361 |
DOI: | 10.6342/NTU202001484 |
Fulltext Rights: | Paid license |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format |
---|---|---|
U0001-1307202019304400.pdf (Restricted Access) | 4.78 MB | Adobe PDF |