Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85289
Title: Head Index Refinement Model in Dependency Parsing
(Original Chinese title: 基於相依剖析任務提出的優化模型)
Authors: Chin-Yin Ooi
黃瀞瑩
Advisor: Wei-Yun Ma (馬偉雲)
Co-Advisor: Pu-Jen Cheng (鄭卜壬)
Keyword: Dependency Parsing, Machine Learning, Deep Learning, Refinement Strategy, Transformer-based pre-trained model, Tweebank v2, Penn Treebank (PTB), Chinese Treebank (CTB)
Publication Year: 2022
Degree: Master's
Abstract: Recently, graph-based models have achieved remarkable progress in dependency parsing. They use machine learning to assign a weight or probability to each possible edge and then construct a Maximum Spanning Tree (MST) from these weighted edges. With this design, however, the dependencies among the predicted target labels are not used when estimating the probability of each possible edge. To take full advantage of the predicted head index labels, we propose a simple but effective graph-based refinement framework for dependency parsing, named the Head Index Refinement Model (HIRM). It is built on a pre-trained Transformer-based language model with fine-tuning, and it revises incorrect head indices without modifying the pre-trained model's architecture. The model consists of two stages. In the first stage, a matrix classifier independently judges, for each pair of words in the sentence, whether a dependency relation exists between them; the output matrix is then converted into a sequence of absolute head positions. In the second stage, the model revises the first-stage output, with the aim of correcting head index labels that the first stage predicted incorrectly. Our model achieves new state-of-the-art results on Tweebank v2 and comparable results on the Penn Treebank (PTB) and the Chinese Treebank (CTB), with parallelized computation.
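As a rough illustration of the first stage described in the abstract, the sketch below converts a word-pair score matrix into a sequence of absolute head positions. It is not the thesis implementation: the function names `matrix_to_head_indices` and `is_tree`, the use of column 0 as an artificial ROOT, the per-row argmax, and the toy scores are all assumptions made for this example. The tree check shows why independently chosen heads may still need an MST decoder or a later revision step.

```python
from typing import List


def matrix_to_head_indices(scores: List[List[float]]) -> List[int]:
    """Pick one head per token from a word-pair score matrix.

    Assumption: scores[i][j] scores token i+1 having head j, where
    j == 0 is an artificial ROOT and j >= 1 is a 1-based token position.
    """
    heads = []
    for dep, row in enumerate(scores, start=1):
        # A token cannot be its own head, so skip column `dep`.
        candidates = [(s, j) for j, s in enumerate(row) if j != dep]
        _, best_head = max(candidates)
        heads.append(best_head)
    return heads


def is_tree(heads: List[int]) -> bool:
    """Return True if the heads give exactly one root and no cycles."""
    if heads.count(0) != 1:
        return False
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        # Follow head links upward until ROOT (0) or a repeated node.
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


# Toy 3-token sentence with made-up scores (columns: ROOT, tok1, tok2, tok3).
scores = [
    [0.10, 0.00, 0.80, 0.10],  # token 1
    [0.90, 0.05, 0.00, 0.05],  # token 2
    [0.10, 0.10, 0.80, 0.00],  # token 3
]
heads = matrix_to_head_indices(scores)
print(heads)           # [2, 0, 2]
print(is_tree(heads))  # True
```

Because each row is decided independently, nothing in this conversion prevents cycles or multiple roots; for example, `is_tree([2, 1, 0])` is False because tokens 1 and 2 point at each other.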
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85289
DOI: 10.6342/NTU202201815
Fulltext Rights: Authorization granted (access restricted to campus)
Embargo Lift Date: 2022-08-03
Appears in Collections: Data Science Degree Program (資料科學學位學程)

Files in This Item:
File: U0001-2807202210371800.pdf (access limited to NTU IP range)
Size: 843.04 kB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
