Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85289

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 馬偉雲(Wei-Yun Ma) | |
| dc.contributor.author | Chin-Yin Ooi | en |
| dc.contributor.author | 黃瀞瑩 | zh_TW |
| dc.date.accessioned | 2023-03-19T22:55:26Z | - |
| dc.date.copyright | 2022-08-03 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-07-28 | |
| dc.identifier.citation | [1] D. Chen and C. Manning. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 740–750, Doha, Qatar, Oct. 2014. Association for Computational Linguistics. [2] K. Clark, M.-T. Luong, C. D. Manning, and Q. V. Le. Semi-supervised sequence modeling with cross-view training. In EMNLP, 2018. [3] Y. Cui, W. Che, T. Liu, B. Qin, S. Wang, and G. Hu. Revisiting pre-trained models for Chinese natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, pages 657–668, Online, Nov. 2020. Association for Computational Linguistics. [4] Y. Cui, W. Che, T. Liu, B. Qin, and Z. Yang. Pre-training with whole word masking for Chinese BERT, 2021. [5] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. [6] T. Dozat and C. D. Manning. Deep biaffine attention for neural dependency parsing. In ICLR, 2017. [7] T. Dozat, P. Qi, and C. D. Manning. Stanford's graph-based neural dependency parser at the CoNLL 2017 shared task. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 20–30, Vancouver, Canada, Aug. 2017. Association for Computational Linguistics. [8] L. Kong, N. Schneider, S. Swayamdipta, A. Bhatia, C. Dyer, and N. A. Smith. A dependency parser for tweets. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1001–1012, Doha, Qatar, Oct. 2014. Association for Computational Linguistics. [9] Y. Liu, Y. Zhu, W. Che, B. Qin, N. Schneider, and N. A. Smith. Parsing tweets into Universal Dependencies. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 965–975, New Orleans, Louisiana, June 2018. Association for Computational Linguistics. [10] I. Loshchilov and F. Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2019. [11] M. Marcus, M. Marcinkiewicz, and B. Santorini. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19:313–330, 1993. [12] A. Mohammadshahi and J. Henderson. Recursive non-autoregressive graph-to-graph transformer for dependency parsing with iterative refinement, 2020. [13] K. Mrini, F. Dernoncourt, T. Bui, W. Chang, and N. Nakashole. Rethinking self-attention: An interpretable self-attentive encoder-decoder parser. arXiv preprint arXiv:1911.03875, 2019. [14] J. Nivre and R. McDonald. Integrating graph-based and transition-based dependency parsers. In ACL, 2008. [15] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67, 2020. [16] A. M. Rush and S. Petrov. Vine pruning for efficient multi-pass dependency parsing. In EMNLP, 2012. [17] X. Wang, Y. Jiang, N. Bach, T. Wang, Z. Huang, F. Huang, and K. Tu. Automated concatenation of embeddings for structured prediction. In the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021). Association for Computational Linguistics, Aug. 2021. [18] X. Wang and K. Tu. Second-order neural dependency parsing with message passing and end-to-end training. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, pages 93–99, Suzhou, China, Dec. 2020. Association for Computational Linguistics. [19] D. Weiss, C. Alberti, M. Collins, and S. Petrov. Structured training for neural network transition-based parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 323–333, Beijing, China, July 2015. Association for Computational Linguistics. [20] N. Xue, F. Xia, F.-D. Chiou, and M. Palmer. The Penn Chinese Treebank: Phrase structure annotation of a large corpus. Natural Language Engineering, 11(2):207–238, 2005. [21] Z. Yang, Z. Dai, Y. Yang, J. G. Carbonell, R. Salakhutdinov, and Q. V. Le. XLNet: Generalized autoregressive pretraining for language understanding. In NeurIPS, 2019. [22] Y. Zhang, T. Lei, R. Barzilay, and T. Jaakkola. Greed is good if randomized: New inference for dependency parsing. In EMNLP, 2014. [23] X. Zheng. Incremental graph-based neural dependency parsing. In ACL, 2017. [24] J. Zhou and H. Zhao. Head-driven phrase structure grammar parsing on Penn Treebank. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, July 2019. Association for Computational Linguistics. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85289 | - |
| dc.description.abstract | 近年來,Graph-based 模型已在相依剖析任務上取得了非常優異的成績。它們通過利用機器學習方法計算邊權重,再運用 Maximum Spanning Tree (MST) 演算法建構出最佳的剖析樹。但是,在以上設定之下,目標預測標籤之間的相依關係沒有被考慮以及利用到。為了可以更充分地運用預測標籤,我們提出了一個非常簡單且有效的 Graph-based 優化模型,稱之為 Head Index Refinement Model (HIRM)。它主要以預訓練 Transformer-based 模型架構為基礎搭建起來,我們在不修改原模型架構的情況下,做到了對誤判的標籤進行修訂。我們的模型由兩個階段組成:首先,我們設計了矩陣分類器,主要為判斷句子裡頭兩兩單字之間是否存在相依關係,之後再將該輸出矩陣轉換成絕對位置序列格式。接著,第二階段會對第一階段的輸出進行修訂,我們希望第一階段被誤判的標籤可以在第二階段被修正。最後,我們的模型在 Tweebank v2 資料集上取得了 SOTA 的成績,並且在英文 Penn Treebank (PTB) 和中文 Chinese Treebank (CTB) 資料集也取得了非常不錯的表現。 | zh_TW |
| dc.description.abstract | Recently, graph-based models have achieved remarkable progress in dependency parsing. They use machine learning to assign a weight or probability to each possible edge and then construct a Maximum Spanning Tree (MST) from these weighted edges. Under this design, however, the dependencies among the predicted target labels are not exploited when the probability of each possible edge is estimated. To take full advantage of the predicted head index labels, we propose a simple but very effective graph-based refinement framework for dependency parsing, named the Head Index Refinement Model (HIRM). It is built on a Transformer-based pre-trained language model with a fine-tuning architecture, so it revises incorrect head indices without modifying the pre-trained model's architecture. The model consists of two stages: first, a matrix classifier independently judges whether each word pair in the sentence has a dependency relationship, and the resulting matrix is converted into a sequence of absolute head positions. In the second stage, the model revises the head index labels that the first stage predicted incorrectly. Our model achieves new state-of-the-art results on Tweebank v2 and comparable results on the English Penn Treebank (PTB) and Chinese Treebank (CTB) with parallel computation. (An illustrative sketch of the matrix-to-head-index conversion described here appears after this metadata table.) | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T22:55:26Z (GMT). No. of bitstreams: 1 U0001-2807202210371800.pdf: 863278 bytes, checksum: 8959dabcf3439d7d54d2a450b6d5e429 (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Acknowledgements I 摘要 II Abstract III Contents V List of Figures VIII List of Tables IX Chapter 1 Introduction 1 Chapter 2 Related Work 4 Chapter 3 Head Index Preprocessing 6 Chapter 4 Multi-task Learning 8 Chapter 5 Model Architecture 10 5.1 Head Index Refinement Model 11 5.1.1 Encoder 12 5.1.2 Fusion Layer 13 5.1.3 Matrix Classifier 13 5.1.4 Refinement Model Pre-training 14 5.2 Relation Type Model 15 Chapter 6 Model Training 17 Chapter 7 Experiments 18 7.1 Datasets 18 7.2 Setup 19 7.3 Results 19 7.3.1 English and Chinese Treebank 20 7.3.2 Tweebank v2 21 Chapter 8 Analysis 23 8.1 Refinement Model 23 8.2 Dependency Score 25 8.3 Limited Dataset 25 8.4 Case Study 27 8.5 Iterative Refinement 28 8.6 Sentence Length 29 Chapter 9 Investigation 31 9.1 Change and Freeze Initial Parser 31 9.1.1 Token Classification 31 9.1.2 Seq2Seq 32 9.2 Confidence Score 34 9.2.1 Training Stage 35 9.2.2 Inference Stage 35 9.3 Model Parameters 36 Chapter 10 Synthetic Data Experiments 37 10.1 Refinement Investigation 37 Chapter 11 Conclusions 40 References 41 | |
| dc.language.iso | en | |
| dc.subject | 中文Chinese Treebank (CTB)資料集 | zh_TW |
| dc.subject | 相依剖析 | zh_TW |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 優化策略 | zh_TW |
| dc.subject | 預訓練Transformer-based模型 | zh_TW |
| dc.subject | Tweebank v2資料集 | zh_TW |
| dc.subject | 英文Penn Treebank (PTB)資料集 | zh_TW |
| dc.subject | Machine Learning | en |
| dc.subject | Penn Treebank (PTB) | en |
| dc.subject | Tweebank v2 | en |
| dc.subject | Transformer-based pre-trained model | en |
| dc.subject | Refinement Strategy | en |
| dc.subject | Deep Learning | en |
| dc.subject | Dependency Parsing | en |
| dc.subject | Chinese Treebank (CTB) | en |
| dc.title | 基於相依剖析任務提出的優化模型 | zh_TW |
| dc.title | Head Index Refinement Model in Dependency Parsing | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.coadvisor | 鄭卜壬(Pu-Jen Cheng) | |
| dc.contributor.oralexamcommittee | 陳信希(Hsin-Hsi Chen),陳柏琳(Berlin Chen) | |
| dc.subject.keyword | 相依剖析,機器學習,深度學習,優化策略,預訓練Transformer-based模型,Tweebank v2資料集,英文Penn Treebank (PTB)資料集,中文Chinese Treebank (CTB)資料集 | zh_TW |
| dc.subject.keyword | Dependency Parsing,Machine Learning,Deep Learning,Refinement Strategy,Transformer-based pre-trained model,Tweebank v2,Penn Treebank (PTB),Chinese Treebank (CTB) | en |
| dc.relation.page | 44 | |
| dc.identifier.doi | 10.6342/NTU202201815 | |
| dc.rights.note | 同意授權(限校園內公開) | |
| dc.date.accepted | 2022-07-29 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資料科學學位學程 | zh_TW |
| dc.date.embargo-lift | 2022-08-03 | - |
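
The abstract above describes a first stage that scores every word pair in a sentence and then converts the resulting matrix into a sequence of absolute head positions. The sketch below illustrates only that conversion step; it is a minimal, hypothetical example (the function name, the NumPy-based implementation, and the toy scores are assumptions for illustration, not the thesis's actual code).

```python
import numpy as np

def matrix_to_head_indices(score_matrix: np.ndarray) -> list[int]:
    """Convert a word-pair score matrix into absolute head indices.

    score_matrix has shape (n, n + 1): row i holds the scores of the
    (i + 1)-th token choosing each candidate head, where column 0 is the
    artificial ROOT and column j (j >= 1) is the j-th token of the
    sentence. The argmax of each row is that token's absolute head position.
    """
    return score_matrix.argmax(axis=1).tolist()

if __name__ == "__main__":
    # Toy 3-token sentence: "She reads books" (scores are made up).
    scores = np.array([
        [0.1, 0.0, 0.8, 0.1],  # "She"   -> head is token 2 ("reads")
        [0.7, 0.1, 0.0, 0.2],  # "reads" -> head is ROOT (index 0)
        [0.1, 0.1, 0.7, 0.1],  # "books" -> head is token 2 ("reads")
    ])
    print(matrix_to_head_indices(scores))  # [2, 0, 2]
```

In the two-stage design described in the abstract, a head-index sequence of this form would then be passed to the second-stage refinement model, which revises the indices the first stage predicted incorrectly.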
| Appears in Collections: | 資料科學學位學程 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| U0001-2807202210371800.pdf (access restricted to NTU campus IP addresses; off-campus users should connect via the NTU VPN service) | 843.04 kB | Adobe PDF |
