Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85237

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 鄭卜壬(Pu-Jen Cheng) | |
| dc.contributor.author | Guan-Yu Lin | en |
| dc.contributor.author | 林冠宇 | zh_TW |
| dc.date.accessioned | 2023-03-19T22:52:08Z | - |
| dc.date.copyright | 2022-08-05 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-08-01 | |
| dc.identifier.citation | A. Aghajanyan, A. Shrivastava, A. Gupta, N. Goyal, L. Zettlemoyer, and S. Gupta. Better fine-tuning by reducing representational collapse. In International Conference on Learning Representations, 2021. S. Bengio, O. Vinyals, N. Jaitly, and N. Shazeer. Scheduled sampling for sequence prediction with recurrent neural networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS'15, pages 1171–1179, Cambridge, MA, USA, 2015. MIT Press. Z.-Y. Dou, P. Liu, H. Hayashi, Z. Jiang, and G. Neubig. GSum: A general framework for guided neural abstractive summarization. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4830–4842, Online, June 2021. Association for Computational Linguistics. A. Fan, D. Grangier, and M. Auli. Controllable abstractive summarization. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pages 45–54, Melbourne, Australia, July 2018. Association for Computational Linguistics. Y. Gal and Z. Ghahramani. A theoretically grounded application of dropout in recurrent neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, pages 1027–1035, Red Hook, NY, USA, 2016. Curran Associates Inc. I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org. I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015. K. M. Hermann, T. Kočiský, E. Grefenstette, L. Espeholt, W. Kay, M. Suleyman, and P. Blunsom. Teaching machines to read and comprehend. arXiv e-prints, page arXiv:1506.03340, June 2015. D. Iter, K. Guu, L. Lansing, and D. Jurafsky. Pretraining with contrastive sentence objectives improves discourse performance of language models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4859–4870, Online, July 2020. Association for Computational Linguistics. H. Jiang, P. He, W. Chen, X. Liu, J. Gao, and T. Zhao. SMART: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2177–2190, Online, July 2020. Association for Computational Linguistics. S. Kobayashi. Contextual augmentation: Data augmentation by words with paradigmatic relations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 452–457, New Orleans, Louisiana, June 2018. Association for Computational Linguistics. S. Lee, D. B. Lee, and S. J. Hwang. Contrastive learning with adversarial perturbations for conditional text generation. In International Conference on Learning Representations, 2021. M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871–7880, 2020. S. Li, D. Lei, P. Qin, and W. Y. Wang. Deep reinforcement learning with distributional semantic rewards for abstractive summarization. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 6038–6044, Hong Kong, China, Nov. 2019. Association for Computational Linguistics. X. Liang, L. Wu, J. Li, Y. Wang, Q. Meng, T. Qin, W. Chen, M. Zhang, and T.-Y. Liu. R-Drop: Regularized dropout for neural networks. In A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors, Advances in Neural Information Processing Systems, 2021. C.-Y. Lin. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain, July 2004. Association for Computational Linguistics. Y. Liu and P. Liu. SimCLS: A simple framework for contrastive learning of abstractive summarization. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 1065–1072, Online, Aug. 2021. Association for Computational Linguistics. Y. Liu, F. Meng, Y. Chen, J. Xu, and J. Zhou. Confidence-aware scheduled sampling for neural machine translation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2327–2337, Online, Aug. 2021. Association for Computational Linguistics. Y. Liu, F. Meng, Y. Chen, J. Xu, and J. Zhou. Scheduled sampling based on decoding steps for neural machine translation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3285–3296, Online and Punta Cana, Dominican Republic, Nov. 2021. Association for Computational Linguistics. T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space, 2013. S. Narayan, S. B. Cohen, and M. Lapata. Don't give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1797–1807, Brussels, Belgium, Oct.-Nov. 2018. Association for Computational Linguistics. X. Pan, M. Wang, L. Wu, and L. Li. Contrastive learning for many-to-many multilingual neural machine translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 244–258, Online, Aug. 2021. Association for Computational Linguistics. R. Y. Pang and H. He. Text generation by learning from demonstrations. In International Conference on Learning Representations, 2021. R. Pasunuru and M. Bansal. Multi-reward reinforced summarization with saliency and entailment. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 646–653, New Orleans, Louisiana, June 2018. Association for Computational Linguistics. R. Paulus, C. Xiong, and R. Socher. A deep reinforced model for abstractive summarization. In International Conference on Learning Representations, 2018. G. Pereyra, G. Tucker, J. Chorowski, L. Kaiser, and G. Hinton. Regularizing neural networks by penalizing confident output distributions. In International Conference on Learning Representations, 2017. W. Qi, Y. Yan, Y. Gong, D. Liu, N. Duan, J. Chen, R. Zhang, and M. Zhou. ProphetNet: Predicting future n-gram for sequence-to-sequence pre-training. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2401–2410, Online, Nov. 2020. Association for Computational Linguistics. M. Ranzato, S. Chopra, M. Auli, and W. Zaremba. Sequence level training with recurrent neural networks, 2015. M. Sato, J. Suzuki, and S. Kiyono. Effective adversarial regularization for neural machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 204–210, Florence, Italy, July 2019. Association for Computational Linguistics. A. See, P. J. Liu, and C. D. Manning. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1073–1083, Vancouver, Canada, July 2017. Association for Computational Linguistics. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(56):1929–1958, 2014. R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Proceedings of the 12th International Conference on Neural Information Processing Systems, NIPS'99, pages 1057–1063, Cambridge, MA, USA, 1999. MIT Press. S. Takase and S. Kiyono. Rethinking perturbations in encoder-decoders for fast training. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5767–5780, Online, June 2021. Association for Computational Linguistics. T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, and A. Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online, Oct. 2020. Association for Computational Linguistics. Z. Xie, S. I. Wang, J. Li, D. Lévy, A. Nie, D. Jurafsky, and A. Y. Ng. Data noising as smoothing in neural network language models, 2017. S. Xu, X. Zhang, Y. Wu, and F. Wei. Sequence level contrastive learning for text summarization, 2021. J. Zhang, Y. Zhao, M. Saleh, and P. J. Liu. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In Proceedings of the 37th International Conference on Machine Learning, ICML'20. JMLR.org, 2020. W. Zhang, Y. Feng, F. Meng, D. You, and Q. Liu. Bridging the gap between training and inference for neural machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4334–4343, Florence, Italy, July 2019. Association for Computational Linguistics. M. Zhong, P. Liu, Y. Chen, D. Wang, X. Qiu, and X. Huang. Extractive summarization as text matching. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6197–6208, Online, July 2020. Association for Computational Linguistics. C. Zhu, Y. Cheng, Z. Gan, S. Sun, T. Goldstein, and J. Liu. FreeLB: Enhanced adversarial training for natural language understanding. In International Conference on Learning Representations, 2020. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85237 | - |
| dc.description.abstract | 強制指導法是訓練自然語言生成模型中相當廣泛被使用的方法,該法既可以增加訓練效率,也可以使訓練過程更加穩定。然而,強制指導法會導致相當著名的暴露偏差問題,亦即訓練時模型使用的訓練資料分布與推理時使用的訓練資料分布並不相同。過去的研究通常使用修改訓練資料分布的方式,將訓練資料調整成較相似生成資料分布的形式。上述做法並未考慮原始資料與調整後資料之間的成對關係,因此我們提出正則化強制指導法,在訓練中運用上述的成對關係,提升模型訓練時的正則化效果。實驗數據顯示,正則化強制指導法可以在常見的文本摘要資料集中達到比先前作法更好的效果,且正則化強制指導法可以被應用至不同的預訓練模型中。 | zh_TW |
| dc.description.abstract | Teacher-forcing is widely used for training sequence generation models because it improves sampling efficiency and stabilizes training. However, it is known to suffer from the exposure bias problem: the data distribution the model conditions on during training differs from the distribution it encounters at inference. Previous work addresses exposure bias by modifying the training data to simulate the inference-time distribution, but it does not consider the pairwise relationship between the original training data and the modified data. We propose Regularized Teacher-Forcing (R-TeaFor), which exploits this relationship for better regularization. Empirically, R-TeaFor outperforms previous state-of-the-art summarization models, and the results generalize to different pre-trained models. (An illustrative sketch of the pairwise objective follows this metadata table.) | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T22:52:08Z (GMT). No. of bitstreams: 1 U0001-2007202210415900.pdf: 3669016 bytes, checksum: d29f61a3a781c5c0a4e10a32d732d107 (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Acknowledgements i 摘要 iv Abstract v Contents vi List of Figures viii List of Tables ix Chapter 1 Introduction 1 Chapter 2 Related Work 4 2.1 Regularization 4 2.2 Contrastive Learning 5 2.3 Reinforcement Learning 5 2.4 Sampling-Based Methods 6 Chapter 3 Methodology 7 3.1 Problem Formulation 7 3.2 Pairwise Relationship 9 Chapter 4 Experiments 11 4.1 Experimental Setup 11 4.1.1 Datasets and Metrics 11 4.1.2 Baselines 12 4.1.3 Implementation Details 12 4.2 Main Results 13 4.3 Effects of α and β 14 4.4 Two Augmentation Rates 15 4.5 Augmentation Strategies 16 4.6 Prediction Similarity 17 4.7 Decoding Accuracy 18 Chapter 5 Conclusion 20 References 21 Appendix A — Generation Examples 28 | |
| dc.language.iso | en | |
| dc.subject | 暴露偏差 | zh_TW |
| dc.subject | 正則化 | zh_TW |
| dc.subject | 強制指導法 | zh_TW |
| dc.subject | 文本摘要 | zh_TW |
| dc.subject | 自然語言生成 | zh_TW |
| dc.subject | Teacher-Forcing | en |
| dc.subject | Natural Language Generation | en |
| dc.subject | Text Summarization | en |
| dc.subject | Regularization | en |
| dc.subject | Exposure Bias | en |
| dc.title | 正則化強制指導法於文本摘要的應用 | zh_TW |
| dc.title | R-TeaFor: Regularized Teacher-Forcing for Abstractive Summarization | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.oralexamcommittee | 陳信希(HH Chen),陳柏琳(Berlin Chen),馬偉雲(Wei-Yun Ma),姜俊宇(Jyun-Yu Jiang) | |
| dc.subject.keyword | 強制指導法,暴露偏差,正則化,文本摘要,自然語言生成, | zh_TW |
| dc.subject.keyword | Teacher-Forcing,Exposure Bias,Regularization,Text Summarization,Natural Language Generation, | en |
| dc.relation.page | 28 | |
| dc.identifier.doi | 10.6342/NTU202201565 | |
| dc.rights.note | 同意授權(限校園內公開) | |
| dc.date.accepted | 2022-08-02 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊網路與多媒體研究所 | zh_TW |
| dc.date.embargo-lift | 2022-08-05 | - |
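The abstract above describes pairing each original training target with a modified (augmented) copy and using that pairwise relationship as a regularizer, with loss weights α and β studied in the thesis (Section 4.3 of the table of contents). The full text is not part of this record, so the snippet below is only a minimal PyTorch sketch of one plausible reading of the idea: teacher-forced cross-entropy on both members of a pair plus a symmetric KL consistency term between their output distributions. The function name, the exact loss form, the roles of `alpha`/`beta`, and the padding convention are illustrative assumptions, not the published R-TeaFor implementation.

```python
# Illustrative sketch only; NOT the thesis' exact objective.
# Assumes an encoder-decoder has already produced token logits for two
# teacher-forced passes: one on the original target prefix and one on a
# perturbed/augmented copy of it.
import torch
import torch.nn.functional as F


def pairwise_regularized_tf_loss(logits_orig, logits_pert, target,
                                 alpha=1.0, beta=1.0, pad_id=0):
    """Teacher-forcing cross-entropy on both passes plus a pairwise penalty.

    logits_orig / logits_pert: (batch, seq_len, vocab) decoder outputs.
    target: (batch, seq_len) gold token ids.
    alpha, beta: weights of the two terms (their exact roles in the thesis
    are an assumption here).
    """
    vocab = logits_orig.size(-1)
    mask = (target != pad_id).float()          # ignore padded positions

    def ce(logits):
        # Token-level cross-entropy, averaged over non-pad tokens.
        loss = F.cross_entropy(logits.reshape(-1, vocab),
                               target.reshape(-1), reduction="none")
        return (loss * mask.reshape(-1)).sum() / mask.sum()

    # Standard teacher-forcing losses for the two paired forward passes.
    ce_loss = 0.5 * (ce(logits_orig) + ce(logits_pert))

    # Symmetric KL between the two predictive distributions at each step,
    # used here as the "pairwise relationship" regularizer.
    p = F.log_softmax(logits_orig, dim=-1)
    q = F.log_softmax(logits_pert, dim=-1)
    kl = F.kl_div(p, q, log_target=True, reduction="none").sum(-1)
    kl = kl + F.kl_div(q, p, log_target=True, reduction="none").sum(-1)
    pair_loss = (0.5 * kl * mask).sum() / mask.sum()

    return alpha * ce_loss + beta * pair_loss


if __name__ == "__main__":
    # Toy tensors just to show the function runs end to end.
    B, T, V = 2, 5, 11
    target = torch.randint(1, V, (B, T))
    loss = pairwise_regularized_tf_loss(torch.randn(B, T, V),
                                        torch.randn(B, T, V), target)
    print(float(loss))
```

In this reading, the consistency term is what distinguishes the objective from running two independent teacher-forced passes: the original and augmented views of the same target are explicitly tied together, which is one common way to realize the "pairwise relationship" the abstract refers to.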
Appears in Collections: 資訊網路與多媒體研究所
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-2007202210415900.pdf (restricted to NTU campus IP addresses; off-campus users should connect through the NTU VPN service) | 3.58 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.