Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83074
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 廖世偉 | zh_TW |
dc.contributor.advisor | Shie-Wei Liao | en |
dc.contributor.author | 林珏廷 | zh_TW |
dc.contributor.author | Jyue-Ting Lin | en |
dc.date.accessioned | 2023-01-06T17:03:01Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-01-06 | - |
dc.date.issued | 2022 | - |
dc.date.submitted | 2022-12-13 | - |
dc.identifier.citation | [1] W. Lei, X. Jin, M.-Y. Kan, Z. Ren, X. He, and D. Yin, "Sequicity: Simplifying task-oriented dialogue systems with single sequence-to-sequence architectures," in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 1437-1447.
[2] A. Vaswani et al., "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, 2017.
[3] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, "Language models are unsupervised multitask learners," OpenAI blog, vol. 1, no. 8, p. 9, 2019.
[4] B. Peng, C. Li, J. Li, S. Shayandeh, L. Liden, and J. Gao, "Soloist: Few-shot task-oriented dialog with a single pretrained auto-regressive model," arXiv preprint arXiv:2005.05298, 2020.
[5] B. Peng, C. Li, J. Li, S. Shayandeh, L. Liden, and J. Gao, "Soloist: Building task bots at scale with transfer learning and machine teaching," Transactions of the Association for Computational Linguistics, vol. 9, pp. 807-824, 2021.
[6] Y. Yang, Y. Li, and X. Quan, "UBAR: Towards fully end-to-end task-oriented dialog system with GPT-2," in Proceedings of the AAAI Conference on Artificial Intelligence, 2021, vol. 35, no. 16, pp. 14230-14238.
[7] P. Budzianowski et al., "MultiWOZ - A large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling," in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, Oct.-Nov. 2018, pp. 5016-5026, doi: 10.18653/v1/D18-1547.
[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[9] K. Cho et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[10] J. D. Williams, M. Henderson, A. Raux, B. Thomson, A. Black, and D. Ramachandran, "The dialog state tracking challenge series," AI Magazine, vol. 35, no. 4, pp. 121-124, 2014.
[11] Z. Zhang, R. Takanobu, Q. Zhu, M. Huang, and X. Zhu, "Recent advances and challenges in task-oriented dialog systems," Science China Technological Sciences, vol. 63, no. 10, pp. 2011-2027, 2020.
[12] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, "Language models are unsupervised multitask learners," OpenAI blog, vol. 1, no. 8, p. 9, 2019.
[13] Y. Zhang, Z. Ou, and Z. Yu, "Task-oriented dialog systems that consider multiple appropriate responses under the same context," in Proceedings of the AAAI Conference on Artificial Intelligence, 2020, vol. 34, no. 05, pp. 9604-9611.
[14] T. Nekvinda and O. Dušek, "Shades of BLEU, flavours of success: The case of MultiWOZ," arXiv preprint arXiv:2106.05555, 2021.
[15] H. Lee, S. Jo, H. Kim, S. Jung, and T.-Y. Kim, "SUMBT+LaRL: Effective multi-domain end-to-end neural task-oriented dialog system," IEEE Access, vol. 9, pp. 116133-116146, 2021.
[16] Y. Park, S. Kang, and J. Seo, "An efficient framework for development of task-oriented dialog systems in a smart home environment," Sensors, vol. 18, no. 5, p. 1581, 2018.
[17] T. Brown et al., "Language models are few-shot learners," Advances in Neural Information Processing Systems, vol. 33, pp. 1877-1901, 2020.
[18] T. Li et al., "A short study on compressing decoder-based language models," arXiv preprint arXiv:2110.08460, 2021.
[19] Y. Bengio, R. Ducharme, and P. Vincent, "A neural probabilistic language model," Advances in Neural Information Processing Systems, vol. 13, 2000.
[20] B. Seidlhofer, "English as a lingua franca," Oxford, 2005.
[21] T.-H. Wen et al., "A network-based end-to-end trainable task-oriented dialogue system," arXiv preprint arXiv:1604.04562, 2016.
[22] D. B. Abeywickrama, N. Bicocchi, and F. Zambonelli, "SOTA: Towards a general model for self-adaptive systems," in 2012 IEEE 21st International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, 2012, pp. 48-53.
[23] J. Masche and N.-T. Le, "A review of technologies for conversational systems," in International Conference on Computer Science, Applied Mathematics and Applications, 2017, pp. 212-225.
[24] R. Takanobu, H. Zhu, and M. Huang, "Guided dialog policy learning: Reward estimation for multi-domain task-oriented dialog," arXiv preprint arXiv:1908.10719, 2019.
[25] J. Allen, Natural Language Understanding. Benjamin-Cummings Publishing Co., Inc., 1995.
[26] S. Afzal and P. Robinson, "Natural affect data—Collection & annotation in a learning context," in 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, 2009, pp. 1-7.
[27] I. Glaser, S. Sadegharmaki, B. Komboz, and F. Matthes, "Data scarcity: Methods to improve the quality of text classification," in ICPRAM, 2021, pp. 556-564.
[28] P. Budzianowski and I. Vulić, "Hello, it's GPT-2 - how can I help you? Towards the use of pretrained language models for task-oriented dialogue systems," arXiv preprint arXiv:1907.05774, 2019.
[29] E. Hosseini-Asl, B. McCann, C.-S. Wu, S. Yavuz, and R. Socher, "A simple language model for task-oriented dialogue," Advances in Neural Information Processing Systems, vol. 33, pp. 20179-20191, 2020.
[30] Y. Wu, W. Wu, C. Xing, M. Zhou, and Z. Li, "Sequential matching network: A new architecture for multi-turn response selection in retrieval-based chatbots," arXiv preprint arXiv:1612.01627, 2016.
[31] Z. Zhang, J. Li, P. Zhu, H. Zhao, and G. Liu, "Modeling multi-turn conversation with deep utterance aggregation," arXiv preprint arXiv:1806.09102, 2018.
[32] R. Lowe, N. Pow, I. Serban, and J. Pineau, "The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems," arXiv preprint arXiv:1506.08909, 2015.
[33] Z. Lin, A. Madotto, G. I. Winata, and P. Fung, "MinTL: Minimalist transfer learning for task-oriented dialogue systems," arXiv preprint arXiv:2009.12005, 2020.
[34] T.-H. Wen, M. Gasic, N. Mrksic, P.-H. Su, D. Vandyke, and S. Young, "Semantically conditioned LSTM-based natural language generation for spoken dialogue systems," arXiv preprint arXiv:1508.01745, 2015.
[35] N. Lubis et al., "LAVA: Latent action spaces via variational auto-encoding for dialogue policy optimization," arXiv preprint arXiv:2011.09378, 2020.
[36] T. Wolf et al., "HuggingFace's Transformers: State-of-the-art natural language processing," arXiv preprint arXiv:1910.03771, 2019.
[37] C. Raffel et al., "Exploring the limits of transfer learning with a unified text-to-text transformer," Journal of Machine Learning Research, vol. 21, no. 140, pp. 1-67, 2020.
[38] M. Lewis et al., "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension," arXiv preprint arXiv:1910.13461, 2019.
[39] S. Mehri, T. Srinivasan, and M. Eskenazi, "Structured fusion networks for dialog," arXiv preprint arXiv:1907.10016, 2019.
[40] S. Sánchez, "GPT-J, an open-source alternative to GPT-3," Narrativa, 2022. [Online]. Available: https://www.narrativa.com/gpt-j-an-open-source-alternative-to-gpt-3/
[41] EleutherAI, "ELEUTHERAI/GPT-Neo: An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library," GitHub, 2022. [Online]. Available: https://github.com/EleutherAI/gpt-neo
[42] A. Romero, "GPT-4 is coming soon. Here's what we know about it," Medium, 11-Nov-2022. [Online]. Available: https://towardsdatascience.com/gpt-4-is-coming-soon-heres-what-we-know-about-it-64db058cfd45
[43] J. Kulhánek, V. Hudeček, T. Nekvinda, and O. Dušek, "AuGPT: Auxiliary tasks and data augmentation for end-to-end dialogue with pre-trained language models," in Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI, 2021. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83074 | - |
dc.description.abstract | 任務導向對話系統(Task-Oriented Dialog System)在無數行業中都有巨大的需求。可以大大降低客戶服務人員的管理費用並簡化人力資源流程。許多 TOD 系統皆使用 GPT-2 模型作為基底,但這些系統沒有考慮多輪對話的關鍵考慮因素--對話回合的位置。本論文 PUB-R 為最近的端到端任務對話模型引入了一種新的嵌入輸入方法:“基於回合的定位嵌入方法”(TPEM)。實驗結果顯示:(1) 與之前的 SOTA 模型相比,PUB-R 可以在更短的訓練時間內獲得更好的訓練性能;(2) 可以透過減少訓練所需的 epoch 數而不影響性能來實現模型的快速收斂。我們的實驗成功地改進了以前的“基於回合的定位”的端到端對話系統,在端對端對話評估方法上取得更高的分數,訓練時間更短,並且不需要額外的手動註釋。 | zh_TW |
dc.description.abstract | In high demand across countless industries, Task-Oriented Dialog (TOD) systems can greatly reduce the overhead costs of customer service personnel and simplify human resource processes. The GPT-2 model underlies a variety of TOD systems, but these systems do not take into account the turn number, a critical consideration in multi-turn dialogs. Our paper, PUB-R, introduces a new embedding input method for recent end-to-end task-oriented dialog models: the "Turn-Based Positioning Embedding Method" (TPEM). Our results show that (1) PUB-R obtains better performance with shorter training time than previous SOTA models, and (2) rapid model convergence can be achieved by reducing the number of training epochs required without compromising performance. With "turn-based positioning", our implementation improves upon previous end-to-end dialog systems, achieving higher end-to-end evaluation scores with shorter training times and without requiring additional manual annotation. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-01-06T17:03:01Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-01-06T17:03:01Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | 口試委員會審定書 i
摘要 ii
Abstract iii
Contents iv
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Transformer and GPT-2 5
2.1.1 Transformer 5
2.1.2 GPT-2 Model 8
2.2 Task-oriented Dialog System 9
2.2.1 MultiWOZ Dataset 9
2.2.2 End-to-end Methods 10
2.2.2.1 SimpleTOD 10
2.2.2.2 Soloist 12
2.2.2.3 MinTL 14
2.2.2.4 UBAR 14
Chapter 3 Method 16
3.1 Training Task-oriented Dialog Model on Session Level 16
3.2 Domain-Adaptive Pre-processing 17
3.3 Architecture and Training Objective 19
3.4 Inferencing 20
Chapter 4 Experiments 22
4.1 Dataset 22
4.2 Implemented Details 23
4.3 Evaluation Metrics 23
4.4 Baseline Models 24
4.5 Experiment Results 25
Chapter 5 Discussion & Analysis 27
5.1 Representation of Turn Positioning 27
5.2 Convergence Speed of Fine-tuned Pre-trained Model 30
Chapter 6 Conclusion 33
6.1 Major Findings 33
6.2 Research Contribution 33
6.3 Implications for Researchers and Practitioners 33
6.4 Future Work 34
BIBLIOGRAPHY 35 | - |
dc.language.iso | en | - |
dc.title | PUB-R:基於回合定位的端到端任務對話系統 | zh_TW |
dc.title | PUB-R: End-to-End TOD System via Turn-Based Positioning | en |
dc.title.alternative | PUB-R: End-to-End TOD System via Turn-Based Positioning | - |
dc.type | Thesis | - |
dc.date.schoolyear | 111-1 | - |
dc.description.degree | 碩士 | - |
dc.contributor.coadvisor | 戴敏育 | zh_TW |
dc.contributor.coadvisor | Min-Yuh Day | en |
dc.contributor.oralexamcommittee | 孫瑞鴻;林正偉 | zh_TW |
dc.contributor.oralexamcommittee | Ray-Hon Sun;Jeng-Wei Lin | en |
dc.subject.keyword | 自然語言處理,任務導向對話系統,多輪對話, | zh_TW |
dc.subject.keyword | Natural Language Processing,Task-oriented dialog system,Multi-turn dialog, | en |
dc.relation.page | 41 | - |
dc.identifier.doi | 10.6342/NTU202210113 | - |
dc.rights.note | 未授權 | - |
dc.date.accepted | 2022-12-14 | - |
dc.contributor.author-college | 電機資訊學院 | - |
dc.contributor.author-dept | 資訊網路與多媒體研究所 | - |
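The abstract above describes the "Turn-Based Positioning Embedding Method" (TPEM) only at a high level: a GPT-2-based end-to-end TOD model whose inputs additionally encode which dialog turn each token belongs to. The following is a minimal, hedged sketch of what such a turn-based positioning embedding could look like on top of a standard GPT-2 language model; it is not the thesis's actual implementation, and names such as `TurnAwareGPT2`, `turn_ids`, and `max_turns` are illustrative assumptions only.

```python
# Illustrative sketch (NOT the thesis code): augment GPT-2 token embeddings
# with a learnable per-turn embedding, so tokens from different dialog turns
# receive different "turn position" signals in a session-level sequence.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class TurnAwareGPT2(nn.Module):
    """GPT-2 whose input embeddings are summed with a per-token turn index embedding."""

    def __init__(self, model_name: str = "gpt2", max_turns: int = 32):
        super().__init__()
        self.gpt2 = GPT2LMHeadModel.from_pretrained(model_name)
        hidden = self.gpt2.config.n_embd
        # One learnable vector per dialog turn, analogous to token-level
        # positional embeddings but indexed by turn number.
        self.turn_embeddings = nn.Embedding(max_turns, hidden)

    def forward(self, input_ids, turn_ids, labels=None, attention_mask=None):
        # Standard token embeddings from GPT-2's embedding table.
        token_embeds = self.gpt2.transformer.wte(input_ids)
        # Add the turn embedding of the turn each token belongs to.
        inputs_embeds = token_embeds + self.turn_embeddings(turn_ids)
        # GPT-2 still adds its own token-level positional embeddings internally.
        return self.gpt2(inputs_embeds=inputs_embeds,
                         attention_mask=attention_mask,
                         labels=labels)


# Usage sketch: when flattening a multi-turn session into one training
# sequence, assign every token of turn t the turn index t.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TurnAwareGPT2()
turn_texts = ["i need a cheap hotel .", "what area would you like ?"]
input_ids, turn_ids = [], []
for t, text in enumerate(turn_texts):
    ids = tokenizer.encode(text)
    input_ids.extend(ids)
    turn_ids.extend([t] * len(ids))
input_ids = torch.tensor([input_ids])
turn_ids = torch.tensor([turn_ids])
out = model(input_ids, turn_ids, labels=input_ids)
print(out.loss)
```

Under this sketch's assumptions, the turn embedding is simply summed with the token embedding before the transformer stack, so no extra annotation is needed beyond the turn boundaries already present in session-level TOD training data.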
Appears in Collections: | 資訊網路與多媒體研究所
Files in this item:
File | Size | Format | |
---|---|---|---|
U0001-1013221206214016.pdf (access currently restricted) | 10.25 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.