Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89744

Full metadata record
DC field: value (language)
dc.contributor.advisor: 馬偉雲 (zh_TW)
dc.contributor.advisor: Wei-Yun Ma (en)
dc.contributor.author: 楊奈其 (zh_TW)
dc.contributor.author: Nai-Chi Yang (en)
dc.date.accessioned: 2023-09-20T16:11:42Z
dc.date.available: 2025-01-01
dc.date.copyright: 2023-09-20
dc.date.issued: 2023
dc.date.submitted: 2023-08-02
dc.identifier.citationZ. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. Le, and R. Salakhutdinov. TransformerXL: Attentive language models beyond a fixed-length context. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2978–2988, Florence, Italy, July 2019. Association for Computational Linguistics.
S. Dathathri, A. Madotto, J. Lan, J. Hung, E. Frank, P. Molino, J. Yosinski, and R. Liu. Plug and play language models: A simple approach to controlled text generation. In International Conference on Learning Representations, 2020.
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics.
M. Ghazvininejad, X. Shi, J. Priyadarshi, and K. Knight. Hafez: an interactive poetry generation system. In Proceedings of ACL 2017, System Demonstrations, pages 43– 48, Vancouver, Canada, July 2017. Association for Computational Linguistics.
S. Gururangan, A. Marasović, S. Swayamdipta, K. Lo, I. Beltagy, D. Downey, and N. A. Smith. Don’t stop pretraining: Adapt language models to domains and tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8342–8360, Online, July 2020. Association for Computational Linguistics.
A. Holtzman, J. Buys, M. Forbes, A. Bosselut, D. Golub, and Y. Choi. Learning to write with cooperative discriminators, 2018.
E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen. Lora: Low-rank adaptation of large language models, 2021.
M. Khalifa, H. Elsahar, and M. Dymetman. A distributional approach to controlled text generation, 2021.
B. Krause, A. D. Gotmare, B. McCann, N. S. Keskar, S. Joty, R. Socher, and N. F. Rajani. GeDi: Generative discriminator guided sequence generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4929–4952, Punta Cana, Dominican Republic, Nov. 2021. Association for Computational Linguistics.
X. L. Li and P. Liang. Prefix-tuning: Optimizing continuous prompts for generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4582–4597, Online, Aug. 2021. Association for Computational Linguistics.
A. Liu, M. Sap, X. Lu, S. Swayamdipta, C. Bhagavatula, N. A. Smith, and Y. Choi. DExperts: Decoding-time controlled text generation with experts and anti-experts. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6691–6706, Online, Aug. 2021. Association for Computational Linguistics.
R. Liu, G. Xu, C. Jia, W. Ma, L. Wang, and S. Vosoughi. Data boost: Text data augmentation through reinforcement learning guided conditional generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 9031–9041, Online, Nov. 2020. Association for Computational Linguistics.
X. Liu, L. Mou, F. Meng, H. Zhou, J. Zhou, and S. Song. Unsupervised paraphrasing by simulated annealing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 302–312, Online, July 2020. Association for Computational Linguistics.
F. Mireshghallah, K. Goyal, and T. Berg-Kirkpatrick. Mix and match: Learning-free controllable text generationusing energy language models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 401–415, Dublin, Ireland, May 2022. Association for Computational Linguistics.
L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller, M. Simens, A. Askell, P. Welinder, P. Christiano, J. Leike, and R. Lowe. Training language models to follow instructions with human feedback, 2022.
A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, et al. Improving language understanding by generative pre-training. 2018.
A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
K. Yang and D. Klein. FUDGE: Controlled text generation with future discriminators. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3511–3535, Online, June 2021. Association for Computational Linguistics.
L. Yu, W. Zhang, J. Wang, and Y. Yu. Seqgan: Sequence generative adversarial nets with policy gradient, 2017.
H. Zhang and D. Song. DisCup: Discriminator cooperative unlikelihood prompttuning for controllable text generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3392–3406, Abu Dhabi, United Arab Emirates, Dec. 2022. Association for Computational Linguistics.
D. M. Ziegler, N. Stiennon, J. Wu, T. B. Brown, A. Radford, D. Amodei, P. Christiano, and G. Irving. Fine-tuning language models from human preferences, 2020.
-
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89744
dc.description.abstract: 大型預訓練語言模型(LLMs)在海量數據的訓練中展示出無與倫比的能力,已經能夠生成與人類極為相似的文本。然而,在不進行微調或增加額外參數的條件下生成符合特定條件的文本,仍然是一個具有挑戰性的任務。

目前避免修改語言模型的策略,主要使用 prompts 或外加的分類器。這些分類器被開發用於決定或預測生成的 token 是否有助於達成所需目標。這些方法通過利用所需屬性的預測分數計算梯度,從而在推理階段改變下一個 token 的輸出分佈。然而,這些分類器模型通常需要使用語言模型的潛在狀態為輸入,這阻礙了使用許多現成的黑盒模型或工具。

為了克服這些限制,我們提出了外掛式語言模型(PiLM)作為解決方案。PiLM 利用強化學習直接使用黑盒工具協助調整潛在狀態來達成控制文本生成。同時我們訓練一個簡單的回歸模型取代反向傳播梯度這一緩慢的過程,使 PiLM 幾乎不會增加生成文本所需時間成本。通過在三種控制生成任務上的驗證,我們的方法展示出優於現有的基於梯度更新、加權解碼或使用 prompts 的方法的成果。
(zh_TW)
dc.description.abstract: Large-scale pre-trained language models (LLMs), trained on massive datasets, display an unrivaled capacity to generate text that closely resembles human writing. Nevertheless, generating text that adheres to specific conditions, without fine-tuning or adding new parameters, remains a challenging task.

Current strategies that avoid modifying the language model typically use either prompts or an auxiliary attribute classifier/predictor. Such classifiers are trained to determine or predict whether a generated token helps satisfy the desired attribute. These methods manipulate the token output distribution at inference time by using the predicted attribute score to compute gradients. However, these classifiers usually require the LLM's latent states as inputs, which precludes the use of many off-the-shelf black-box attribute models or tools.

To address these limitations, we present the Plug-in Language Model (PiLM). PiLM leverages reinforcement learning to directly use black-box tools to help adjust the latent state for controlled text generation. Furthermore, by replacing the slow process of backpropagation with a simple regression model, PiLM achieves inference time comparable to the original LLM. Validated on three controlled generation tasks, our approach outperforms existing state-of-the-art methods based on gradient updates, weighted decoding, or prompts.
(en)
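The idea summarized in the abstract, shifting the LM's latent state with a learned regressor instead of backpropagating through an attribute classifier, can be illustrated with a toy sketch. Everything below is a hypothetical stand-in, not the thesis's actual implementation: `attribute_score` plays the role of a black-box attribute tool, and a linear regressor is fit offline to imitate a score-raising gradient step so that no backpropagation is needed at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 16

# Hypothetical black-box attribute scorer (stands in for an external tool).
# It is secretly linear here so that a score-raising shift is well defined.
attr_dir = rng.normal(size=HIDDEN)

def attribute_score(h):
    """Higher means the state is closer to the desired attribute."""
    return float(attr_dir @ h)

# Offline: fit a simple linear regressor (with a bias column) that maps a
# hidden state to a shift imitating a small gradient step on the score.
X = rng.normal(size=(200, HIDDEN))
X_aug = np.hstack([X, np.ones((200, 1))])      # add bias column
Y = np.tile(0.1 * attr_dir, (200, 1))          # "gradient step" targets
W, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)

def steer(h):
    """Shift a latent state using the regressor; no backprop at inference."""
    return h + np.append(h, 1.0) @ W

h = rng.normal(size=HIDDEN)
print(attribute_score(steer(h)) > attribute_score(h))  # True: score rises
```

In the paper's setting the regression targets would come from real gradient updates or reinforcement-learning rollouts against the black-box tool; the sketch only shows why a cheap regressor can replace the per-step backward pass.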
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-20T16:11:42Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2023-09-20T16:11:42Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Contents
口試委員審定書 i
摘要 iii
Abstract v
Contents vii
List of Figures ix
List of Tables x
1 Introduction 1
2 Related Work 5
3 Model Frameworks 7
3.1 Reinforcement Learning 8
3.2 Controller 10
4 Experiment 13
4.1 Sentiment Control 14
4.2 Topic Control 17
4.3 Language Detoxification 20
4.4 Comparison of three tasks 22
4.5 Inference speed 22
5 Analysis 23
5.1 Longer future effect 23
5.2 Trends in control attribute 24
5.3 Reset hidden 24
5.4 Dynamic M 25
6 Conclusion 27
References 29
Appendix A — Hyperparameters 33
A.1 PiLM-RL 33
A.2 Controller 33
Appendix B — Prefix set 35
B.1 Sentiment prefixes 35
B.2 Topic prefixes 35
Appendix C — Wordlists 37
C.1 topic wordlists 37
C.2 heldout wordlists 40
Appendix D — Generated outputs 45
D.1 Sentiment Control 45
D.2 Topic Control 47
D.3 Language Detoxification 50

List of Figures
3.1 Overview of PiLM-RL. 8
3.2 Overview of PiLM-Controller. 10
3.3 Schematic diagram of hidden space. 11
5.1 Controller loss. 24

List of Tables
4.1 The result of Sentiment Control. 15
4.2 The result of Topic Control. 18
4.3 The result of Topic Control with prompts. 18
4.4 The result of Language Detoxification. 21
4.5 Comparison of inference speed. 22
5.1 Comparison of different n. 23
5.2 Result of reset hidden. 25
5.3 Result of Dynamic M. 25
A.1 Hyperparameter for PiLM-RL. 33
A.2 Hyperparameter for Controller. 33
D.3 Outputs of sentiment control. 45
D.4 Outputs of sentiment control. 46
D.5 Outputs of topic control. 47
D.6 Outputs of topic control with prompts. 48
D.7 Failure outputs of topic control with prompts. 49
D.8 Outputs of Language detoxification. 50
dc.language.iso: en
dc.subject: 控制生成 (zh_TW)
dc.subject: 深度學習 (zh_TW)
dc.subject: 機器學習 (zh_TW)
dc.subject: 強化學習 (zh_TW)
dc.subject: 預訓練語言模型 (zh_TW)
dc.subject: 自然語言生成 (zh_TW)
dc.subject: Controlled Generation (en)
dc.subject: Natural Language Generation (en)
dc.subject: Pre-trained Language Model (en)
dc.subject: Reinforcement Learning (en)
dc.subject: Machine Learning (en)
dc.subject: Deep Learning (en)
dc.title: 外掛式語言模型:利用一個簡單的迴歸模型控制文本生成 (zh_TW)
dc.title: Plug-in Language Model: Controlling Text Generation with a Simple Regression Model (en)
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: 碩士
dc.contributor.coadvisor: 鄭卜壬 (zh_TW)
dc.contributor.coadvisor: Pu-Jen Cheng (en)
dc.contributor.oralexamcommittee: 張嘉惠;陳縕儂 (zh_TW)
dc.contributor.oralexamcommittee: Chia-Hui Chang;Yun-Nung Chen (en)
dc.subject.keyword: 控制生成,自然語言生成,預訓練語言模型,強化學習,機器學習,深度學習 (zh_TW)
dc.subject.keyword: Controlled Generation,Natural Language Generation,Pre-trained Language Model,Reinforcement Learning,Machine Learning,Deep Learning (en)
dc.relation.page: 50
dc.identifier.doi: 10.6342/NTU202301380
dc.rights.note: 同意授權(全球公開)
dc.date.accepted: 2023-08-04
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 資料科學學位學程
dc.date.embargo-lift: 2025-01-01
Appears in collections: 資料科學學位學程

Files in this item:
File | Size | Format
ntu-111-2.pdf | 687.71 kB | Adobe PDF

