NTU Theses and Dissertations Repository › College of Electrical Engineering and Computer Science › Data Science Degree Program
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89744
Title: 外掛式語言模型:利用一個簡單的迴歸模型控制文本生成
Plug-in Language Model: Controlling Text Generation with a Simple Regression Model
Authors: 楊奈其
Nai-Chi Yang
Advisor: 馬偉雲
Wei-Yun Ma
Co-Advisor: 鄭卜壬
Pu-Jen Cheng
Keyword: Controlled Generation, Natural Language Generation, Pre-trained Language Model, Reinforcement Learning, Machine Learning, Deep Learning
Publication Year : 2023
Degree: Master's
Abstract: Large language models (LLMs), pre-trained on massive datasets, have displayed an unrivaled capacity to generate text that closely resembles human writing. Nevertheless, generating text that adheres to specific conditions, without fine-tuning or adding new parameters, remains a challenging task.

Current strategies that avoid modifying the language model typically use either prompts or an auxiliary attribute classifier/predictor. These classifiers are trained to determine or predict whether a generated token helps satisfy the desired attribute. At inference time, such methods use the prediction score of the required attribute to compute gradients that shift the output distribution over the next token. However, these classifiers usually take the language model's latent states as inputs, which precludes the use of many off-the-shelf black-box attribute models or tools.
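The gradient-guided steering described above can be sketched with a toy linear model. This is a minimal illustration under assumed names and shapes (`W_out`, `w_attr`, `step`), not the thesis implementation: a linear "LM head" maps a latent state to vocabulary logits, a linear attribute classifier scores the latent, and the latent is shifted along the gradient of the attribute score before decoding.

```python
import numpy as np

# Toy sketch of classifier-guided latent steering. All names and shapes
# here are illustrative assumptions, not the thesis code.
rng = np.random.default_rng(0)
V, d = 8, 4                      # toy vocab size and hidden dimension
W_out = rng.normal(size=(V, d))  # LM output projection: latent -> logits
w_attr = rng.normal(size=d)      # linear attribute classifier on the latent

h = rng.normal(size=d)           # current latent state
step = 0.5                       # steering strength

# For a linear classifier, the gradient of the attribute score
# (w_attr @ h) with respect to h is simply w_attr.
h_steered = h + step * w_attr

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

p_before = softmax(W_out @ h)         # next-token distribution, unsteered
p_after = softmax(W_out @ h_steered)  # distribution after the latent shift

# The shift strictly increases the attribute score.
assert w_attr @ h_steered > w_attr @ h
```

In a real system the classifier is nonlinear and the gradient is obtained by backpropagation through it at every decoding step, which is exactly the cost the thesis targets.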

To address these limitations, we present the Plug-in Language Model (PiLM). PiLM leverages reinforcement learning to use black-box tools directly, guiding the adjustment of the latent state for controlled text generation. Furthermore, by replacing the slow process of backpropagation with a simple regression model, PiLM achieves inference time comparable to the original LLM. Validated on three controlled generation tasks, our approach demonstrates performance superior to existing state-of-the-art methods based on gradient updates, weighted decoding, or prompting.
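The idea of replacing per-step backpropagation with a learned regressor can be sketched as follows. This is an assumed toy setup, not the thesis code: offline, we collect pairs of latent states and the shifts that gradient steering would produce (here, against a linear attribute classifier `w_attr`), then fit a least-squares regression so that inference needs only one matrix multiply.

```python
import numpy as np

# Illustrative sketch of learning a regression model that predicts the
# latent-state shift, so no backpropagation is needed at inference time.
# The linear classifier and shapes are assumptions for the toy example.
rng = np.random.default_rng(1)
d, n = 4, 200
w_attr = rng.normal(size=d)   # linear attribute classifier (assumption)
step = 0.5

# Offline data: (latent, gradient-shift) pairs. For a linear classifier
# the gradient-based shift is step * w_attr for every latent state.
H = rng.normal(size=(n, d))
D = np.tile(step * w_attr, (n, 1))

# Fit a linear regression h -> delta by least squares (bias column added).
X = np.hstack([H, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(X, D, rcond=None)

# Inference: a single matrix multiply replaces iterative backpropagation.
h = rng.normal(size=d)
delta = np.hstack([h, 1.0]) @ coef
h_steered = h + delta
```

In this linear toy case the regressor recovers the gradient shift exactly; the point of the design is that the same one-pass prediction works even when the attribute scorer is a black box that gradients cannot flow through.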
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89744
DOI: 10.6342/NTU202301380
Fulltext Rights: Authorization granted (worldwide open access)
metadata.dc.date.embargo-lift: 2025-01-01
Appears in Collections: Data Science Degree Program (資料科學學位學程)

Files in This Item:
File: ntu-111-2.pdf | Size: 687.71 kB | Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
