  1. NTU Theses and Dissertations Repository
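  2. College of Electrical Engineering and Computer Science
  3. Department of Computer Science and Information Engineering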
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97161
Title: Jailbreaking with Universal Multi-Prompts (利用通用多重提示下的越獄攻擊)
Authors: Yu-Ling Hsu (許郁翎)
Advisor: Shang-Tse Chen (陳尚澤)
Keywords: Jailbreak, Beam Search, Large Language Model, Natural Language Processing, Deep Learning
Publication Year: 2025
Degree: Master's
Abstract: Large language models (LLMs) have developed rapidly in recent years, transforming a wide range of applications and significantly improving convenience and productivity. Alongside these capabilities, however, ethical concerns and new types of attacks, such as jailbreaking, have emerged. While many existing studies focus on individual attack strategies because of their simplicity and flexibility, there is limited research on universal approaches, which seek to find generalizable checkpoints to optimize across datasets and improve transferability to unseen data. In this paper, we introduce JUMP, a method designed to discover adversarial multi-prompts in a universal setting. We also adapt our approach for defense, which we term DUMP. Experimental results show that our method for optimizing universal multi-prompts surpasses existing techniques and achieves a high attack success rate while keeping prompt readability under control.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97161
DOI: 10.6342/NTU202500547
Fulltext Rights: Authorized (open access worldwide)
Embargo Lift Date: 2025-02-28
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File            Size      Format
ntu-113-1.pdf   1.46 MB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

© NTU Library All Rights Reserved