NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98548
Full metadata record (DC field: value (language)):

dc.contributor.advisor: 黃從仁 (zh_TW)
dc.contributor.advisor: Tsung-Ren Huang (en)
dc.contributor.author: 林佐竺 (zh_TW)
dc.contributor.author: Zuo-Zhu Lin (en)
dc.date.accessioned: 2025-08-18T00:50:01Z
dc.date.available: 2025-08-18
dc.date.copyright: 2025-08-15
dc.date.issued: 2025
dc.date.submitted: 2025-08-01
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98548
dc.description.abstract: The default ideological biases of large language models threaten the objectivity of information, yet traditional methods for steering a model's stance are often costly and vulnerable to noise in the data, making them difficult to put into practice. This study is the first to propose a method based on "free association" from cognitive psychology: the model itself generates keywords, which are then used to select and construct the IdeoFA dataset at low cost and with high efficiency. Experiments show that the IdeoFA dataset effectively improves a model's ideological steerability and generalizes across topics. The method overcomes the drawbacks of traditional approaches and offers an immediately practicable solution for precise control of LLMs. (zh_TW)
dc.description.abstract: Default ideological biases in large language models (LLMs) threaten objectivity, yet traditional steering methods are expensive and sensitive to dataset noise, limiting real-world use. We introduce a free-association approach that prompts the model to generate keywords, constructing the IdeoFA dataset at minimal cost. Experimental evidence shows that IdeoFA significantly enhances the robustness of ideological controllability and generalises across topics, thereby offering an immediately deployable solution for precise LLM governance. (en)
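The abstracts describe the pipeline only at a high level: an LLM free-associates keywords for a topic, and those keywords are then used to select instruction-tuning examples. As a minimal sketch of one plausible reading of that selection step, the snippet below scores each candidate example by its mean cosine similarity to the keyword set and keeps the top-k. It is an illustration, not the thesis's implementation; the encoder, keyword list, and candidate pool are placeholder assumptions.

    # Hypothetical sketch of keyword-driven data selection via embedding
    # similarity; model, keywords, and candidates are illustrative only.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    # Free-association keywords an LLM might emit for a topic (placeholders).
    keywords = ["regulation", "market", "welfare", "liberty", "taxation"]

    # Candidate instruction-tuning examples to filter (placeholders).
    candidates = [
        "Explain the trade-offs of a universal basic income.",
        "Summarize the plot of a popular fantasy novel.",
        "Argue for and against stricter financial regulation.",
    ]

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

    # Normalized embeddings turn cosine similarity into a dot product.
    kw_emb = encoder.encode(keywords, normalize_embeddings=True)
    cand_emb = encoder.encode(candidates, normalize_embeddings=True)

    # Score each candidate by mean similarity to the whole keyword set,
    # then keep the top-k highest-scoring examples as training data.
    scores = (cand_emb @ kw_emb.T).mean(axis=1)
    top_k = 2
    selected = [candidates[i] for i in np.argsort(-scores)[:top_k]]
    print(selected)

Under this reading, ranking by per-keyword mean similarity would correspond to the "multi-word similarity" ranking named in the table of contents, while aggregating the keywords into a single embedding before scoring would correspond to the sentence-embedding aggregation variant.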
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-18T00:50:01Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2025-08-18T00:50:01Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Acknowledgements i
Abstract (Chinese) iii
Abstract (English) v
Table of Contents vii
List of Figures xi
List of Tables xiii
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 Ideology in Language Models 3
2.2 Training Data Selection 3
2.3 Free Association 4
Chapter 3 Methodology 5
3.1 Generating Free-Association Word Sets 5
3.2 Dataset Selection 6
3.2.1 Keyword Filtering 6
3.2.2 Ranking Selection: Sentence-Embedding Aggregation 7
3.2.3 Ranking Selection: Multi-Word Similarity 8
3.2.4 Model-Scored Filtering 8
3.3 Evaluation Metrics 8
3.4 Constructing the IdeoFA Dataset 9
Chapter 4 Experimental Procedure 11
4.1 Experimental Setup 11
4.2 Measuring Model Bias 11
4.3 Finding the Best Data Selection Method 12
4.4 Comparing IdeoFA and IdeoINST 14
Chapter 5 Experimental Results 15
5.1 Free-Association Word Set Generation Results 15
5.2 Performance Comparison of Data Selection Methods 15
5.3 Comparing the Effects of IdeoFA and IdeoINST 18
Chapter 6 Conclusion and Discussion 21
6.1 Conclusion 21
6.2 Limitations and Future Work 21
References 23
Appendix A: Prompt Templates 31
A.1 Generating Free-Association Words (System Prompt) 31
A.2 Last-Token Pooling (see the sketch after this list) 31
A.3 Model-Judged Similarity (System Prompt) 32
A.4 Generating IdeoFA Instructions 32
A.5 Generating IdeoFA Biases (System Prompt) 35
A.6 Generating Responses (System Prompt) 35
A.7 Generating Labels 35
A.8 Evaluating Degree of Bias (System Prompt) 36
Appendix B: Algorithms 37
B.1 Embedding-Based Selection 37
Appendix C: IdeoFA 39
C.1 Examples 39
C.2 Ideological Steering Results After Training on 1,000 Random Samples 40
Appendix D: Hyperparameter Settings 43
D.1 Model and Training Parameter Settings 43
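Appendix A.2 names last-token pooling as the way sentence embeddings are extracted from a decoder-only model. The sketch below is a minimal illustration of that general technique, assuming a Hugging Face causal LM with right-side padding; the model name and input texts are placeholders, not the thesis's actual setup.

    # Minimal sketch of last-token pooling for sentence embeddings from a
    # decoder-only LM; the model and texts are illustrative assumptions.
    import torch
    from transformers import AutoModel, AutoTokenizer

    name = "Qwen/Qwen2.5-0.5B"  # assumed model, not necessarily the thesis's
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)

    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"  # keeps the last real token easy to find

    texts = ["free association", "data selection"]
    batch = tokenizer(texts, padding=True, return_tensors="pt")

    with torch.no_grad():
        hidden = model(**batch).last_hidden_state    # (batch, seq_len, dim)

    # Take the hidden state of each sequence's final non-padding token.
    last_idx = batch["attention_mask"].sum(dim=1) - 1        # (batch,)
    embeddings = hidden[torch.arange(hidden.size(0)), last_idx]
    print(embeddings.shape)  # (batch, dim)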
dc.language.iso: zh_TW
dc.subject: Ideological bias (zh_TW)
dc.subject: Large language models (zh_TW)
dc.subject: Free association (zh_TW)
dc.subject: Data selection (zh_TW)
dc.subject: Fine-tuning (zh_TW)
dc.subject: Fine-Tuning (en)
dc.subject: Data Selection (en)
dc.subject: Free Association (en)
dc.subject: Ideological Bias (en)
dc.subject: LLMs (en)
dc.title: 有限樣本在語言模型中的高效價值轉換 (zh_TW)
dc.title: Efficient Value Transformation in Language Models Using Limited Training Samples (en)
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 蔡宗翰;古倫維 (zh_TW)
dc.contributor.oralexamcommittee: Tzong-Han Tsai;Lun-Wei Ku (en)
dc.subject.keyword: Large language models, ideological bias, fine-tuning, data selection, free association (zh_TW)
dc.subject.keyword: LLMs, Ideological Bias, Fine-Tuning, Data Selection, Free Association (en)
dc.relation.page: 43
dc.identifier.doi: 10.6342/NTU202501741
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2025-08-06
dc.contributor.author-college: Center for General Education (共同教育中心)
dc.contributor.author-dept: Master's Program in Statistics (統計碩士學位學程)
dc.date.embargo-lift: 2025-08-18
Appears in Collections: Master's Program in Statistics

Files in This Item:
ntu-113-2.pdf (5.17 MB, Adobe PDF)


Except where otherwise noted, all items in this repository are protected by copyright, with all rights reserved.
