Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101148

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳尚澤 | zh_TW |
| dc.contributor.advisor | Shang-Tse Chen | en |
| dc.contributor.author | 蘇軒 | zh_TW |
| dc.contributor.author | Hsuan Su | en |
| dc.date.accessioned | 2025-12-31T16:07:11Z | - |
| dc.date.available | 2026-01-01 | - |
| dc.date.copyright | 2025-12-31 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-12-11 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101148 | - |
| dc.description.abstract | 本論文聚焦於大規模語言模型與自動語音辨識等基礎模型在自然語言與語音處理領域的應用,儘管此類模型在多種任務中展現出顯著成效,當面臨與訓練分佈差異極大的新領域或族群資料時,效能仍易受限。為克服此挑戰,本論文結合合成資料生成與優化技術,無需大量真實目標域資料,即能有效提升模型的穩健性與公平性。
首先,我們提出一個零樣本的自動語音辨識調適流程,透過大規模語言模型生成特定領域文本,再使用文字轉語音系統將其轉換為語音,成功在公開資料集中達成平均 28.7% 的相對詞錯率下降,且完全不依賴任何真實目標域資料。 接著,我們提出「合成轉真實」方法,針對合成與真實資料間的分佈落差進行優化。此方法利用任務向量運算與參數空間調整,系統性地降低合成資料帶來的偏差,並在實際應用中取得最高可達 30% 的效能提升。 最後,我們研發一套具備弱點覺察的合成資料生成策略,能經由互動式測試自動找出模型的偏見與性能缺口,並針對這些弱點進行精準的合成資料生成。此策略顯著降低如性別偏見等問題,同時維持模型的整體表現。 總體而言,本論文所提出的一系列方法展現了在資源受限情況下,如何透過合成資料與優化機制,兼顧效能與公平性地強化基礎模型,進一步推動其在多元且敏感的真實場域中更可靠地落地應用。 | zh_TW |
| dc.description.abstract | Foundation models, including large language models (LLMs) and automatic speech recognition (ASR) systems, have achieved impressive performance across a range of natural language and speech processing tasks. However, their effectiveness often diminishes when confronted with data from novel domains or demographic groups that diverge from their training distributions. This thesis addresses these challenges by integrating synthetic data generation with innovative optimization techniques to enhance model robustness and fairness without relying on extensive real in-domain datasets.
First, we propose a novel zero-shot ASR adaptation pipeline that leverages LLMs to generate domain-specific text prompts, which are subsequently transformed into speech using a Text-to-Speech (TTS) system. This approach significantly improves ASR performance on unseen domains, achieving an average relative word error rate reduction of 28.7% on the SLURP dataset without any real target domain data. Second, we introduce the SYN2REAL framework to mitigate the synthetic-to-real gap—a distributional mismatch between synthetic and real data. By employing task vector arithmetic and parameter-space optimization, SYN2REAL systematically reduces synthetic artifacts, leading to performance improvements of up to 30% in real-world ASR applications. Lastly, we develop a weakness-aware synthetic data generation strategy that iteratively probes foundation models to identify and mitigate biases and performance gaps. This framework refines data sampling to target specific model weaknesses, effectively reducing issues such as gender bias while preserving overall model performance. Collectively, these contributions demonstrate a resource-efficient pathway to enhance the adaptability and fairness of foundation models, paving the way for their more reliable deployment in diverse and sensitive real-world scenarios. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-12-31T16:07:11Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-12-31T16:07:11Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
Acknowledgements iii
摘要 vii
Abstract ix
Contents xi
List of Figures xvii
List of Tables xxi
Chapter 1 Introduction 1
1.1 Motivation 3
1.2 Technical Contributions 4
1.2.1 ASR Data Synthesis Pipeline 4
1.2.2 Synthetic Artifacts Mitigation – Take ASR as an Example 4
1.2.3 Weakness Exploration for Bias Mitigation 5
1.3 Thesis Organization 6
Chapter 2 Background 7
2.1 Synthetic Data for NLP 7
2.1.1 Overview of Synthetic Data in NLP 7
2.1.2 Data Generation Approaches 8
2.1.2.1 Data Augmentation 8
2.1.2.2 Data Synthesis 8
2.1.3 Synthetic Data for Different NLP Tasks 9
2.1.3.1 Mathematical and Scientific Tasks 9
2.1.3.2 Code Generation and Programming 9
2.1.3.3 Medical and Healthcare Applications 10
2.1.3.4 Legal and Professional Text 10
2.1.3.5 General Language Understanding Tasks 11
2.2 Synthetic Data for Automatic Speech Processing 11
2.2.1 Synthetic Data for Speech Recognition 12
2.2.2 Synthetic-to-Real Gap 13
2.2.3 Synthetic Data for Applications other than ASR 14
2.3 Mitigating Bias in LLMs 15
2.3.1 Bias Investigation in Natural Language Generation 15
2.3.2 Bias Mitigation in Natural Language Generation 16
2.4 Domain Adaptation for Automatic Speech Recognition 17
2.4.1 Domain Adaptation to New Acoustic Domain for ASR 17
2.4.2 Domain Adaptation to New Text Domain for ASR 18
2.5 Task Arithmetic and Model Merging 20
2.5.1 Foundations and Basic Principles 20
2.5.2 Advanced Merging Techniques 21
2.5.2.1 Task Vector Approaches 21
2.5.2.2 Weighted Merging Strategies 21
2.5.2.3 Interference Mitigation 22
2.5.2.4 Practical Implementations and Tools 22
2.5.2.5 Applications in Continual Learning 22
Chapter 3 LLM Synthetic Data for Speech Models – Take Automatic Speech Recognition (ASR) as an Example 25
3.1 Synthetic Data to Improve ASR's Robustness on New Text Domains 27
3.2 Introduction 27
3.3 Methodology 29
3.3.1 Text Synthesis with LLMs 29
3.3.2 Controllable Speech Synthesis 31
3.3.3 ASR Model Adaptation 31
3.4 Experimental Setups 32
3.4.1 Dataset 32
3.4.2 Large Language Models 32
3.4.3 Controllable Speech Synthesis (CSS) 33
3.4.4 ASR Model Adaptation 33
3.5 Results and Discussion 34
3.5.1 ASR Adaptation with Synthetic Text Corpus 34
3.5.2 Analysis of ICIF 34
3.6 Summary 37
Chapter 4 Mitigating Synthetic Artifacts When Fine-Tuning Models on Synthetic Data 39
4.1 Synthetic Artifact Mitigation in Automatic Speech Recognition 41
4.2 Introduction 41
4.3 Methodology 43
4.3.1 Domain Mismatch in ASR Domain Adaptation 44
4.3.2 Problem Formulation 44
4.3.3 SYN2REAL Task Vector 44
4.4 Experimental Setups 46
4.4.1 Dataset 47
4.4.2 Text-to-Speech (TTS) Models 47
4.4.3 ASR Models 48
4.5 Results & Discussion 49
4.5.1 What is the efficacy of SYN2REAL task vector? 49
4.5.2 How does SYN2REAL task vector perform across different model sizes? 51
4.5.3 Is SYN2REAL task vector effective on ASR models other than Whisper? 52
4.5.4 Can we form SYN2REAL task vector from other TTS models? 54
4.5.5 What is the impact of the scaling factor λ? 55
4.5.6 Do SYN2REAL task vectors obtained with the same TTS have similar directions? 57
4.6 SYN2REAL Task Vector given Domain Labels 57
4.6.1 Performance of SYN2REAL Ensemble task vector 59
4.6.2 Impact of the number of domains on the performance of SYN2REAL Ensemble task vector 60
4.7 Limitations 61
4.8 Summary 62
Chapter 5 Weakness-aware Synthetic Data Generation Pipeline – Take Gender Fairness as an Example 63
5.1 Synthetic Data for Gender Fairness 65
5.2 Introduction 65
5.3 Related Work 68
5.4 Overview 69
5.4.1 Bias Provocation 70
5.4.2 Bias Mitigation 71
5.5 Bias Provocation Experiments 72
5.5.1 Target LLMs 72
5.5.2 Baselines 73
5.5.3 Experimental Setups 74
5.5.4 Evaluation 76
5.5.5 Results 77
5.6 Bias Mitigation Experiments 77
5.6.1 Experimental Setups 77
5.6.2 Results 78
5.7 Discussion 79
5.7.1 Does the generator produce high-quality test cases? 79
5.7.2 What is the quality of LLMs' responses? 79
5.7.3 Test Cases Analysis: Which words would be easier to provoke LLMs Bias? 81
5.7.4 Does the proposed mitigation method general to other dataset? 82
5.8 Human Evaluation 84
5.9 Summary 85
Chapter 6 Conclusion and Future Works 87
6.1 Conclusion 87
6.2 Future Work 88
6.2.1 Enhancing the Quality of Synthetic Data 88
6.2.2 Adaptive Synthetic Data Generation for ASR 89
6.2.3 Generalizing SYN2REAL to Other AI Applications 89
6.2.4 Interactive Synthetic Data Learning for Bias Mitigation 90
6.2.5 Towards Fully Autonomous AI Training 90
References 93 | - |
| dc.language.iso | en | - |
| dc.subject | 合成數據 | - |
| dc.subject | 大型語言模型 | - |
| dc.subject | 自動語音識別 | - |
| dc.subject | 公平性 | - |
| dc.subject | Synthetic Data | - |
| dc.subject | Large Language Model | - |
| dc.subject | Automatic Speech Recognition | - |
| dc.subject | Fairness | - |
| dc.title | 弭平領域差距與改善模型弱點:運用合成數據提升大型語言與語音模型 | zh_TW |
| dc.title | Bridging Domain Gaps and Fixing Model Weakness: Leveraging Synthetic Data for Large Language and Speech Model Improvement | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 114-1 | - |
| dc.description.degree | 博士 | - |
| dc.contributor.coadvisor | 李宏毅 | zh_TW |
| dc.contributor.coadvisor | Hung-yi Lee | en |
| dc.contributor.oralexamcommittee | Saurav Sahay;孫紹華;陳縕儂;李琳山;陳祝嵩 | zh_TW |
| dc.contributor.oralexamcommittee | Saurav Sahay;Shao-Hua Sun;Yun-Nung Chen;Lin-shan Lee;Chu-Song Chen | en |
| dc.subject.keyword | 合成數據,大型語言模型,自動語音識別,公平性 | zh_TW |
| dc.subject.keyword | Synthetic Data,Large Language Model,Automatic Speech Recognition,Fairness | en |
| dc.relation.page | 117 | - |
| dc.identifier.doi | 10.6342/NTU202504762 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2025-12-15 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
| dc.date.embargo-lift | N/A | - |
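
As a rough illustration of the task-vector arithmetic summarized in the English abstract (and studied in Chapter 4 of the table of contents), the following Python sketch shows how a SYN2REAL-style vector could be formed and applied in parameter space. This is a minimal sketch under assumptions: the state_dict-style weight handling, the helper names `task_vector` and `apply_vector`, the toy layer names, and the scaling factor 0.5 are illustrative and are not taken from the thesis or any released code.

```python
from typing import Dict

import torch


def task_vector(finetuned: Dict[str, torch.Tensor],
                base: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """Task vector: fine-tuned weights minus the shared pre-trained weights."""
    return {name: finetuned[name] - base[name] for name in base}


def apply_vector(weights: Dict[str, torch.Tensor],
                 vector: Dict[str, torch.Tensor],
                 lam: float = 0.5) -> Dict[str, torch.Tensor]:
    """Add a scaled vector to a set of weights: theta + lambda * tau."""
    return {name: weights[name] + lam * vector[name] for name in weights}


if __name__ == "__main__":
    # Toy tensors stand in for model weights (hypothetical layer names).
    names = ["encoder.weight", "decoder.weight"]
    theta_base = {n: torch.zeros(4) for n in names}        # pre-trained ASR model
    theta_real = {n: torch.randn(4) for n in names}        # fine-tuned on real speech (auxiliary domain)
    theta_syn = {n: torch.randn(4) for n in names}         # fine-tuned on TTS speech of the same text
    theta_target_syn = {n: torch.randn(4) for n in names}  # fine-tuned on synthetic target-domain speech

    # (real - base) - (syn - base) collapses to real - syn: a direction pointing
    # from "trained on synthetic audio" toward "trained on real audio".
    real_vec = task_vector(theta_real, theta_base)
    syn_vec = task_vector(theta_syn, theta_base)
    syn2real = {n: real_vec[n] - syn_vec[n] for n in names}

    theta_target = apply_vector(theta_target_syn, syn2real, lam=0.5)
    print(theta_target["encoder.weight"])
```

In this sketch, a vector learned on an auxiliary domain with paired real and synthesized audio is added to a model fine-tuned only on synthetic target-domain speech, which is one way the abstract's goal of reducing synthetic artifacts without real in-domain data could be realized; the thesis itself should be consulted for the exact formulation and hyperparameters.
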
| Appears in Collections: | 資訊工程學系 | |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-114-1.pdf (Restricted Access) | 2.01 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
