Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98656
Full metadata record
DC Field  Value  Language
dc.contributor.advisor  林守德  zh_TW
dc.contributor.advisor  Shou-De Lin  en
dc.contributor.author  蕭襄  zh_TW
dc.contributor.author  Hsiang Hsiao  en
dc.date.accessioned  2025-08-18T01:14:28Z  -
dc.date.available  2025-08-18  -
dc.date.copyright  2025-08-15  -
dc.date.issued  2025  -
dc.date.submitted  2025-08-04  -
dc.identifier.citation  [1] A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE transactions on evolutionary computation, 6(2):182–197, 2002.
[2] Qwen2 technical report. 2024.
[3] E. Agirre, D. Cer, M. Diab, and A. Gonzalez-Agirre. Semeval-2012 task 6: A pilot on semantic textual similarity. In *SEM 2012: The First Joint Conference on Lexical and Computational Semantics, Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), Montréal, QC, Canada, pages 7–8, 2012.
[4] K. K. Agrawal, A. K. Mondal, A. Ghosh, and B. Richards. α-req: Assessing representation quality in self-supervised learning by measuring eigenspectrum decay. Advances in Neural Information Processing Systems, 35:17626–17638, 2022.
[5] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama. Optuna: A next-generation hyperparameter optimization framework. In The 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2623–2631, 2019.
[6] P. BehnamGhader, V. Adlakha, M. Mosbach, D. Bahdanau, N. Chapados, and S. Reddy. Llm2vec: Large language models are secretly powerful text encoders. arXiv preprint arXiv:2404.05961, 2024.
[7] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. Advances in neural information processing systems, 24, 2011.
[8] D. Cer, M. Diab, E. Agirre, I. Lopez-Gazpio, and L. Specia. Semeval-2017 task 1: Semantic textual similarity-multilingual and cross-lingual focused evaluation. arXiv preprint arXiv:1708.00055, 2017.
[9] Y.-S. Chuang, R. Dangovski, H. Luo, Y. Zhang, S. Chang, M. Soljačić, S.-W. Li, S. Yih, Y. Kim, and J. Glass. Diffcse: Difference-based contrastive learning for sentence embeddings. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4207–4218, 2022.
[10] DeepSeek-AI. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning, 2025.
[11] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pages 4171–4186, 2019.
[12] S. Gao, S. Dou, Q. Zhang, and X.-J. Huang. Kernel-whitening: Overcome dataset bias with isotropic sentence embedding. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4112–4122, 2022.
[13] T. Gao, X. Yao, and D. Chen. Simcse: Simple contrastive learning of sentence embeddings. In EMNLP 2021-2021 Conference on Empirical Methods in Natural Language Processing, Proceedings, 2021.
[14] Q. Garrido, R. Balestriero, L. Najman, and Y. Lecun. Rankme: Assessing the downstream performance of pretrained self-supervised representations by their rank. In International conference on machine learning, pages 10929–10974. PMLR, 2023.
[15] B. He and M. Ozay. Exploring the gap between collapsed & whitened features in self-supervised learning. In International Conference on Machine Learning, pages 8613–8634. PMLR, 2022.
[16] G. Izacard, M. Caron, L. Hosseini, S. Riedel, P. Bojanowski, A. Joulin, and E. Grave. Unsupervised dense information retrieval with contrastive learning. arXiv preprint arXiv:2112.09118, 2021.
[17] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed. Mistral 7b, 2023.
[18] T. Jiang, S. Huang, Z. Luan, D. Wang, and F. Zhuang. Scaling sentence embeddings with large language models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 3182–3196, 2024.
[19] T. Jiang, J. Jiao, S. Huang, Z. Zhang, D. Wang, F. Zhuang, F. Wei, H. Huang, D. Deng, and Q. Zhang. Promptbert: Improving bert sentence embeddings with prompts. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8826–8837, 2022.
[20] P. Ke, H. Ji, S. Liu, X. Zhu, and M. Huang. Sentilare: Sentiment-aware language representation learning with linguistic knowledge. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6975–6988, 2020.
[21] C. Lee, R. Roy, M. Xu, J. Raiman, M. Shoeybi, B. Catanzaro, and W. Ping. Nvembed: Improved techniques for training llms as generalist embedding models. arXiv preprint arXiv:2405.17428, 2024.
[22] Y. Lei, D. Wu, T. Zhou, T. Shen, Y. Cao, C. Tao, and A. Yates. Meta-task prompting elicits embeddings from large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10141–10157, 2024.
[23] C. Li, M. Qin, S. Xiao, J. Chen, K. Luo, Y. Shao, D. Lian, and Z. Liu. Making text embedders few-shot learners. arXiv preprint arXiv:2409.15700, 2024.
[24] Z. Liu, W. Lin, Y. Shi, and J. Zhao. A robustly optimized bert pre-training approach with post-training. In China national conference on Chinese computational linguistics, pages 471–484. Springer, 2021.
[25] B. Mitra, F. Diaz, and N. Craswell. Learning to match using local and distributed representations of text for web search. In Proceedings of the 26th international conference on world wide web, pages 1291–1299, 2017.
[26] N. Muennighoff, S. Hongjin, L. Wang, N. Yang, F. Wei, T. Yu, A. Singh, and D. Kiela. Generative representational instruction tuning. In ICLR 2024 Workshop: How Far Are We From AGI.
[27] N. Muennighoff, N. Tazi, L. Magne, and N. Reimers. Mteb: Massive text embedding benchmark. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2014–2037, 2023.
[28] J. Ni, G. H. Abrego, N. Constant, J. Ma, K. Hall, D. Cer, and Y. Yang. Sentence-t5: Scalable sentence encoders from pre-trained text-to-text models. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1864–1874, 2022.
[29] O. Skean, M. R. Arefin, D. Zhao, N. Patel, J. Naghiyev, Y. LeCun, and R. Shwartz-Ziv. Layer by layer: Uncovering hidden representations in language models. arXiv preprint arXiv:2502.02013, 2025.
[30] J. Snoek, H. Larochelle, and R. P. Adams. Practical bayesian optimization of machine learning algorithms. Advances in neural information processing systems, 25, 2012.
[31] J. M. Springer, S. Kotha, D. Fried, G. Neubig, and A. Raghunathan. Repetition improves language model embeddings. arXiv preprint arXiv:2402.15449, 2024.
[32] J. Su, J. Cao, W. Liu, and Y. Ou. Whitening sentence representations for better semantics and faster retrieval. arXiv preprint arXiv:2103.15316, 2021.
[33] Y. Tang and Y. Yang. Pooling and attention: What are effective designs for llm-based embedding models? arXiv preprint arXiv:2409.02727, 2024.
[34] R. Thirukovalluru and B. Dhingra. Geneol: Harnessing the generative power of llms for training-free sentence embeddings. arXiv preprint arXiv:2410.14635, 2024.
[35] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, et al. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
[36] L. Wang, N. Yang, X. Huang, L. Yang, R. Majumder, and F. Wei. Improving text embeddings with large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11897–11916, 2024.
[37] B. Zhang, K. Chang, and C. Li. Simple techniques for enhancing sentence embeddings in generative language models. In International Conference on Intelligent Computing, pages 52–64. Springer, 2024.
dc.identifier.uri  http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98656  -
dc.description.abstract  文本嵌入在各種自然語言處理任務中扮演關鍵角色。近期多項研究聚焦於以大型語言模型為基礎的文本嵌入方法,並在多種下游任務中展現出卓越的效能。然而,這些方法大多僅使用模型的最後一層來提取嵌入,而有研究指出,最後一層未必能提供最佳表示。為此,本研究提出一種結合多層資訊的品質感知池化方法,根據每一層的品質指標自動分配權重,以生成更具表現力的文本嵌入。實驗結果顯示,所提方法在多數下游任務中皆能提升性能,驗證其有效性與泛化能力。  zh_TW
dc.description.abstract  Text embeddings play a crucial role in various natural language processing tasks. Recent studies have focused on using large language models (LLMs) to generate high-quality embeddings, achieving remarkable performance on a wide range of downstream tasks. However, most existing methods rely solely on the final layer of the model to extract embeddings, despite evidence suggesting that the last layer may not always provide the most informative representation. To address this, we propose a quality-aware multi-layer pooling approach that integrates information from all layers and assigns weights based on layer-wise quality scores. Experimental results demonstrate that our method consistently improves performance across multiple downstream tasks, validating its effectiveness and generalizability.  en
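
The abstract describes weighting each layer's embedding by a layer-wise quality score before pooling. The snippet below is a minimal illustrative sketch, not the thesis's implementation: it assumes an effective-rank style quality metric (in the spirit of RankMe [14] in the citation list) and a softmax over the per-layer scores; the thesis's actual metric and weighting scheme may differ.

import numpy as np

def effective_rank(E: np.ndarray) -> float:
    # One plausible layer-quality score: exp(entropy) of the normalized
    # singular-value spectrum of the centered (n_sentences x dim) embedding matrix.
    s = np.linalg.svd(E - E.mean(axis=0), compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

def quality_aware_pooling(layer_embeddings: list) -> np.ndarray:
    # layer_embeddings: one (n_sentences x dim) matrix per model layer.
    scores = np.array([effective_rank(E) for E in layer_embeddings])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()  # softmax over per-layer quality scores
    return sum(w * E for w, E in zip(weights, layer_embeddings))

Here each element of layer_embeddings is the matrix of sentence embeddings produced by one model layer; the softmax keeps the result a convex mixture of layers, so higher-quality layers simply contribute more.
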
dc.description.provenance  Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-18T01:14:28Z. No. of bitstreams: 0  en
dc.description.provenance  Made available in DSpace on 2025-08-18T01:14:28Z (GMT). No. of bitstreams: 0  en
dc.description.tableofcontents  Oral Examination Committee Approval i
Acknowledgements ii
Abstract (Chinese) iv
Abstract v
Contents vi
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 6
2.1 Training-Based Methods 6
2.2 Training-Free Methods 7
Chapter 3 Methodology 8
3.1 Problem Definition 8
3.2 Quality Metric Design for Text Embeddings 9
3.3 Bayesian Optimization with TPE 10
Chapter 4 Experiment 13
4.1 Experiment Setup 13
4.1.1 Datasets 13
4.1.2 Training-Free Methods 13
4.1.3 Base Models 14
4.1.4 Baselines 15
4.1.5 Statistical Test 15
4.2 Main Results 15
4.3 The Upper Bound of Task-Agnostic Optimization 19
Chapter 5 Ablation Study 21
5.1 Comparison of Different Objective Function 21
5.2 Comparison of Different Samplers 22
5.3 Comparison of Different Trial Numbers 23
Chapter 6 Conclusion 25
References 26
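
Section 3.3 in the table of contents above refers to Bayesian optimization with TPE, and the citation list includes Optuna [5] and TPE [7]. Purely as an illustrative sketch under assumed inputs (layer_embeddings and the effective_rank function from the sketch after the abstract), the snippet below shows how per-layer pooling weights could be searched with Optuna's TPE sampler against a task-agnostic quality objective; the thesis's actual search space and objective are not specified here.

import numpy as np
import optuna

# Assumed available: layer_embeddings (list of n_sentences x dim arrays)
# and effective_rank() from the earlier sketch.

def objective(trial: optuna.Trial) -> float:
    # One weight per layer, normalized so the weights sum to 1.
    raw = np.array([trial.suggest_float(f"w_{i}", 0.0, 1.0)
                    for i in range(len(layer_embeddings))])
    weights = raw / (raw.sum() + 1e-12)
    pooled = sum(w * E for w, E in zip(weights, layer_embeddings))
    return effective_rank(pooled)  # task-agnostic quality score to maximize

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=100)
best_weights = study.best_params  # best layer weights found by TPE
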
dc.language.iso  en  -
dc.subject  大型語言模型  zh_TW
dc.subject  下游任務  zh_TW
dc.subject  品質感知  zh_TW
dc.subject  多層池化  zh_TW
dc.subject  文本嵌入  zh_TW
dc.subject  Large Language Models  en
dc.subject  Downstream Tasks  en
dc.subject  Quality-Aware  en
dc.subject  Multi-Layer Pooling  en
dc.subject  Text Embedding  en
dc.title  利用品質感知的多層池化方法強化文本嵌入  zh_TW
dc.title  Enhancing Text Embeddings with Quality-Aware Multi-Layer Pooling  en
dc.type  Thesis  -
dc.date.schoolyear  113-2  -
dc.description.degree  碩士 (Master's)  -
dc.contributor.coadvisor  葉彌妍  zh_TW
dc.contributor.coadvisor  Mi-Yen Yeh  en
dc.contributor.oralexamcommittee  李政德;陳縕儂  zh_TW
dc.contributor.oralexamcommittee  Cheng-Te Li;Yun-Nung Chen  en
dc.subject.keyword  文本嵌入,大型語言模型,多層池化,品質感知,下游任務  zh_TW
dc.subject.keyword  Text Embedding,Large Language Models,Multi-Layer Pooling,Quality-Aware,Downstream Tasks  en
dc.relation.page  31  -
dc.identifier.doi  10.6342/NTU202503533  -
dc.rights.note  同意授權(全球公開) (authorization granted; open access worldwide)  -
dc.date.accepted  2025-08-08  -
dc.contributor.author-college  電機資訊學院  -
dc.contributor.author-dept  資料科學學位學程  -
dc.date.embargo-lift  2025-08-18  -
Appears in collections: 資料科學學位學程

Files in this item:
File  Size  Format
ntu-113-2.pdf  575.44 kB  Adobe PDF