Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98656

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 林守德 | zh_TW |
| dc.contributor.advisor | Shou-De Lin | en |
| dc.contributor.author | 蕭襄 | zh_TW |
| dc.contributor.author | Hsiang Hsiao | en |
| dc.date.accessioned | 2025-08-18T01:14:28Z | - |
| dc.date.available | 2025-08-18 | - |
| dc.date.copyright | 2025-08-15 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-04 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98656 | - |
| dc.description.abstract | 文本嵌入在各種自然語言處理任務中扮演關鍵角色。近期多項研究聚焦於以大型語言模型為基礎的文本嵌入方法,並在多種下游任務中展現出卓越的效能。然而,這些方法大多僅使用模型的最後一層來提取嵌入,而有研究指出,最後一層未必能提供最佳表示。為此,本研究提出一種結合多層資訊的品質感知池化方法,根據每一層的品質指標自動分配權重,以生成更具表現力的文本嵌入。實驗結果顯示,所提方法在多數下游任務中皆能提升性能,驗證其有效性與泛化能力。 | zh_TW |
| dc.description.abstract | Text embeddings play a crucial role in various natural language processing tasks. Recent studies have focused on using large language models (LLMs) to generate high-quality embeddings, achieving remarkable performance on a wide range of downstream tasks. However, most existing methods rely solely on the model's final layer to extract embeddings, despite evidence that the last layer may not always provide the most informative representation. To address this, we propose a quality-aware multi-layer pooling approach that integrates information from all layers and assigns weights based on layer-wise quality scores. Experimental results demonstrate that our method consistently improves performance across multiple downstream tasks, validating its effectiveness and generalizability. (A minimal illustrative sketch of the layer-weighting idea appears after the metadata table below.) | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-18T01:14:28Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-18T01:14:28Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | 口試委員會審定書 i; 誌謝 ii; 摘要 iv; Abstract v; Contents vi; List of Figures viii; List of Tables ix; Chapter 1 Introduction 1; Chapter 2 Related Work 6; 2.1 Training-Based Methods 6; 2.2 Training-Free Methods 7; Chapter 3 Methodology 8; 3.1 Problem Definition 8; 3.2 Quality Metric Design for Text Embeddings 9; 3.3 Bayesian Optimization with TPE 10; Chapter 4 Experiment 13; 4.1 Experiment Setup 13; 4.1.1 Datasets 13; 4.1.2 Training-Free Methods 13; 4.1.3 Base Models 14; 4.1.4 Baselines 15; 4.1.5 Statistical Test 15; 4.2 Main Results 15; 4.3 The Upper Bound of Task-Agnostic Optimization 19; Chapter 5 Ablation Study 21; 5.1 Comparison of Different Objective Function 21; 5.2 Comparison of Different Samplers 22; 5.3 Comparison of Different Trial Numbers 23; Chapter 6 Conclusion 25; References 26 | - |
| dc.language.iso | en | - |
| dc.subject | 大型語言模型 | zh_TW |
| dc.subject | 下游任務 | zh_TW |
| dc.subject | 品質感知 | zh_TW |
| dc.subject | 多層池化 | zh_TW |
| dc.subject | 文本嵌入 | zh_TW |
| dc.subject | Large Language Models | en |
| dc.subject | Downstream Tasks | en |
| dc.subject | Quality-Aware | en |
| dc.subject | Multi-Layer Pooling | en |
| dc.subject | Text Embedding | en |
| dc.title | 利用品質感知的多層池化方法強化文本嵌入 | zh_TW |
| dc.title | Enhancing Text Embeddings with Quality-Aware Multi-Layer Pooling | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.coadvisor | 葉彌妍 | zh_TW |
| dc.contributor.coadvisor | Mi-Yen Yeh | en |
| dc.contributor.oralexamcommittee | 李政德;陳縕儂 | zh_TW |
| dc.contributor.oralexamcommittee | Cheng-Te Li;Yun-Nung Chen | en |
| dc.subject.keyword | 文本嵌入,大型語言模型,多層池化,品質感知,下游任務 | zh_TW |
| dc.subject.keyword | Text Embedding, Large Language Models, Multi-Layer Pooling, Quality-Aware, Downstream Tasks | en |
| dc.relation.page | 31 | - |
| dc.identifier.doi | 10.6342/NTU202503533 | - |
| dc.rights.note | 同意授權(全球公開) | - |
| dc.date.accepted | 2025-08-08 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資料科學學位學程 | - |
| dc.date.embargo-lift | 2025-08-18 | - |
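
The abstract above describes the proposed quality-aware multi-layer pooling only at a high level: pool each hidden layer, score each layer with a quality metric, and combine the per-layer embeddings using quality-derived weights (per the table of contents, the thesis tunes this with Bayesian optimization using TPE). Below is a minimal NumPy sketch of that general idea only; the mean pooling, the effective-rank quality score, the softmax weighting, and the `temperature` parameter are illustrative assumptions, not the thesis's actual design.

```python
# Illustrative sketch of quality-aware multi-layer pooling: pool every hidden
# layer, score each layer with a quality proxy, and combine layers with the
# resulting weights. The effective-rank metric and softmax weighting here are
# assumptions for demonstration, not the method specified in the thesis.
import numpy as np


def effective_rank(embeddings: np.ndarray) -> float:
    """Effective rank of a (batch, dim) embedding matrix.

    Computed as exp(entropy of the normalized singular-value spectrum), a
    common proxy for representation quality in the self-supervised literature.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    p = s / (s.sum() + 1e-12)
    entropy = -np.sum(p * np.log(p + 1e-12))
    return float(np.exp(entropy))


def quality_aware_pooling(hidden_states: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Combine per-layer sentence embeddings with quality-derived weights.

    hidden_states: array of shape (num_layers, batch, seq_len, dim),
    e.g. the stacked hidden states of an LLM over a batch of sentences.
    Returns sentence embeddings of shape (batch, dim).
    """
    # Mean-pool tokens within each layer -> (num_layers, batch, dim).
    layer_embeddings = hidden_states.mean(axis=2)

    # Score each layer; higher effective rank is treated as higher quality.
    scores = np.array([effective_rank(e) for e in layer_embeddings])

    # Turn the scores into a weight distribution over layers (softmax).
    logits = scores / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Weighted sum across layers -> (batch, dim).
    return np.tensordot(weights, layer_embeddings, axes=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for hidden states of a 12-layer model on 8 sentences of 16 tokens.
    fake_hidden = rng.normal(size=(12, 8, 16, 64))
    sentence_vecs = quality_aware_pooling(fake_hidden)
    print(sentence_vecs.shape)  # (8, 64)
```

In the thesis itself the layer weights are reportedly searched with TPE rather than set by a fixed heuristic, so this sketch should be read only as a structural outline of the pooling step.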
| Appears in Collections: | 資料科學學位學程 | |

Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 575.44 kB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
