Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98656
Full metadata record
DC Field  Value  Language
dc.contributor.advisor  林守德  zh_TW
dc.contributor.advisor  Shou-De Lin  en
dc.contributor.author  蕭襄  zh_TW
dc.contributor.author  Hsiang Hsiao  en
dc.date.accessioned  2025-08-18T01:14:28Z  -
dc.date.available  2025-08-18  -
dc.date.copyright  2025-08-15  -
dc.date.issued  2025  -
dc.date.submitted  2025-08-04  -
dc.identifier.citation  [1] A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE transactions on evolutionary computation, 6(2):182–197, 2002.
[2] Qwen2 technical report. 2024.
[3] E. Agirre, D. Cer, M. Diab, and A. Gonzalez-Agirre. Semeval-2012 task 6: A pilot on semantic textual similarity. In *SEM 2012: The First Joint Conference on Lexical and Computational Semantics, Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), Montréal, QC, Canada, pages 7–8, 2012.
[4] K. K. Agrawal, A. K. Mondal, A. Ghosh, and B. Richards. α-req: Assessing representation quality in self-supervised learning by measuring eigenspectrum decay. Advances in Neural Information Processing Systems, 35:17626–17638, 2022.
[5] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama. Optuna: A next-generation hyperparameter optimization framework. In The 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2623–2631, 2019.
[6] P. BehnamGhader, V. Adlakha, M. Mosbach, D. Bahdanau, N. Chapados, and S. Reddy. Llm2vec: Large language models are secretly powerful text encoders. arXiv preprint arXiv:2404.05961, 2024.
[7] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. Advances in neural information processing systems, 24, 2011.
[8] D. Cer, M. Diab, E. Agirre, I. Lopez-Gazpio, and L. Specia. Semeval-2017 task 1: Semantic textual similarity-multilingual and cross-lingual focused evaluation. arXiv preprint arXiv:1708.00055, 2017.
[9] Y.-S. Chuang, R. Dangovski, H. Luo, Y. Zhang, S. Chang, M. Soljačić, S.-W. Li, S. Yih, Y. Kim, and J. Glass. Diffcse: Difference-based contrastive learning for sentence embeddings. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4207–4218, 2022.
[10] DeepSeek-AI. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning, 2025.
[11] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pages 4171–4186, 2019.
[12] S. Gao, S. Dou, Q. Zhang, and X.-J. Huang. Kernel-whitening: Overcome dataset bias with isotropic sentence embedding. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4112–4122, 2022.
[13] T. Gao, X. Yao, and D. Chen. Simcse: Simple contrastive learning of sentence embeddings. In EMNLP 2021-2021 Conference on Empirical Methods in Natural Language Processing, Proceedings, 2021.
[14] Q. Garrido, R. Balestriero, L. Najman, and Y. Lecun. Rankme: Assessing the downstream performance of pretrained self-supervised representations by their rank. In International conference on machine learning, pages 10929–10974. PMLR, 2023.
[15] B. He and M. Ozay. Exploring the gap between collapsed & whitened features in self-supervised learning. In International Conference on Machine Learning, pages 8613–8634. PMLR, 2022.
[16] G. Izacard, M. Caron, L. Hosseini, S. Riedel, P. Bojanowski, A. Joulin, and E. Grave. Unsupervised dense information retrieval with contrastive learning. arXiv preprint arXiv:2112.09118, 2021.
[17] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed. Mistral 7b, 2023.
[18] T. Jiang, S. Huang, Z. Luan, D. Wang, and F. Zhuang. Scaling sentence embeddings with large language models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 3182–3196, 2024.
[19] T. Jiang, J. Jiao, S. Huang, Z. Zhang, D. Wang, F. Zhuang, F. Wei, H. Huang, D. Deng, and Q. Zhang. Promptbert: Improving bert sentence embeddings with prompts. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8826–8837, 2022.
[20] P. Ke, H. Ji, S. Liu, X. Zhu, and M. Huang. Sentilare: Sentiment-aware language representation learning with linguistic knowledge. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6975–6988, 2020.
[21] C. Lee, R. Roy, M. Xu, J. Raiman, M. Shoeybi, B. Catanzaro, and W. Ping. Nvembed: Improved techniques for training llms as generalist embedding models. arXiv preprint arXiv:2405.17428, 2024.
[22] Y. Lei, D. Wu, T. Zhou, T. Shen, Y. Cao, C. Tao, and A. Yates. Meta-task prompting elicits embeddings from large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10141–10157, 2024.
[23] C. Li, M. Qin, S. Xiao, J. Chen, K. Luo, Y. Shao, D. Lian, and Z. Liu. Making text embedders few-shot learners. arXiv preprint arXiv:2409.15700, 2024.
[24] Z. Liu, W. Lin, Y. Shi, and J. Zhao. A robustly optimized bert pre-training approach with post-training. In China national conference on Chinese computational linguistics, pages 471–484. Springer, 2021.
[25] B. Mitra, F. Diaz, and N. Craswell. Learning to match using local and distributed representations of text for web search. In Proceedings of the 26th international conference on world wide web, pages 1291–1299, 2017.
[26] N. Muennighoff, S. Hongjin, L. Wang, N. Yang, F. Wei, T. Yu, A. Singh, and D. Kiela. Generative representational instruction tuning. In ICLR 2024 Workshop: How Far Are We From AGI.
[27] N. Muennighoff, N. Tazi, L. Magne, and N. Reimers. Mteb: Massive text embedding benchmark. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2014–2037, 2023.
[28] J. Ni, G. H. Abrego, N. Constant, J. Ma, K. Hall, D. Cer, and Y. Yang. Sentence-t5: Scalable sentence encoders from pre-trained text-to-text models. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1864–1874, 2022.
[29] O. Skean, M. R. Arefin, D. Zhao, N. Patel, J. Naghiyev, Y. LeCun, and R. Shwartz-Ziv. Layer by layer: Uncovering hidden representations in language models. arXiv preprint arXiv:2502.02013, 2025.
[30] J. Snoek, H. Larochelle, and R. P. Adams. Practical bayesian optimization of machine learning algorithms. Advances in neural information processing systems, 25, 2012.
[31] J. M. Springer, S. Kotha, D. Fried, G. Neubig, and A. Raghunathan. Repetition improves language model embeddings. arXiv preprint arXiv:2402.15449, 2024.
[32] J. Su, J. Cao, W. Liu, and Y. Ou. Whitening sentence representations for better semantics and faster retrieval. arXiv preprint arXiv:2103.15316, 2021.
[33] Y. Tang and Y. Yang. Pooling and attention: What are effective designs for llm-based embedding models? arXiv preprint arXiv:2409.02727, 2024.
[34] R. Thirukovalluru and B. Dhingra. Geneol: Harnessing the generative power of llms for training-free sentence embeddings. arXiv preprint arXiv:2410.14635, 2024.
[35] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, et al. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
[36] L. Wang, N. Yang, X. Huang, L. Yang, R. Majumder, and F. Wei. Improving text embeddings with large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11897–11916, 2024.
[37] B. Zhang, K. Chang, and C. Li. Simple techniques for enhancing sentence embeddings in generative language models. In International Conference on Intelligent Computing, pages 52–64. Springer, 2024.
dc.identifier.uri  http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98656  -
dc.description.abstract  文本嵌入在各種自然語言處理任務中扮演關鍵角色。近期多項研究聚焦於以大型語言模型為基礎的文本嵌入方法,並在多種下游任務中展現出卓越的效能。然而,這些方法大多僅使用模型的最後一層來提取嵌入,而有研究指出,最後一層未必能提供最佳表示。為此,本研究提出一種結合多層資訊的品質感知池化方法,根據每一層的品質指標自動分配權重,以生成更具表現力的文本嵌入。實驗結果顯示,所提方法在多數下游任務中皆能提升性能,驗證其有效性與泛化能力。  zh_TW
dc.description.abstract  Text embeddings play a crucial role in various natural language processing tasks. Recent studies have focused on using large language models (LLMs) to generate high-quality embeddings, achieving remarkable performance on a wide range of downstream tasks. However, most existing methods rely solely on the final layer of the model to extract embeddings, despite evidence suggesting that the last layer may not always provide the most informative representation. To address this, we propose a quality-aware multi-layer pooling approach that integrates information from all layers and assigns weights based on layer-wise quality scores. Experimental results demonstrate that our method consistently improves performance across multiple downstream tasks, validating its effectiveness and generalizability.  en
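
The abstract describes weighting each layer's embedding by a layer-wise quality score before pooling. The snippet below is a minimal illustrative sketch, not the thesis's implementation: it assumes an effective-rank style quality metric (in the spirit of RankMe [14] in the citation list) and a softmax over the per-layer scores; the thesis's actual metric and weighting scheme may differ.

import numpy as np

def effective_rank(E: np.ndarray) -> float:
    # One plausible layer-quality score: exp(entropy) of the normalized
    # singular-value spectrum of the centered (n_sentences x dim) embedding matrix.
    s = np.linalg.svd(E - E.mean(axis=0), compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

def quality_aware_pooling(layer_embeddings: list) -> np.ndarray:
    # layer_embeddings: one (n_sentences x dim) matrix per model layer.
    scores = np.array([effective_rank(E) for E in layer_embeddings])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()  # softmax over per-layer quality scores
    return sum(w * E for w, E in zip(weights, layer_embeddings))

Here each element of layer_embeddings is the matrix of sentence embeddings produced by one model layer; the softmax keeps the result a convex mixture of layers, so higher-quality layers simply contribute more.
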
dc.description.provenance  Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-18T01:14:28Z. No. of bitstreams: 0  en
dc.description.provenance  Made available in DSpace on 2025-08-18T01:14:28Z (GMT). No. of bitstreams: 0  en
dc.description.tableofcontents  Oral Examination Committee Approval i
Acknowledgements ii
Abstract (Chinese) iv
Abstract v
Contents vi
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 6
2.1 Training-Based Methods 6
2.2 Training-Free Methods 7
Chapter 3 Methodology 8
3.1 Problem Definition 8
3.2 Quality Metric Design for Text Embeddings 9
3.3 Bayesian Optimization with TPE 10
Chapter 4 Experiment 13
4.1 Experiment Setup 13
4.1.1 Datasets 13
4.1.2 Training-Free Methods 13
4.1.3 Base Models 14
4.1.4 Baselines 15
4.1.5 Statistical Test 15
4.2 Main Results 15
4.3 The Upper Bound of Task-Agnostic Optimization 19
Chapter 5 Ablation Study 21
5.1 Comparison of Different Objective Function 21
5.2 Comparison of Different Samplers 22
5.3 Comparison of Different Trial Numbers 23
Chapter 6 Conclusion 25
References 26
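
Section 3.3 in the table of contents above refers to Bayesian optimization with TPE, and the citation list includes Optuna [5] and TPE [7]. Purely as an illustrative sketch under assumed inputs (layer_embeddings and the effective_rank function from the sketch after the abstract), the snippet below shows how per-layer pooling weights could be searched with Optuna's TPE sampler against a task-agnostic quality objective; the thesis's actual search space and objective are not specified here.

import numpy as np
import optuna

# Assumed available: layer_embeddings (list of n_sentences x dim arrays)
# and effective_rank() from the earlier sketch.

def objective(trial: optuna.Trial) -> float:
    # One weight per layer, normalized so the weights sum to 1.
    raw = np.array([trial.suggest_float(f"w_{i}", 0.0, 1.0)
                    for i in range(len(layer_embeddings))])
    weights = raw / (raw.sum() + 1e-12)
    pooled = sum(w * E for w, E in zip(weights, layer_embeddings))
    return effective_rank(pooled)  # task-agnostic quality score to maximize

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=100)
best_weights = study.best_params  # best layer weights found by TPE
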
dc.language.iso  en  -
dc.subject  大型語言模型  zh_TW
dc.subject  下游任務  zh_TW
dc.subject  品質感知  zh_TW
dc.subject  多層池化  zh_TW
dc.subject  文本嵌入  zh_TW
dc.subject  Large Language Models  en
dc.subject  Downstream Tasks  en
dc.subject  Quality-Aware  en
dc.subject  Multi-Layer Pooling  en
dc.subject  Text Embedding  en
dc.title  利用品質感知的多層池化方法強化文本嵌入  zh_TW
dc.title  Enhancing Text Embeddings with Quality-Aware Multi-Layer Pooling  en
dc.type  Thesis  -
dc.date.schoolyear  113-2  -
dc.description.degree  碩士 (Master's)  -
dc.contributor.coadvisor  葉彌妍  zh_TW
dc.contributor.coadvisor  Mi-Yen Yeh  en
dc.contributor.oralexamcommittee  李政德;陳縕儂  zh_TW
dc.contributor.oralexamcommittee  Cheng-Te Li;Yun-Nung Chen  en
dc.subject.keyword  文本嵌入,大型語言模型,多層池化,品質感知,下游任務  zh_TW
dc.subject.keyword  Text Embedding,Large Language Models,Multi-Layer Pooling,Quality-Aware,Downstream Tasks  en
dc.relation.page  31  -
dc.identifier.doi  10.6342/NTU202503533  -
dc.rights.note  同意授權(全球公開) (authorization granted; open access worldwide)  -
dc.date.accepted  2025-08-08  -
dc.contributor.author-college  電機資訊學院  -
dc.contributor.author-dept  資料科學學位學程  -
dc.date.embargo-lift  2025-08-18  -
Appears in collections: 資料科學學位學程

Files in this item:
File  Size  Format
ntu-113-2.pdf  575.44 kB  Adobe PDF