Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99323

Full metadata record (DC field / value / language)
dc.contributor.advisor: 謝舒凱 (zh_TW)
dc.contributor.advisor: Shu-Kai Hsieh (en)
dc.contributor.author: 橘內每歌 (zh_TW)
dc.contributor.author: Maika Kitsunai (en)
dc.date.accessioned: 2025-09-01T16:04:45Z
dc.date.available: 2025-09-02
dc.date.copyright: 2025-09-01
dc.date.issued: 2025
dc.date.submitted: 2025-08-11
dc.identifier.citation:
Pierpaolo Basile, Lucia Siciliani, Elio Musacchio, and Giovanni Semeraro. Exploring the word sense disambiguation capabilities of large language models. 2025. URL https://arxiv.org/abs/2503.08662.
Michele Bevilacqua, Tommaso Pasini, Alessandro Raganato, and Roberto Navigli. Recent trends in word sense disambiguation: A survey. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21 (Survey Track), pages 4330–4338, Montreal, 2021. International Joint Conferences on Artificial Intelligence Organization. URL https://doi.org/10.24963/ijcai.2021/593.
Jiahuan Cao, Dezhi Peng, Peirong Zhang, Yongxin Shi, Yang Liu, Kai Ding, and Lianwen Jin. TongGu: Mastering Classical Chinese understanding with knowledge-grounded large language models. 2024. URL https://arxiv.org/abs/2407.03937.
Pei-Yi Chen. Modeling semantic change with word embeddings: A case study of jia. Master's thesis, National Taiwan University, 2021. URL https://hdl.handle.net/11296/f2v84x.
Jacopo D'Abramo, Andrea Zugarini, and Paolo Torroni. Dynamic few-shot learning for knowledge graph question answering. 2024. URL https://arxiv.org/abs/2407.01409.
Jesse de Does, Jan Niestadt, and Katrien Depuydt. Creating research environments with BlackLab. In CLARIN in the Low Countries. Ubiquity Press, 2017. URL https://ubiquitypress.com/chapters/e/10.5334/bbi.20.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1423. URL https://aclanthology.org/N19-1423/.
Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze. Dynamic contextualized word embeddings. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6970–6984, Online, 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.acl-long.542. URL https://aclanthology.org/2021.acl-long.542.
Renfen Hu, Shen Li, and Shichen Liang. Diachronic sense modeling with deep contextualized word embeddings: An ecological view. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3899–3908, Florence, 2019. Association for Computational Linguistics. doi: 10.18653/v1/P19-1379. URL https://aclanthology.org/P19-1379/.
Raihan Kibria, Sheikh Intiser Uddin Dipta, and Muhammad Abdullah Adnan. On functional competence of LLMs for linguistic disambiguation. In Proceedings of the 28th Conference on Computational Natural Language Learning, pages 143–160, Miami, 2024. Association for Computational Linguistics. doi: 10.18653/v1/2024.conll-1.12. URL https://aclanthology.org/2024.conll-1.12/.
Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. What makes good in-context examples for GPT-3? In Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, pages 100–114, Dublin, 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.deelio-1.10. URL https://aclanthology.org/2022.deelio-1.10/.
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. 2013. URL https://arxiv.org/abs/1301.3781.
Roberto Navigli. Word sense disambiguation: A survey. ACM Computing Surveys (CSUR), 41(2):1–69, 2009. URL https://doi.org/10.1145/1459352.1459355.
Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, 2014. Association for Computational Linguistics. doi: 10.3115/v1/D14-1162. URL https://aclanthology.org/D14-1162/.
Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. 2019. URL https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.
Elizabeth C. Traugott and Richard B. Dasher. Regularity in Semantic Change. Cambridge University Press, 2001.
Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, and Lingpeng Kong. Compositional exemplars for in-context learning. 2023. URL https://arxiv.org/abs/2302.05698.
Fei Zhao, Taotian Pang, Zhen Wu, Zheng Ma, Shujian Huang, and Xinyu Dai. Dynamic demonstrations controller for in-context learning. 2024. URL https://arxiv.org/abs/2310.00385.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99323
dc.description.abstract (zh_TW, translated): This study proposes a new method for analyzing semantic change in historical Chinese using large language models (LLMs) with dynamic few-shot prompting. Built on the in-context learning framework, the method dynamically selects the most relevant examples for each input sentence to perform word sense disambiguation (WSD). To handle the vocabulary and contexts specific to classical texts, we integrate multiple data sources, including CTEXT, CBETA, and PTT, into a broad-coverage historical Chinese corpus. Experimental results show that, under certain conditions, dynamic few-shot prompting is more stable and performs better than zero-shot and fixed few-shot prompting; its advantage is especially pronounced in large models, while zero-shot prompting performs better in small models. Applying the method to the semantic evolution of the word jia (家), the visualizations reveal that multiple senses of the word coexisted and competed across historical periods, with the balance among senses shifting from period to period.
dc.description.abstract (en): This study proposes a novel approach to word sense disambiguation (WSD) for analyzing semantic shifts in the Chinese language from antiquity to the modern era, leveraging general-purpose large language models (LLMs). Specifically, we employ decoder-based models (e.g., GPT) within the framework of in-context learning to design a dynamic few-shot prompting method. This method retrieves semantically relevant examples based on the input context and incorporates them into prompts.

Using an evaluation dataset constructed from the MoeDict dictionary, we conduct comparative experiments between a vector similarity-based approach using BERT and our prompt-based approach using GPT models. The results show that dynamic few-shot prompting achieves high accuracy and robustness in large-scale models, while zero-shot prompting performs better in smaller models.

Furthermore, we apply our method to a comprehensive historical Chinese corpus, constructed by integrating multiple sources such as CTEXT, CBETA, and PTT, and analyze diachronic usage examples of the word jia (家). The visualization of its semantic evolution reveals the coexistence and competition of multiple senses across different historical periods.
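As a concrete illustration of the pipeline the abstract describes, the sketch below shows one way dynamic few-shot prompting for WSD can be assembled: embed a pool of labeled usage examples, retrieve the k examples closest to the input sentence, and splice them into the prompt sent to a decoder-based LLM. This is a minimal, hypothetical Python sketch, not the thesis implementation; the example pool, the sentence-embedding model, the gpt-4o-mini model id, and the disambiguate helper are all illustrative assumptions.

# Minimal, hypothetical sketch of dynamic few-shot prompting for WSD.
# Not the thesis code; models and examples here are illustrative only.
import numpy as np
from sentence_transformers import SentenceTransformer
from openai import OpenAI

embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical pool of labeled usages of 家: (sentence, sense label).
pool = [
    ("之子于歸,宜其室家。", "household / family"),
    ("諸子百家,各著書立說。", "school of thought"),
]
pool_vecs = embedder.encode([sentence for sentence, _ in pool])

def disambiguate(sentence: str, senses: list[str], k: int = 2) -> str:
    """Retrieve the k nearest labeled examples and ask the LLM for a sense."""
    query = embedder.encode([sentence])[0]
    # Cosine similarity between the input sentence and every pooled example.
    sims = pool_vecs @ query / (
        np.linalg.norm(pool_vecs, axis=1) * np.linalg.norm(query)
    )
    top = np.argsort(sims)[::-1][:k]  # indices of the most similar examples
    shots = "\n".join(f"Sentence: {pool[i][0]}\nSense: {pool[i][1]}" for i in top)
    prompt = (
        f"Disambiguate 家 in the final sentence. Choose one sense from {senses}.\n\n"
        f"{shots}\n\nSentence: {sentence}\nSense:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

By contrast, the BERT-based baseline mentioned in the abstract would skip the LLM call entirely: it compares the contextual vector of the target word against vectors derived from dictionary sense glosses and picks the nearest sense directly.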
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-09-01T16:04:45Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2025-09-01T16:04:45Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
摘要 (Chinese abstract)
Abstract
Contents
List of Figures
List of Tables
List of Abbreviations

Chapter 1 Introduction
  1.1 Background and Motivation
  1.2 Research Questions

Chapter 2 Literature Review
  2.1 Word Sense Disambiguation
  2.2 Word Embeddings
  2.3 Few-Shot Prompting with LLMs
  2.4 Evaluation of LLMs in WSD Tasks
  2.5 Semantic Shift

Chapter 3 Corpus Construction for WSD Experiments
  3.1 Corpus Overview
  3.2 Periodization and Classification Criteria
  3.3 Overview of Data Sources
  3.4 Index Construction and Metadata Annotation
  3.5 Corpus Interface for Historical Chinese Texts

Chapter 4 Experiments on WSD in Historical Chinese
  4.1 Evaluation
    4.1.1 Dataset
    4.1.2 Filtering Criteria and Dataset Refinement
  4.2 Workflow of the Proposed Method: Dynamic Few-Shot Prompting
  4.3 Models and Selection Criteria
  4.4 Method Comparison
  4.5 Experimental Results
    4.5.1 Overview
    4.5.2 Discussion
    4.5.3 Agreement Rate Analysis Across Models and Methods

Chapter 5 Application to Historical Data Corpus and Visualization of Semantic Change
  5.1 Procedure
  5.2 Results

Chapter 6 Discussion
  6.1 Limitations
  6.2 Future Directions

References

Appendix A — Moedict API Result Example
Appendix B — Overview of PTT Boards and Data Collection Periods for Corpus Construction
dc.language.iso: en
dc.subject: 古典中文 (zh_TW)
dc.subject: 少量樣本提示 (zh_TW)
dc.subject: 詞義消歧 (zh_TW)
dc.subject: 歷時語義變化 (zh_TW)
dc.subject: 大型語言模型 (zh_TW)
dc.subject: Diachronic Semantic Change (en)
dc.subject: Historical Chinese (en)
dc.subject: Few-Shot Prompting (en)
dc.subject: Word Sense Disambiguation (en)
dc.subject: Large Language Models (en)
dc.title: 用於歷史中文詞義消歧的動態式少量示例提示法 (zh_TW)
dc.title: Dynamic Few-Shot Prompting for Word Sense Disambiguation in Historical Chinese (en)
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 呂佳蓉;張瑜芸 (zh_TW)
dc.contributor.oralexamcommittee: Chia-Rung Lu;Yu-Yun Chang (en)
dc.subject.keyword: 大型語言模型,詞義消歧,少量樣本提示,歷時語義變化,古典中文 (zh_TW)
dc.subject.keyword: Large Language Models, Word Sense Disambiguation, Few-Shot Prompting, Diachronic Semantic Change, Historical Chinese (en)
dc.relation.page: 65
dc.identifier.doi: 10.6342/NTU202503239
dc.rights.note: 同意授權(限校園內公開) (authorized for release; access restricted to campus)
dc.date.accepted: 2025-08-13
dc.contributor.author-college: 文學院 (College of Liberal Arts)
dc.contributor.author-dept: 語言學研究所 (Graduate Institute of Linguistics)
dc.date.embargo-lift: 2030-08-05
Appears in collections: Graduate Institute of Linguistics (語言學研究所)

Files in this item:
File: ntu-113-2.pdf (not authorized for public access)
Size: 2.03 MB
Format: Adobe PDF


Unless their copyright terms are otherwise specified, all items in this repository are protected by copyright, with all rights reserved.
