Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100605

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳銘憲 | zh_TW |
| dc.contributor.advisor | Ming-Syan Chen | en |
| dc.contributor.author | 陳怜均 | zh_TW |
| dc.contributor.author | Ling-Chun Chen | en |
| dc.date.accessioned | 2025-10-08T16:05:32Z | - |
| dc.date.available | 2025-10-09 | - |
| dc.date.copyright | 2025-10-08 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-11 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100605 | - |
| dc.description.abstract | 大型語言模型(Large Language Models, LLMs)在推理與知識密集型任務中展現出卓越表現,然而其靜態的預訓練特性使其難以處理快速變化或具領域特定性的知識。檢索增強生成(Retrieval-Augmented Generation, RAG)透過引入動態檢索的證據以強化 LLM 的輸出內容,進而提升其事實正確性並降低幻覺現象的發生。然而,傳統的 RAG 在面對具有時間敏感性的查詢時表現不佳,特別是在文件中出現模糊或間接的時間表述(例如「幾年之後」)時。這將導致時間錯置(Temporal Misalignment)現象,即檢索結果在主題上雖然相關,卻在時間上不正確。為了解決此問題,我們提出 DeFuzzRAG,一個提升 RAG 時間穩定性的輕量級架構。DeFuzzRAG 採用小型本地語言模型來從模糊的時間表達中推斷具體的時間範圍,並運用以時間中介資料為基礎的過濾機制,使檢索結果重新對齊查詢的時間意圖。我們在一套模糊化查詢基準上進行實驗,結果顯示 DeFuzzRAG 能顯著提升檢索準確率,將命中率提高 15.7%,同時維持運算效率並具備模型無關的整合能力。實驗結果凸顯時間推理在 RAG 架構中的重要性,並確立 DeFuzzRAG 作為一種可實用、可即插即用的解決方案,適用於部署在現實應用情境中的時間魯棒 LLM 系統。 | zh_TW |
| dc.description.abstract | Large Language Models (LLMs) have achieved remarkable success across reasoning and knowledge-intensive tasks, yet their static pretraining leaves them unable to handle rapidly evolving or domain-specific knowledge. Retrieval-Augmented Generation (RAG) addresses this by grounding LLM outputs in dynamically retrieved evidence, improving factual accuracy and reducing hallucinations. However, standard RAG pipelines struggle with temporally sensitive queries, especially when documents contain fuzzy or indirect time expressions (e.g., “a few years later”). This leads to Temporal Misalignment, where topically relevant but temporally incorrect results are retrieved. To overcome this, we propose DeFuzzRAG, a lightweight framework that enhances temporal robustness in RAG. DeFuzzRAG employs a small local language model to infer concrete time scopes from vague expressions and applies metadata-based filtering to realign retrieval with the query’s temporal intent. Experiments on a benchmark of fuzzified queries demonstrate that DeFuzzRAG substantially improves retrieval accuracy, raising Hit Rate by 15.7% while maintaining efficiency and model-agnostic integration. Our findings highlight the importance of temporal reasoning in RAG and establish DeFuzzRAG as a practical, plug-and-play solution for deploying temporally robust LLM systems in real-world settings. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-10-08T16:05:32Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-10-08T16:05:32Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements i; 摘要 iii; Abstract v; Contents vii; List of Figures ix; List of Tables x; 1 Introduction 1; 2 Related Work 5; 2.1 Temporally-Aware RAG 5; 2.2 Temporal Data Mining 6; 3 Preliminary 7; 4 Methodology 11; 4.1 Temporal Intent Extraction 12; 4.2 Query-Aware Temporal Metadata Generation 13; 4.3 Temporal Alignment Filtering 14; 4.4 Illustrative Example 15; 4.4.1 Step 1: Temporal Intent Extraction 15; 4.4.2 Step 2: Query-Aware Temporal Metadata Construction 15; 4.4.3 Step 3: Temporal Alignment Filtering 16; 5 Experiments 19; 5.1 TempFuzz-QA 19; 5.2 Experimental Setup 20; 6 Results and Analysis 23; 6.1 Quantitative Results 23; 6.2 Extending to Spatial Fuzziness 25; 7 Conclusion 27; References 29 | - |
| dc.language.iso | en | - |
| dc.subject | 檢索增強生成,資訊檢索,模糊時間表達,時間感知型檢索,時間魯棒性,時間約束條件,稠密向量檢索,自然語言處理,查詢理解 | zh_TW |
| dc.subject | Retrieval-Augmented Generation, Information Retrieval, Fuzzy Temporal Expressions, Time-Aware Retrieval, Temporal Robustness, Temporal Constraints, Dense Retrieval, Natural Language Processing, Query Understanding | en |
| dc.title | DeFuzzRAG:處理模糊時間表達以提升檢索增強生成的時間穩定性 | zh_TW |
| dc.title | DeFuzzRAG: Handling Fuzzy Time Expressions for Temporal Robustness in Retrieval-Augmented Generation | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 楊得年;曹昱;吳齊人 | zh_TW |
| dc.contributor.oralexamcommittee | De-Nian Yang;Yu Tsao;Chi-Jen Wu | en |
| dc.relation.page | 35 | - |
| dc.identifier.doi | 10.6342/NTU202503356 | - |
| dc.rights.note | 同意授權(全球公開) | - |
| dc.date.accepted | 2025-08-13 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 電信工程學研究所 | - |
| dc.date.embargo-lift | 2025-10-09 | - |
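
The abstract above describes DeFuzzRAG's two-stage idea: a small local language model resolves a fuzzy time expression into a concrete time scope, and retrieval results are then filtered against per-document temporal metadata. The snippet below is a minimal, purely illustrative sketch of that filtering step, not the thesis's actual implementation; the `RetrievedDoc` structure, field names, hard-coded time scope, and the fallback policy for undated documents are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class RetrievedDoc:
    text: str
    score: float              # similarity score from the dense retriever
    doc_date: Optional[date]  # temporal metadata attached to the document chunk

def filter_by_time_scope(docs: List[RetrievedDoc],
                         start: date, end: date) -> List[RetrievedDoc]:
    """Keep documents whose timestamp falls inside the inferred scope
    [start, end]; undated documents are retained as a conservative
    fallback but ranked after the temporally aligned ones."""
    in_scope = [d for d in docs if d.doc_date and start <= d.doc_date <= end]
    undated = [d for d in docs if d.doc_date is None]
    return sorted(in_scope, key=lambda d: d.score, reverse=True) + undated

# A fuzzy query such as "a few years after the 2008 launch" might first be
# resolved by a small local LM to the concrete scope 2009-01-01..2013-12-31,
# which is then applied here (the scope is hard-coded for this example).
docs = [
    RetrievedDoc("Follow-up report on the 2012 rollout", 0.82, date(2012, 5, 1)),
    RetrievedDoc("Original 2008 launch announcement", 0.91, date(2008, 3, 1)),
    RetrievedDoc("Undated background article", 0.75, None),
]
for d in filter_by_time_scope(docs, date(2009, 1, 1), date(2013, 12, 31)):
    print(d.text)
```

In this sketch the temporally misaligned but topically similar 2008 document is dropped, which mirrors the Temporal Misalignment problem the abstract describes; how DeFuzzRAG actually constructs and applies its metadata is detailed in the thesis itself.
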
Appears in Collections: 電信工程學研究所
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 535.84 kB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
