Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/75116
Title: 中文查詢問句擴展之研究
A Study on Query Expansion for Chinese Information Retrieval
Authors: Ya-chen Chuang
莊雅蓁
Publication Year: 2000
Degree: Master's
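Abstract: Abstract (translated from the Chinese)
This study examines three issues: (1) how much an automatically constructed synonym dictionary helps information retrieval; (2) which thesaurus term relationship gives the best results when used to expand queries; and (3) the effect of combining the synonym dictionary with the thesaurus. Because lexical resources are hard to obtain, the study builds the synonym dictionary automatically from the experimental document database. It first selects the document database that serves as the vocabulary source, extracts terms by word segmentation, computes term relationships from the term co-occurrence information in the documents, clusters related terms by their degree of association to form term groups, and then organizes the groups into a synonym dictionary. For the query expansion experiments, the original queries are collected first, and several query expansion models are then built from different vocabulary sources: expansion with the synonym dictionary alone, expansion with the thesaurus alone, and expansion that integrates both resources. Retrieval effectiveness is evaluated through manual relevance judgments, from which the precision of the retrieval results is calculated.
The results show that expanding queries at the level of the synonym dictionary whose term groups contain fewer terms gives better retrieval effectiveness. Expanding with the individual thesaurus relationships produces no significant differences, although the form that combines all relationships generally performs better. Expanding with the synonym dictionary and then revising with the thesaurus improves effectiveness slightly, but a second round of expansion with the thesaurus lowers it. The experiments also find that the content of the automatically constructed synonym dictionary depends on the quality of word segmentation, and that the string matching method is another important factor in the retrieval effectiveness of query expansion. Beyond the design of the expansion mechanism and form, many other factors affect the outcome, including word segmentation, indexing, string matching, and the evaluation method, so many complex problems remain for future work on synonym dictionary construction and query expansion.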
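The dictionary construction steps summarized in the abstract (segment, count co-occurrence, group related terms) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record does not name the association measure or the clustering method, so the Dice coefficient, the single-link grouping, the threshold values, and the function names `dice_scores` and `group_terms` are all assumptions made for illustration. Documents are assumed to be already segmented into term lists, and the toy English terms stand in for segmented Chinese words.

```python
from collections import Counter
from itertools import combinations


def dice_scores(docs):
    """Return Dice association scores for term pairs that co-occur in a document.

    Assumption: co-occurrence is counted per document, and Dice is the measure.
    """
    doc_freq = Counter()   # number of documents containing each term
    pair_freq = Counter()  # number of documents containing both terms of a pair
    for terms in docs:
        unique = sorted(set(terms))
        doc_freq.update(unique)
        pair_freq.update(combinations(unique, 2))
    return {
        pair: 2 * n / (doc_freq[pair[0]] + doc_freq[pair[1]])
        for pair, n in pair_freq.items()
    }


def group_terms(scores, threshold):
    """Single-link grouping: terms linked by a score >= threshold share a group."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())


# Toy corpus: each document is a list of already segmented terms.
docs = [
    ["library", "retrieval", "search"],
    ["retrieval", "search", "system"],
    ["library", "catalog"],
    ["retrieval", "search"],
]
scores = dice_scores(docs)
tight_groups = group_terms(scores, 0.8)   # stricter level, smaller groups
loose_groups = group_terms(scores, 0.4)   # looser level, larger groups
```

Running the grouping at several thresholds gives nested levels of term groups, which is one simple way to picture a dictionary with levels of differing group size; whether the thesis built its levels this way is not stated in the record.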
ABSTRACT
This thesis addresses three issues in query expansion: whether an automatically constructed synonym dictionary can improve retrieval effectiveness, which thesaurus relationship performs best, and how effective it is to integrate the synonym dictionary with the thesaurus. To carry out the query expansion experiments, we first develop a method for constructing the synonym dictionary automatically. An appropriate text corpus is selected, word segmentation is carried out, and words are clustered based on co-occurrence statistics. The synonym dictionary is then constructed hierarchically. In the experiments, queries are collected from various users and each query is expanded under different models: expansion by the synonym dictionary, by the thesaurus, or by both. Finally, performance is evaluated by precision, which is calculated from manual relevance judgments.
The results show that query expansion using the second level of the synonym dictionary performs better. Although the effects of the different relationships prescribed in the thesaurus are similar, expanding by the union of all relationships performs better. The model that first expands the query with the synonym dictionary and then modifies it with the thesaurus improves retrieval performance slightly, but performance decreases with further expansion. We also find that the correctness of word segmentation has a great impact on the quality of the synonym dictionary, and that the mode of string matching is another important factor. Many problems therefore remain to be solved in further study of synonym dictionary construction and query expansion.
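As a rough illustration of the expansion and evaluation described in the abstract, the Python sketch below adds group-mates from a synonym dictionary, optionally adds related thesaurus terms, and computes precision from a set of relevance judgments. The data structures (synonym groups as sets, the thesaurus as a mapping from a term to its related terms) and the names `expand_query` and `precision` are hypothetical. The combination shown is a plain union of both sources; it does not reproduce the thesis's step of modifying the expanded query with the thesaurus, which the record does not specify.

```python
def expand_query(query_terms, synonym_groups, thesaurus=None):
    """Append synonym group-mates, then related thesaurus terms, to a query.

    Assumption: synonym_groups is a list of sets; thesaurus maps a term to an
    iterable of related terms (all relationship types merged).
    """
    expanded = list(query_terms)
    seen = set(query_terms)

    def add(terms):
        # Sorted for a deterministic order; duplicates are skipped.
        for term in sorted(terms):
            if term not in seen:
                seen.add(term)
                expanded.append(term)

    for term in query_terms:
        for group in synonym_groups:
            if term in group:
                add(group)
        if thesaurus:
            add(thesaurus.get(term, ()))
    return expanded


def precision(retrieved, relevant):
    """Fraction of retrieved documents that were judged relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


# Toy data: one synonym group, one thesaurus entry, four retrieved documents.
groups = [{"retrieval", "search"}]
thesaurus = {"search": ["lookup"]}
by_dictionary = expand_query(["search"], groups)
by_both = expand_query(["search"], groups, thesaurus)
score = precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
```

Comparing `precision` for the original query and for each expanded form, using the same manual judgments, mirrors the kind of comparison the abstract reports.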
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/75116
Fulltext Rights: Not authorized
Appears in Collections: Department of Library and Information Science (圖書資訊學系)

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
