Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38387
Title: | Improving the Effectiveness and Scalability of a Sequence-Based Text Retrieval System (改善以序列為基礎之文件檢索系統之有效性與彈性) |
Author: | Chun-Chih Huang (黃俊誌) |
Advisor: | Yih-Kuen Tsay (蔡益坤) |
Keywords: | Incremental Update, Index Partitioning Schemes, Information Retrieval, Parallel Inverted Index, Parallel Processing, Text Retrieval |
Year of publication: | 2005 |
Degree: | Master's |
Abstract: | The purpose of a text retrieval system is to locate documents in a large textual document collection that meet a user’s needs. The SIR system is one such system, based on the sequence model. Because it was designed and implemented as a sequential rather than a parallel application, it becomes less efficient as the data collection grows. Another drawback of the SIR system is that the index must be rebuilt entirely whenever the data collection is modified. In addition, compared with other models, query evaluation in the sequence model is time-consuming. In this thesis, we seek improvements that address these problems. To facilitate parallel query processing, we implement three index partitioning schemes in the system and evaluate their load-balancing characteristics. To improve the scalability of index building, we design and implement a mechanism that allows the SIR system to support incremental index updates. We also make other improvements, such as support for queries with homophones and for more token types, that make the system more flexible. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38387 |
Full-text license: | Paid license |
Appears in department: | Department of Information Management |
Files in this item:
File | Size | Format | |
---|---|---|---|
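ntu-94-1.pdf (not currently authorized for public access) | 521.86 kB | Adobe PDF |
All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.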