Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28123
Title: | 結合文件分類及分群之術語組織技術 Organization of Term Associations through a Combination of Text Classification and Clustering |
Authors: | Tsung-Pei Chou 周宗霈 |
Advisor: | 簡立峰(Lee-Feng Chien) |
Co-Advisor: | 王柏堯(Bow-Yaw Wang) |
Keyword: | Term Organizing, Clustering, Classification, World Wide Web |
Publication Year: | 2007 |
Degree: | Master's |
Abstract: | Terms are short, meaningful word strings distilled from sentences and articles. They are the basic unit in which information is presented and often serve as guides to concepts. Organizing terms helps users understand a topic and quickly grasp its key information. When the sources of terms and the needs of users vary, the conventional organization methods, clustering and classification, cannot satisfy users directly: clustering results lack a semantic explanation, and classification requires a great deal of manual work.
In this thesis, we propose an approach to term organization that combines classification and clustering. We use clustering to discover the main term topics, which helps users quickly grasp the important concepts across the whole set of terms. Users can then decide on the target classes and extract training data from the clustering results. Finally, all terms are organized by classifying them into those classes. Clustering and classification alternate in an iterative process that can accept user feedback at each round and keep refining the organization. This approach greatly simplifies term organization and takes the preferences of different users into account, since terms are organized under user-defined classes. Our preliminary experiments indicate that the proposed method produces a better organization of terms. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28123 |
Fulltext Rights: | Paid license |
Appears in Collections: | Department of Information Management |
Files in This Item:
File | Access | Size | Format |
---|---|---|---|
ntu-96-1.pdf | Restricted Access | 624.63 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.