Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46563
| Title: | 應用主成份分析法辨識主題文件中極性人名 Bipolar Person Name Identification of Topic Documents Using Principal Component Analysis |
| Authors: | Chen-Yuan Wu 吳振源 |
| Advisor: | 陳建錦(Chien-Chin Chen) |
| Keyword: | text mining, opinion mining, clustering, bipolar person name identification, topic documents, principal component analysis, activeness timeline graph |
| Publication Year : | 2010 |
| Degree: | Master's (碩士) |
| Abstract: | In this thesis, we propose an unsupervised approach for identifying bipolar named entities in a set of topic documents. We employ principal component analysis (PCA) to discover the word usage patterns of named entities in the documents, and we show that the signs of the entries in the principal eigenvector of PCA partition the named entities into two bipolar groups. We present two techniques, off-topic block elimination and a weighted correlation coefficient, to reduce the effect of data sparseness on person name bipolarization. Empirical evaluations demonstrate that the proposed approach is effective in identifying the bipolar named entities of a topic and that it can be applied to different languages. After PCA is applied, the signs of the entries in the principal eigenvector divide the person names in a set of chronological topic documents into a positive group and a negative group. To help readers follow how the two groups develop over the course of the topic, we analyze the activeness trend of each polarity. Experiments show that the resulting activeness timeline graphs accurately reflect the real activity trends of the two groups. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46563 |
| Fulltext Rights: | Paid license (有償授權) |
| Appears in Collections: | Department of Information Management (資訊管理學系) |
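The sign-based partition described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it uses a plain Pearson correlation in place of the thesis's weighted correlation coefficient, it skips off-topic block elimination, and it finds the principal eigenvector by power iteration. The function names (`pearson`, `principal_eigenvector`, `bipolarize`), the person labels, and the count matrix are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation between two equal-length count vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as 0 here.
    return cov / (dev_x * dev_y) if dev_x and dev_y else 0.0


def principal_eigenvector(matrix, iterations=200):
    """Power iteration for the dominant eigenvector of a symmetric
    positive semi-definite matrix such as a correlation matrix."""
    n = len(matrix)
    vec = [1.0 / (i + 1) for i in range(n)]  # arbitrary non-uniform start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in nxt))
        vec = [c / norm for c in nxt]
    return vec


def bipolarize(names, counts):
    """Split names into two groups by the sign of their entry in the
    principal eigenvector of the name-by-name correlation matrix."""
    corr = [[pearson(a, b) for b in counts] for a in counts]
    vec = principal_eigenvector(corr)
    positive = {name for name, c in zip(names, vec) if c >= 0}
    negative = {name for name, c in zip(names, vec) if c < 0}
    return positive, negative


# Hypothetical toy data: one row per person, one column per document,
# each cell an occurrence count. A1/A2 co-occur in the early documents,
# B1/B2 in the later ones.
names = ["A1", "A2", "B1", "B2"]
counts = [
    [3, 2, 4, 0, 0, 1],
    [2, 3, 3, 0, 1, 0],
    [0, 0, 1, 3, 2, 4],
    [0, 1, 0, 4, 3, 2],
]

positive_group, negative_group = bipolarize(names, counts)
print(sorted(positive_group), sorted(negative_group))
```

Which group ends up labelled positive is arbitrary, because an eigenvector is only defined up to its sign; what matters is which names share a sign.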
Files in This Item:
| File | Access | Size | Format |
|---|---|---|---|
| ntu-99-1.pdf | Restricted Access | 708.31 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
