Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46563

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳建錦(Chien-Chin Chen) | |
| dc.contributor.author | Chen-Yuan Wu | en |
| dc.contributor.author | 吳振源 | zh_TW |
| dc.date.accessioned | 2021-06-15T05:15:48Z | - |
| dc.date.available | 2010-07-28 | |
| dc.date.copyright | 2010-07-28 | |
| dc.date.issued | 2010 | |
| dc.date.submitted | 2010-07-21 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46563 | - |
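| dc.description.abstract | In this thesis, we propose an unsupervised method for identifying bipolar named entities in a series of topic documents. We apply principal component analysis (PCA) to discover the usage patterns of named entities and show that the signs of the entries in the first principal component accurately partition the bipolar named entities into polarity groups. We propose two techniques, the weighted correlation coefficient and off-topic block elimination, to address the problem of sparse word usage. Experiments show that the proposed method is effective at identifying bipolar named entities in topic documents and that it can be applied to different languages. After PCA is applied, the bipolar named entities fall into a positive group and a negative group according to the signs of the entries in the first principal component. To help readers efficiently understand how the two groups develop over the series of topic documents, we analyze the activeness trends of the two groups. Experiments show that our activeness trend graphs accurately reflect the real-world activeness trends of the two groups. | zh_TW |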
| dc.description.abstract | In this paper, we propose an unsupervised approach for identifying bipolar named entities in a set of topic documents. We employ principal component analysis (PCA) to discover the bipolar word usage patterns of named entities in the documents, and we show that the signs of the entries in the principal eigenvector of PCA partition the named entities into two bipolar groups. We present two techniques, off-topic block elimination and the weighted correlation coefficient, to reduce the effect of data sparseness on person name bipolarization. Empirical evaluations demonstrate that the approach is effective at identifying the bipolar named entities of a topic and that it is language independent. After PCA is applied, the signs of the entries in the principal eigenvector identify the bipolar groups of person names in a set of chronological topic documents. To help readers follow the evolution of these groups, we analyze the activeness trend of each polarity and demonstrate that the measured changes in activeness accurately reflect the actual activeness trend of each bipolar group. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T05:15:48Z (GMT). No. of bitstreams: 1 ntu-99-R97725035-1.pdf: 725314 bytes, checksum: 033efcc8b2e05e93fab35ed1a53721bb (MD5) Previous issue date: 2010 | en |
| dc.description.tableofcontents | Acknowledgements; Abstract (in Chinese); Thesis Abstract; Table of Contents; List of Tables; List of Figures; Chapter 1 Introduction; Chapter 2 Related Work; Chapter 3 Method; 3.1 Data Preprocessing; 3.2 PCA-based Bipolar Feature Identification; 3.3 Sparseness of Textual Units; 3.3.1 Weighted Correlation Coefficient; 3.3.2 Off-topic Block Elimination; 3.4 Activeness Timeline of Bipolar Groups; Chapter 4 Performance Evaluation; 4.1 Data Corpus and Evaluation Metric; 4.2 Effect of System Components; 4.3 Comparison with Other Methods; 4.4 Activeness Timeline Evaluations; Chapter 5 Conclusion; References | |
| dc.language.iso | en | |
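| dc.subject | 時間趨勢圖 (activeness timeline graph) | zh_TW |
| dc.subject | 文件探勘 (text mining) | zh_TW |
| dc.subject | 意見探勘 (opinion mining) | zh_TW |
| dc.subject | 分群 (clustering) | zh_TW |
| dc.subject | 極性人名辨識 (bipolar person name identification) | zh_TW |
| dc.subject | 主題文章 (topic documents) | zh_TW |
| dc.subject | 主成份分析法 (principal component analysis) | zh_TW |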
| dc.subject | topic documents | en |
| dc.subject | activeness timeline graph | en |
| dc.subject | principal component analysis | en |
| dc.subject | text mining | en |
| dc.subject | opinion mining | en |
| dc.subject | clustering | en |
| dc.subject | bipolar person name identification | en |
| dc.title | 應用主成份分析法辨識主題文件中極性人名 | zh_TW |
| dc.title | Bipolar Person Name Identification of Topic Documents Using Principal Component Analysis | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 98-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 陳孟彰(Meng-Chang Chen),陳銘憲(Ming-Syan Chen) | |
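| dc.subject.keyword | 文件探勘 (text mining), 意見探勘 (opinion mining), 分群 (clustering), 極性人名辨識 (bipolar person name identification), 主題文章 (topic documents), 主成份分析法 (principal component analysis), 時間趨勢圖 (activeness timeline graph) | zh_TW |
| dc.subject.keyword | text mining, opinion mining, clustering, bipolar person name identification, topic documents, principal component analysis, activeness timeline graph | en |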
| dc.relation.page | 37 | |
| dc.rights.note | Paid license | |
| dc.date.accepted | 2010-07-22 | |
| dc.contributor.author-college | College of Management | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Information Management | zh_TW |
| Appears in collections: | Department of Information Management | |
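
The central idea in the abstract, splitting named entities into two groups by the sign of their entries in the principal eigenvector, can be illustrated with a minimal sketch. This record does not give the thesis's exact matrix construction, so the sketch makes several assumptions: the input is a hypothetical block-by-name frequency matrix, the plain Pearson correlation stands in for the thesis's weighted correlation coefficient, and the function name `bipolarize`, the placeholder names, and the toy counts are all invented for illustration.

```python
import numpy as np


def bipolarize(freq, names):
    """Split names into two groups by the sign of the principal eigenvector.

    freq: (n_blocks, n_names) array; freq[i, j] counts name j in text block i.
    Assumes no column is constant, otherwise the correlation is undefined.
    """
    freq = np.asarray(freq, dtype=float)
    # Pearson correlation between the usage patterns of every pair of names
    # (an assumption; the thesis uses a weighted variant not given here).
    corr = np.corrcoef(freq, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in
    # ascending order, so the last column is the principal eigenvector.
    _, eigvecs = np.linalg.eigh(corr)
    principal = eigvecs[:, -1]
    group_a = [n for n, v in zip(names, principal) if v >= 0]
    group_b = [n for n, v in zip(names, principal) if v < 0]
    return group_a, group_b


# Hypothetical toy data: name_A and name_B tend to co-occur,
# as do name_C and name_D, but the two pairs rarely share a block.
names = ["name_A", "name_B", "name_C", "name_D"]
freq = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 3, 2],
    [4, 3, 0, 1],
    [0, 1, 3, 4],
]

group_a, group_b = bipolarize(freq, names)
print(group_a, group_b)
```

The overall sign of an eigenvector is arbitrary, so which group comes out as "positive" carries no meaning; only the partition itself does.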
Files in this document:
| File | Size | Format | |
|---|---|---|---|
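| ntu-99-1.pdf (not authorized for public access) | 708.31 kB | Adobe PDF | |

Unless their copyright terms are specifically stated, all documents in this system are protected by copyright, with all rights reserved.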
