Two-side Feature Clustering Using Auxiliary Vector for Improving Stance Classification on Chinese Newspaper

Wei-Ming Chen; 陳韋銘

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3984

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	林守德(Shou-De Lin)
dc.contributor.author	Wei-Ming Chen	en
dc.contributor.author	陳韋銘	zh_TW
dc.date.accessioned	2021-05-13T08:39:44Z	-
dc.date.available	2016-03-08
dc.date.available	2021-05-13T08:39:44Z	-
dc.date.copyright	2016-03-08
dc.date.issued	2016
dc.date.submitted	2016-02-05
dc.identifier.citation	[1] Hasan, Kazi Saidul and Ng, Vincent. 'Why are You Taking this Stance? Identifying and Classifying Reasons in Ideological Debates,' EMNLP, 2014. [2] Somasundaran, Swapna and Wiebe, Janyce. 'Recognizing Stances in Online Debates,' ACL/IJCNLP, 2009. [3] Swapna Somasundaran and Janyce Wiebe. 2010. 'Recognizing stances in ideological on-line debates,' NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text (CAAGET '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 116-124. [4] Pranav Anand, Marilyn Walker, Rob Abbott, Jean E. Fox Tree, Robeson Bowmani, and Michael Minor. 2011. 'Cats rule and dogs drool!: classifying stance in online debate,' 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA '11). Association for Computational Linguistics, Stroudsburg, PA, USA, 1-9. [5] Walker, Marilyn A., Anand, Pranav, Abbott, Rob and Grant, Ricky. 'Stance Classification using Dialogic Properties of Persuasion,' HLT-NAACL, 2012. [6] Sridhar, Dhanya and Getoor, Lise and Walker, Marilyn. 'Collective Stance Classification of Posts in Online Debate Forums,' ACL Joint Workshop on Social Dynamics and Personal Attributes in Social Media, 2014. [7] Kazi Saidul Hasan and Vincent Ng. 'Extra-Linguistic Constraints on Stance Recognition in Ideological Debates,' the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 816-821, 2013. [8] Hasan, Kazi Saidul and Ng, Vincent. 'Frame Semantics for Stance Classification,' CoNLL, 2013. [9] Thomas, Matt, Pang, Bo and Lee, Lillian. 'Get out the vote: Determining support or opposition from Congressional floor-debate transcripts,' EMNLP, 2006. [10] A. Yessenalina, Y. Yue, and C. Cardie. 'Multi-level Structured Models for Document-level Sentiment Classification,' EMNLP, 2010. [11] Balahur, Alexandra, Kozareva, Zornitsa and Montoyo, Andres. 'Determining the Polarity and Source of Opinions Expressed in Political Debates,' CICLing, 2009. [12] Murakami, Akiko and Raymond, Rudy. 'Support or Oppose? Classifying Positions in Online Debates from Reply Activities and Opinion Expressions,' COLING (Posters), 2010. [13] Amita Misra, Marilyn A. Walker. 'Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue,' SIGDIAL, 2013. [14] Qiu, Minghui, Yang, Liu and Jiang, Jing. 'Modeling interaction features for debate side clustering,' CIKM, 2013. [15] Krippendorff, K. (2011). 'Computing Krippendorff's Alpha-Reliability,' Retrieved from http://repository.upenn.edu/asc_papers/43 [16] Pi-Chuan Chang, Huihsin Tseng, Dan Jurafsky, and Christopher D. Manning. 2009. 'Discriminative Reordering with Chinese Grammatical Relations Features,' Third Workshop on Syntax and Structure in Statistical Translation. [17] Ku, Lun-Wei, Lee, Li-Ying and Chen, Hsin-Hsi. 'Opinion extraction, summarization and tracking in news and blog corpora,' AAAI-2006 Spring Symposium on Computational Approaches to Analyzing Weblogs, 2006. [18] Ku, Lun-Wei, Wu, Tung-Ho, Lee, Li-Ying and Chen, Hsin-Hsi. 'Construction of an Evaluation Corpus for Opinion Extraction,' NTCIR-5 Workshop Meeting, Tokyo, Japan, 2005. [19] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 'Efficient Estimation of Word Representations in Vector Space,' In Proceedings of Workshop at ICLR, 2013. [20] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 'Distributed Representations of Words and Phrases and their Compositionality,' In Proceedings of NIPS, 2013. [21] Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 'Linguistic Regularities in Continuous Space Word Representations,' In Proceedings of NAACL HLT, 2013.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3984	-
dc.description.abstract	為了紓解媒體偏頗以及閱聽者選擇性偏好的現象，本篇論文專注於發展一智慧程式，用以分辨中文爭議性議題新聞之立場。我們提出一個簡單且有效率的方法，能夠考量無標記新聞資料庫的資訊、以及訓練資料之資訊，以合併相似的特徵。在我們提出的方法中，特徵會先根據初始訓練過程被分為兩邊，接著使用word2vec工具為每一個特徵產生輔助向量，最後使用高速的社群偵測演算法將意義上相近的特徵合併。實驗結果顯示，在大多數的情況下，我們提出的解決方案比直接使用原始特徵、以及使用常見的降維演算法還要好。	zh_TW
dc.description.abstract	To mitigate media bias and readers' selective preference, we aim to develop an intelligent system that classifies the stance of Chinese news articles on several controversial topics. We propose a simple and efficient approach that uses information from both an unlabeled news corpus and the training data to merge similar features. In our approach, features are first divided into two sides according to an initial training process, and the word2vec tool is then used to produce an auxiliary vector for each feature. Finally, a fast community detection algorithm is applied to cluster and merge semantically similar features. Experimental results show that, in most cases, our approach outperforms both the raw features and common dimensionality reduction techniques.	en
dc.description.provenance	Made available in DSpace on 2021-05-13T08:39:44Z (GMT). No. of bitstreams: 1 ntu-105-R02922010-1.pdf: 1043283 bytes, checksum: db6f57b540faf34c37a8ea77849160fb (MD5) Previous issue date: 2016	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; ABSTRACT; CONTENTS; LIST OF FIGURES; LIST OF TABLES; Chapter 1 Introduction; 1.1 Motivation and Overview; 1.2 Problem Formulation; Chapter 2 Related Work; Chapter 3 Dataset; 3.1 Data Collection; 3.2 Data Annotation; 3.3 Data Observation; Chapter 4 Methodology; 4.1 Feature Extraction; 4.1.1 Word-based Feature; 4.1.2 Dependency-based Feature; 4.2 Dimensionality Reduction Techniques; 4.2.1 Feature selection based approach; 4.2.2 Low rank approximation; 4.3 Our Proposed Approach; Step 1. Divide features into two sides; Step 2. Generate auxiliary vector for each feature; Step 3. Building feature graph by calculating similarity between features; Step 4. Clustering and merging features by community detection algorithm; Chapter 5 Experiments; 5.1 Experimental Settings; 5.2 Experimental Results; 5.2.1 Performance of type of feature and merged feature; 5.2.2 Performance of direct feature clustering and our proposed approach; 5.2.3 Performance of baseline approaches and our proposed approach; 5.3 Result Analysis; 5.3.1 Sensitivity of the threshold while building feature-to-feature graph; Chapter 6 Conclusion and Future Work; REFERENCE
dc.language.iso	en
dc.subject	自然語言處理	zh_TW
dc.subject	立場偵測	zh_TW
dc.subject	機器學習	zh_TW
dc.subject	中文新聞立場偵測	zh_TW
dc.subject	特徵合併	zh_TW
dc.subject	natural language processing	en
dc.subject	machine learning	en
dc.subject	feature clustering	en
dc.subject	stance classification on Chinese newspaper	en
dc.subject	stance classification	en
dc.title	使用輔助向量的雙邊特徵分群以改善中文新聞的立場偵測分類	zh_TW
dc.title	Two-side Feature Clustering Using Auxiliary Vector for Improving Stance Classification on Chinese Newspaper	en
dc.type	Thesis
dc.date.schoolyear	104-1
dc.description.degree	Master's
dc.contributor.oralexamcommittee	古倫維(Lun-Wei Ku),李政德(Cheng-Te Li)
dc.subject.keyword	立場偵測,中文新聞立場偵測,特徵合併,自然語言處理,機器學習	zh_TW
dc.subject.keyword	stance classification,stance classification on Chinese newspaper,feature clustering,natural language processing,machine learning	en
dc.relation.page	32
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2016-02-06
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
Appears in Collections:	資訊工程學系
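
The four steps named in the abstract and table of contents (divide features into two sides, generate an auxiliary vector for each feature, build a feature graph from similarities, merge features by community detection) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the sign of a hypothetical initial linear-classifier weight stands in for the "initial training process", small hand-written vectors stand in for word2vec output, cosine similarity with a hypothetical threshold of 0.8 builds the graph, and plain connected components replace the fast community detection algorithm.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def split_by_side(weights):
    """Step 1 (assumed form): split features by the sign of an initial weight."""
    positive = [f for f, w in weights.items() if w >= 0]
    negative = [f for f, w in weights.items() if w < 0]
    return positive, negative


def build_graph(features, vectors, threshold):
    """Step 3: link two same-side features whose similarity reaches the threshold."""
    edges = defaultdict(set)
    for i, f in enumerate(features):
        edges[f]  # touch the key so isolated features still appear as nodes
        for g in features[i + 1:]:
            if cosine(vectors[f], vectors[g]) >= threshold:
                edges[f].add(g)
                edges[g].add(f)
    return dict(edges)


def connected_components(edges):
    """Step 4 stand-in: connected components instead of community detection."""
    seen, groups = set(), []
    for start in edges:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in edges[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


def two_side_clusters(weights, vectors, threshold=0.8):
    """Cluster features separately within each side and return all groups."""
    clusters = []
    for side in split_by_side(weights):
        clusters.extend(connected_components(build_graph(side, vectors, threshold)))
    return clusters


# Toy input. Step 2 is assumed done: these vectors stand in for word2vec output.
toy_weights = {"support": 1.2, "approve": 0.7, "oppose": -0.9,
               "protest": -0.4, "tax": 0.1}
toy_vectors = {"support": [1.0, 0.1], "approve": [0.9, 0.2],
               "oppose": [0.95, 0.15], "protest": [0.1, 1.0], "tax": [0.0, 1.0]}

if __name__ == "__main__":
    print(two_side_clusters(toy_weights, toy_vectors))
```

In this toy input, `oppose` lies close to `support` in vector space but carries an opposite-signed weight, so the two are never merged; only `support` and `approve` end up in the same cluster. Restricting merges to features on the same side is the point of the two-side split.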

Files in This Item:

File	Size	Format
ntu-105-1.pdf	1.02 MB	Adobe PDF


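Unless their copyright terms are specifically indicated otherwise, all items in this system are protected by copyright, with all rights reserved.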