Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59974
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 林守德(Shou-De Lin) | |
dc.contributor.author | Yu Ran | en |
dc.contributor.author | 冉昱 | zh_TW |
dc.date.accessioned | 2021-06-16T09:48:15Z | - |
dc.date.available | 2017-02-16 | |
dc.date.copyright | 2017-02-16 | |
dc.date.issued | 2017 | |
dc.date.submitted | 2017-01-23 | |
dc.identifier.citation | [1] Pang B, Lee L. Opinion mining and sentiment analysis[J]. Foundations and Trends® in Information Retrieval, 2008, 2(1–2): 1-135. [2] Liu B. Sentiment analysis and opinion mining[J]. Synthesis lectures on human language technologies, 2012, 5(1): 1-167. [3] Mullen T, Collier N. Sentiment Analysis using Support Vector Machines with Diverse Information Sources[C]//EMNLP. 2004, 4: 412-418. [4] Bennett K, Demiriz A. Semi-supervised support vector machines[C]//NIPS. 1998, 11: 368-374. [5] Zhu X. Semi-supervised learning literature survey[J]. 2005. [6] Zhu X, Goldberg A B. Introduction to semi-supervised learning[J]. Synthesis lectures on artificial intelligence and machine learning, 2009, 3(1): 1-130. [7] Semi-Supervised Learning with Ladder Networks. Advances in Neural Information Processing Systems. 2015. [8] Pezeshki, Mohammad, et al. “Deconstructing the Ladder Network Architecture.” arXiv preprint arXiv:1511.06430 (2015). [9] Rasmus A, Valpola H, Raiko T. Lateral connections in denoising autoencoders support supervised learning[J]. arXiv preprint arXiv:1504.08215, 2015. [10] Valpola H. From neural PCA to deep unsupervised learning[J]. Advances in Independent Component Analysis and Learning Machines, 2015: 143-171. [11] Mikolov T, Chen K, Corrado G, et al. Efficient estimation of word representations in vector space[J]. arXiv preprint arXiv:1301.3781, 2013. [12] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality[C]//Advances in neural information processing systems. 2013: 3111-3119. [13] Le Q V, Mikolov T. Distributed Representations of Sentences and Documents[C]//ICML. 2014, 14: 1188-1196. [14] Chen Wei-Ming. Two-side Feature Clustering Using Auxiliary Vector for Improving Stance Classification on Chinese Newspaper. NTU 2016 thesis paper. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59974 | - |
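dc.description.abstract | Building on a previously collected corpus of controversial news, this thesis aims to develop an intelligent program that identifies the stance of Chinese news articles on controversial topics. The main difficulty is that labeled data are scarce, so a model cannot learn enough knowledge. To address this, our senior labmate Wei-Ming mainly partitioned and clustered the features to reduce feature dimensionality and thereby improve accuracy, relying mainly on supervised methods. This thesis instead starts from two angles, making full use of unlabeled information and representing features with deep learning, with the goal of outperforming Wei-Ming's method. We first use document vectors as article features and compare them with ordinary word features and dependency features; we then apply semi-supervised learning, mainly a self-learning model and the ladder network. Our self-learning model on topic 2, and the ladder network on the other three topics, outperform Wei-Ming's method. | zh_TW |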
dc.description.abstract | We aim to develop an intelligent program that classifies the stance of Chinese news articles on several controversial topics, based on previously crawled data. The difficulty of this problem is that labeled news is insufficient, so the model cannot learn enough knowledge. Wei-Ming mainly focused on feature division and feature clustering to reduce the feature dimension and obtain higher accuracy with a supervised method. We aim to make full use of unlabeled data and to use deep learning representation vectors as features, in order to obtain results beyond Wei-Ming’s method. We first use paragraph vectors as news features and compare them with word features and dependency features; we then apply semi-supervised methods, namely self-learning and the ladder network, with paragraph vector features. We obtain better results than Wei-Ming’s method with self-learning on topic 2 and with the ladder network on the other three topics. | en |
dc.description.provenance | Made available in DSpace on 2021-06-16T09:48:15Z (GMT). No. of bitstreams: 1 ntu-106-R03922142-1.pdf: 3718461 bytes, checksum: 48c4c3f60ef9097d8596059744e45723 (MD5) Previous issue date: 2017 | en |
dc.description.tableofcontents | Oral Examination Committee Certification (#); Acknowledgements (i); Chinese Abstract (ii); ABSTRACT (iii); CONTENTS (iv); Chapter 1 Introduction (1); Chapter 2 Related Work (3); 2.1 Traditional Sentiment Analysis (3); 2.2 Senior’s Method (3); 2.3 Semi-supervised Method (5); Chapter 3 Data Preprocessing and Feature Extraction (6); 3.1 Data Cleaning and Processing (6); 3.2 Feature Extraction (6); Chapter 4 Methodology (10); 4.1 Supervised Method (10); 4.2 Semi-supervised Method (10); 4.2.1 Self-learning (10); 4.2.2 Denoising Autoencoder (11); 4.2.3 Ladder Network (12); Chapter 5 Experiments (17); 5.1 Dataset (17); 5.1.1 Chinese Newspaper Dataset (17); 5.2 Evaluation Methods (17); 5.3 Comparing Paragraph Vector Feature and Word-based Feature and Dependency Feature (18); 5.4 Comparing State-of-the-art Semi-supervised Method (19); Chapter 6 Conclusion and Future Work (21); REFERENCE (22) | |
dc.language.iso | en | |
dc.title | 半監督學習模型以改善標籤數稀少的中文新聞的立場偵測分類 | zh_TW |
dc.title | Semi-supervised method for Improving Stance Classification on Insufficient Labeled Chinese Newspaper | en |
dc.type | Thesis | |
dc.date.schoolyear | 105-1 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 陳信希(Hsin-Hsi Chen),鄭卜壬(Pu-Jen Cheng) | |
dc.subject.keyword | 立場偵測 (stance detection), 半監督學習 (semi-supervised learning), 梯子網絡 (ladder network), 深度學習 (deep learning) | zh_TW |
dc.subject.keyword | Stance Classification, Semi-supervised Learning, Ladder Network, Deep Learning | en |
dc.relation.page | 23 | |
dc.identifier.doi | 10.6342/NTU201700163 | |
dc.rights.note | Paid authorization (有償授權) | |
dc.date.accepted | 2017-01-23 | |
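dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW |
dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) | zh_TW |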
Appears in Collections: | Department of Computer Science and Information Engineering (資訊工程學系) |
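
The abstract names self-learning as one of the two semi-supervised methods but this record does not describe the procedure. The following is a minimal sketch of a generic self-learning (self-training) loop, not the thesis's implementation. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the softmax-style confidence score, the `threshold` and `max_rounds` parameters, the `for`/`against` labels, and the two-dimensional toy vectors standing in for paragraph vectors.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def fit_centroids(X, y):
    """Toy classifier (an assumption): one mean vector per label."""
    by_label = {}
    for x, label in zip(X, y):
        by_label.setdefault(label, []).append(x)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence), with confidence from a softmax over negative distances."""
    weights = {label: math.exp(-math.dist(x, c)) for label, c in centroids.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Repeatedly move confidently predicted unlabeled points into the labeled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        model = fit_centroids(X_lab, y_lab)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to pseudo-label, so stop early
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pool = [x for x, _ in remaining]
    # Refit on the enlarged labeled set; also return the points left unlabeled.
    return fit_centroids(X_lab, y_lab), pool


# Hypothetical toy data: two labeled articles and three unlabeled ones.
labeled_X = [[0.0, 0.0], [4.0, 4.0]]
labeled_y = ["against", "for"]
unlabeled_X = [[0.5, 0.2], [3.8, 4.1], [2.0, 2.0]]

model, leftover = self_train(labeled_X, labeled_y, unlabeled_X)
print(leftover)
```

In this toy run the two unlabeled points that lie close to a labeled example are pseudo-labeled, while the point midway between the classes never reaches the confidence threshold and stays in the unlabeled pool. The threshold is the main design choice: a higher value adds fewer but cleaner pseudo-labels.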
Files in This Item:
File | Size | Format | |
---|---|---|---|
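ntu-106-1.pdf (not currently authorized for public access) | 3.63 MB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.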