Learning Unsupervised Semantic Document Representation for Fine-grained Aspect-based Sentiment Analysis

Hao-Ming Fu; 傅浩明

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/691

Title:	Learning Unsupervised Semantic Document Representation for Fine-grained Aspect-based Sentiment Analysis (以非監督學習之語義文件向量進行細緻多層面情緒分析)
Authors:	Hao-Ming Fu 傅浩明
Advisor:	鄭卜壬(Pu-Jen Cheng)
Keyword:	Document representation, Sentence embedding, Unsupervised learning, Sentiment analysis, Semantic learning, Text classification
Publication Year:	2019
Degree:	Master's
Abstract:	Document representation is at the core of many NLP tasks involving machine understanding. A general representation learned in an unsupervised manner preserves generality and can be used for various applications. In practice, sentiment analysis (SA) is a challenging task that is regarded as deeply semantic and is often used to assess general representations. Existing methods for unsupervised document representation learning fall into two families: sequential ones, which explicitly take word order into account, and non-sequential ones, which do not. However, each family suffers from its own weaknesses. In this paper, we propose a model that overcomes the difficulties encountered by both families. Experiments show that our model outperforms state-of-the-art methods by a large margin on popular SA datasets and on a fine-grained aspect-based SA task.
URI:	http://tdr.lib.ntu.edu.tw/handle/123456789/691
DOI:	10.6342/NTU201902410
Fulltext Rights:	Authorization granted (open access worldwide)
Appears in Collections:	Department of Computer Science and Information Engineering

Files in This Item:

File	Size	Format
ntu-108-1.pdf	1.13 MB	Adobe PDF
