Some Approaches of Combining Word Embedding and Lexical Resource for Semantic Relatedness Measurement

Hao Ke; 葛浩

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50159

Title:	Some Approaches of Combining Word Embedding and Lexical Resource for Semantic Relatedness Measurement (數種結合詞向量與字典資源之方法用於字義相似度測量)
Author:	Hao Ke (葛浩)
Advisor:	Hsin-Hsi Chen (陳信希)
Keywords:	semantic relatedness, word embedding, WordNet, GloVe, Word2Vec
Publication Year:	2016
Degree:	Master's
Abstract:	In this thesis, we propose three approaches to measuring semantic relatedness: (1) boosting the performance of GloVe word embeddings by removing or transforming abnormal dimensions; (2) linearly combining the path (distance) information extracted from WordNet with the word embeddings; (3) using word embeddings and twelve kinds of linguistic information extracted from WordNet as features for supervised learning with support vector regression (SVR). We conduct experiments on six benchmark data sets. The evaluation computes the Pearson and Spearman correlations between the output of our methods and the ground truth. We report our results together with those of three recently proposed state-of-the-art approaches to measuring semantic relatedness. The experimental results show that our methods outperform these three approaches on most of the benchmark data sets.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50159
DOI:	10.6342/NTU201601910
Full-Text License:	Paid license
Appears in Collections:	Department of Computer Science and Information Engineering
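
The second approach and the evaluation protocol described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the use of cosine similarity, the mixing weight `alpha`, and all function names are assumptions made for illustration, and the WordNet path score is passed in as a plain number rather than computed from WordNet.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combine(embedding_sim, wordnet_sim, alpha=0.5):
    """Linear combination of two scores; alpha is a hypothetical weight."""
    return alpha * embedding_sim + (1.0 - alpha) * wordnet_sim


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up toy data (not from the thesis): each entry holds two word
# vectors, a WordNet-style path score in [0, 1], and a human rating.
toy_pairs = [
    ([1.0, 0.2, 0.0], [0.9, 0.3, 0.1], 0.50, 9.0),
    ([1.0, 0.0, 0.0], [0.2, 1.0, 0.1], 0.20, 4.0),
    ([0.0, 1.0, 0.0], [0.0, 0.1, 1.0], 0.05, 1.5),
]
predicted = [combine(cosine(u, v), wn, alpha=0.7) for u, v, wn, _ in toy_pairs]
gold = [rating for _, _, _, rating in toy_pairs]
print("Pearson:", pearson(predicted, gold))
print("Spearman:", spearman(predicted, gold))
```

In practice the weight would be tuned on held-out data, and the path score would come from a WordNet interface rather than hand-entered values.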

Files in This Item:

File	Size	Format
ntu-105-1.pdf (not currently authorized for public access)	1.77 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.