Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/57846
Title: | 中文雙音節複合詞組成詞素之語意分群 Semantic Clustering for the Constituent Morphemes of Chinese Disyllabic Compounds |
Authors: | Chia-Ling Lee 李嘉玲 |
Advisor: | 許永真(Jane Yung-jen Hsu) |
Keyword: | morphological awareness, semantic clustering, natural language processing, computational linguistics |
Publication Year: | 2014 |
Degree: | Master's |
Abstract: | A morpheme is the basic building block of a word and the smallest meaningful unit of a language. Morphological awareness, the ability to perceive and manipulate the internal structure of words, is thought by many linguists to be strongly related to reading comprehension. Compounds account for 76% of Chinese words, and a single Chinese character may carry different meanings in different words. Given a family of morphologically related Chinese words that share a target character, this work aims to cluster the words, using computational linguistics techniques, according to the meaning the target character takes in each word. For example, '商店 (store)', '商品 (commodity)', '商代 (Shang period)', and '商朝 (Shang Dynasty)' form two clusters according to the meaning of the character '商/shang1/': {'商店', '商品'} and {'商代', '商朝'}. In the former cluster '商' relates to commerce, whereas in the latter it denotes a Chinese dynasty. Our method integrates contextual, semantic, syntactic, lexical, and statistical factors. For comparison, we recruited several graduate students and elementary school students to perform the same clustering task. Experimental results indicate that the proposed ensemble method performs at a level comparable to that of the elementary school students. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/57846 |
Fulltext Rights: | Paid license |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-103-1.pdf (Restricted Access) | 2.54 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.