Word Sense Disambiguation with Chinese Wordnet

Yi-An Wu; 吳怡安

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48937

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	謝舒凱(Shu-Kai Hsieh)
dc.contributor.author	Yi-An Wu	en
dc.contributor.author	吳怡安	zh_TW
dc.date.accessioned	2021-06-15T11:11:54Z	-
dc.date.available	2017-08-25
dc.date.copyright	2016-08-25
dc.date.issued	2016
dc.date.submitted	2016-08-22
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48937	-
dc.description.abstract	本篇論文試圖建立以中文詞彙網路的詞義為基礎的詞義標記系統。中文詞網提供完整的中文詞義區分與詞彙語意關係知識庫，為一重要的詞彙語意表徵。建造詞義標記系統需要詞義消歧(Word Sense Disambiguation)的技術，這是一個在自然語言處理中古老但尚未解決的問題。本篇論文對於這個問題使用的方法是監督式學習(Supervised Learning)和文字嵌入(Neural Word Embedding)，在某些詞彙上進行實驗，訓練的語料來自中研院平衡語料庫和 PTT 語料庫且經過人工標記詞義。LOPE Text Analytics 是這個研究的實際應用，它為研究者提供詞義標記結合其它語言分析模組的系統。	zh_TW
dc.description.abstract	This thesis aims to build a sense tagger for the Chinese Wordnet, an important representation of lexical semantics that provides sense distinctions and lexical semantic relations. Building the sense tagger requires Word Sense Disambiguation (WSD) techniques; WSD is an old but still unsolved problem in Natural Language Processing (NLP). The study applies supervised learning methods and neural word embeddings to selected lexical samples. The training corpora are human-annotated texts from the Sinica Corpus and the PTT Corpus. LOPE Text Analytics, the implementation of the study, provides researchers with several applications, including the sense tagger and other text analytics modules.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T11:11:54Z (GMT). No. of bitstreams: 1 ntu-105-R02142007-1.pdf: 1737083 bytes, checksum: 0a7ecebbb367d3a4be1d0cc6aeb62ea8 (MD5) Previous issue date: 2016	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; Abstract; 1 Introduction; 1.1 Task Description; 1.2 Difficulties; 1.3 Approaches; 1.4 Evaluations; 1.5 Structure of the Thesis; 2 Literature Review; 2.1 Supervised Methods; 2.2 Semi-Supervised Methods; 2.3 Unsupervised Methods; 2.4 Dictionary and Knowledge Based Methods; 2.5 Neural Word Embeddings; 2.6 Chinese WSD; 3 Methodology; 3.1 Word Selection; 3.2 Training Corpora; 3.3 Chinese Wordnet; 3.4 Annotation; 3.5 Algorithms; 3.5.1 Naive Bayes; 3.5.2 k-Nearest Neighbors; 3.5.3 Support Vector Machine; 3.5.4 Word2vec; 3.6 Evaluations; 3.7 Summary; 4 Results; 4.1 Comparing Different Algorithms; 4.2 Learning Curves; 4.3 Evaluations of the Performance; 4.3.1 Baseline; 4.3.2 Ceiling; 4.3.3 Performance of Similar Works; 5 Discussions; 5.1 Metalinguistic Variables; 5.1.1 Frequency; 5.1.2 Number of Senses; 5.1.3 Part of Speech; 5.1.4 Distance between Senses; 5.1.5 Difficulty; 5.2 Linear Regression Model; 5.3 Interpretation of the Model; 5.3.1 Used Sense; 5.3.2 Average Distance; 5.3.3 Difficulty; 5.4 Summary; 6 Applications; 6.1 Sense Mapping; 6.2 LOPE Text Analytics; 7 Conclusion
dc.language.iso	en
dc.subject	詞義消歧	zh_TW
dc.subject	中文詞彙網路	zh_TW
dc.subject	監督式學習	zh_TW
dc.subject	詞義標記	zh_TW
dc.subject	文字嵌入	zh_TW
dc.subject	Chinese Wordnet	en
dc.subject	Sense Tagging	en
dc.subject	Word Sense Disambiguation	en
dc.subject	Supervised Learning	en
dc.subject	Neural Word Embedding	en
dc.title	中文詞彙網路的詞義消歧	zh_TW
dc.title	Word Sense Disambiguation with Chinese Wordnet	en
dc.type	Thesis
dc.date.schoolyear	104-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	馬偉雲,高照明
dc.subject.keyword	中文詞彙網路,詞義消歧,詞義標記,監督式學習,文字嵌入	zh_TW
dc.subject.keyword	Chinese Wordnet,Word Sense Disambiguation,Sense Tagging,Supervised Learning,Neural Word Embedding	en
dc.relation.page	57
dc.identifier.doi	10.6342/NTU201603333
dc.rights.note	Paid authorization
dc.date.accepted	2016-08-22
dc.contributor.author-college	文學院	zh_TW
dc.contributor.author-dept	語言學研究所	zh_TW
Appears in Collections:	Graduate Institute of Linguistics

Files in This Item:

File	Size	Format
ntu-105-1.pdf (not authorized for public access)	1.7 MB	Adobe PDF


Unless a copyright license is specifically indicated, all items in the system are protected by copyright, with all rights reserved.