  1. NTU Theses and Dissertations Repository
  2. College of Liberal Arts
  3. Graduate Institute of Linguistics
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48937
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 謝舒凱 (Shu-Kai Hsieh)
dc.contributor.author | Yi-An Wu | en
dc.contributor.author | 吳怡安 | zh_TW
dc.date.accessioned | 2021-06-15T11:11:54Z | -
dc.date.available | 2017-08-25
dc.date.copyright | 2016-08-25
dc.date.issued | 2016
dc.date.submitted | 2016-08-22
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48937 | -
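dc.description.abstract | This thesis attempts to build a sense-tagging system based on the senses of the Chinese Wordnet. The Chinese Wordnet provides a complete knowledge base of Chinese sense distinctions and lexical semantic relations, and is an important representation of lexical semantics. Building a sense-tagging system requires word sense disambiguation (WSD), an old but still unsolved problem in natural language processing. The approach taken in this thesis is to run experiments on selected words using supervised learning and neural word embeddings. The training data come from the Academia Sinica Balanced Corpus and the PTT corpus and were manually annotated with senses. LOPE Text Analytics is the practical application of this research: it offers researchers a system that combines sense tagging with other linguistic analysis modules. | zh_TW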
dc.description.abstract | This thesis aims to build a sense tagger for the Chinese Wordnet, an important representation of lexical semantics that provides sense distinctions and lexical semantic relations. Building the sense tagger requires word sense disambiguation (WSD), an old but still unsolved problem in natural language processing (NLP). The study applies supervised learning methods and neural word embeddings to selected lexical samples. The training corpora are human-annotated texts from the Sinica Corpus and the PTT Corpus. LOPE Text Analytics, the implementation of the study, provides researchers with several applications, including the sense tagger and other text-analytic modules. | en
dc.description.provenance | Made available in DSpace on 2021-06-15T11:11:54Z (GMT). No. of bitstreams: 1. ntu-105-R02142007-1.pdf: 1737083 bytes, checksum: 0a7ecebbb367d3a4be1d0cc6aeb62ea8 (MD5). Previous issue date: 2016 | en
dc.description.tableofcontents | Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
1.1 Task Description
1.2 Difficulties
1.3 Approaches
1.4 Evaluations
1.5 Structure of the Thesis
2 Literature Review
2.1 Supervised Methods
2.2 Semi-Supervised Methods
2.3 Unsupervised Methods
2.4 Dictionary and Knowledge Based Methods
2.5 Neural Word Embeddings
2.6 Chinese WSD
3 Methodology
3.1 Word Selection
3.2 Training Corpora
3.3 Chinese Wordnet
3.4 Annotation
3.5 Algorithms
3.5.1 Naive Bayes
3.5.2 k-Nearest Neighbors
3.5.3 Support Vector Machine
3.5.4 Word2vec
3.6 Evaluations
3.7 Summary
4 Results
4.1 Comparing Different Algorithms
4.2 Learning Curves
4.3 Evaluations of the Performance
4.3.1 Baseline
4.3.2 Ceiling
4.3.3 Performance of Similar Works
5 Discussions
5.1 Metalinguistic Variables
5.1.1 Frequency
5.1.2 Number of Senses
5.1.3 Part of Speech
5.1.4 Distance between Senses
5.1.5 Difficulty
5.2 Linear Regression Model
5.3 Interpretation of the Model
5.3.1 Used Sense
5.3.2 Average Distance
5.3.3 Difficulty
5.4 Summary
6 Applications
6.1 Sense Mapping
6.2 LOPE Text Analytics
7 Conclusion
dc.language.iso | en
dc.title | 中文詞彙網路的詞義消歧 | zh_TW
dc.title | Word Sense Disambiguation with Chinese Wordnet | en
dc.type | Thesis
dc.date.schoolyear | 104-2
dc.description.degree | Master's
dc.contributor.oralexamcommittee | 馬偉雲, 高照明
dc.subject.keyword | 中文詞彙網路, 詞義消歧, 詞義標記, 監督式學習, 文字嵌入 | zh_TW
dc.subject.keyword | Chinese Wordnet, Word Sense Disambiguation, Sense Tagging, Supervised Learning, Neural Word Embedding | en
dc.relation.page | 57
dc.identifier.doi | 10.6342/NTU201603333
dc.rights.note | Paid license
dc.date.accepted | 2016-08-22
dc.contributor.author-college | College of Liberal Arts | zh_TW
dc.contributor.author-dept | Graduate Institute of Linguistics | zh_TW
dc.date.embargo-terms | 2300-01-01
dc.date.embargo-lift | 2300-01-01 | -
Appears in Collections: Graduate Institute of Linguistics

Files in This Item:
File | Size | Format
ntu-105-1.pdf (Restricted Access) | 1.7 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
