Exploiting Content and Social Information for Ontology-based Hierarchical Classification (利用內文及社群資訊進行關鍵字之階層分類)

Chien-Pang Liu; 劉建邦

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41603

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	林守德(Shou-De Lin)
dc.contributor.author	Chien-Pang Liu	en
dc.contributor.author	劉建邦	zh_TW
dc.date.accessioned	2021-06-15T00:24:29Z	-
dc.date.available	2009-02-03
dc.date.copyright	2009-02-03
dc.date.issued	2009
dc.date.submitted	2009-01-23
dc.identifier.citation	1. Heflin, J. and J. Hendler, A Portrait of the Semantic Web in Action. IEEE Intelligent Systems, 2001. 16(2): p. 54-59.
2. Berners-Lee, T., J. Hendler, and O. Lassila, The Semantic Web, in Scientific American. 2001.
3. Sanderson, M. and B. Croft, Deriving concept hierarchies from text, in Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. 1999, ACM: Berkeley, California, United States.
4. ACM. The 1998 ACM Computing Classification System. 1998; Available from: http://www.acm.org/about/class/1998/.
5. Dumais, S. and H. Chen, Hierarchical classification of Web content, in Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval. 2000, ACM: Athens, Greece.
6. Wibowo, W. and H.E. Williams, Simple and accurate feature selection for hierarchical categorisation, in Proceedings of the 2002 ACM symposium on Document engineering. 2002, ACM: McLean, Virginia, USA.
7. Doan, A., P. Domingos, and A.Y. Halevy, Reconciling schemas of disparate data sources: a machine-learning approach. SIGMOD Rec., 2001. 30(2): p. 509-520.
8. Do, H. and E. Rahm, COMA - A System for Flexible Combination of Schema Matching Approaches. 2002.
9. Ehrig, M. and Y. Sure, Ontology Mapping - An Integrated Approach, in The Semantic Web: Research and Applications. 2004. p. 76-91.
10. Church, K.W. and P. Hanks, Word association norms, mutual information, and lexicography, in Proceedings of the 27th annual meeting on Association for Computational Linguistics. 1989, Association for Computational Linguistics: Vancouver, British Columbia, Canada.
11. White, S. and P. Smyth, Algorithms for estimating relative importance in networks, in Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. 2003, ACM: Washington, D.C.
12. Russell, S.J. and P. Norvig, Artificial Intelligence: A Modern Approach. 2003: Pearson Education. 1132.
13. Cimiano, P., et al. Gimme' the context: context-driven automatic semantic annotation with C-PANKOW. in WWW '05: Proceedings of the 14th international conference on World Wide Web. 2005: ACM Press.
14. McDowell, L.K. and M. Cafarella, Ontology-driven, unsupervised instance population. Web Semant., 2008. 6(3): p. 218-236.
15. National Central Library, R.O.C. Electronic theses and dissertations system. Available from: http://etds.ncl.edu.tw.
16. Coulter, N., ACM'S computing classification system reflects changing times. Commun. ACM, 1997. 40(12): p. 111-112.
17. Sun Microsystems, I. Java SE 6. Available from: http://java.sun.com/javase/6/.
18. LingPipe. LingPipe. Available from: http://alias-i.com/lingpipe/.
19. Eclipse. The Eclipse Project. 2008; Available from: http://www.eclipse.org/.
20. JUNG. Java Universal Network/Graph Framework. 2008; Available from: http://jung.sourceforge.net/.
21. Sanchez, D. and A. Moreno, Learning non-taxonomic relationships from web documents for domain ontology construction. Data Knowl. Eng., 2008. 64(3): p. 600-623.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41603	-
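dc.description.abstract	The purpose of this thesis is to build an unsupervised hierarchical classification system that assigns a given proper noun to an appropriate category in the classification structure of an ontology. Unlike other approaches, our method uses a term-association model built by combining content and social information, which achieves stronger classification performance than an association model built from a single source of information. In addition, we exploit the hierarchical structure of the ontology to propose a novel path-based classification method. This study uses three different term-association models to compute the similarity between proper nouns and categories: (1) a content model based on mutual information, (2) a static social model based on shared social attributes, and (3) a dynamic social model based on a social network and the PageRank algorithm. The hierarchical classification algorithms use the ontology structure and the term-association models to predict the category of a proper noun. The experiments are validated on the classification system of the Association for Computing Machinery (ACM); the results show that both the proposed classification algorithm and the term-association model combining content and social information effectively improve the accuracy of proper noun classification.	zh_TW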
dc.description.abstract	The objective of this thesis is to develop an unsupervised hierarchical classification system that classifies a given proper noun into an appropriate category of a designated ontology. Unlike other approaches, our methods exploit both content and social information to show that combining weaker similarity measures can produce a stronger one. To take the hierarchical information into account, we also propose a novel path-based classification strategy. In our work, similarities between proper nouns and categories are captured using three different models: a content-based model using pointwise mutual information, a static social model based on social similarity, and a dynamic social model that applies the PageRank algorithm to a social network. Our hierarchical classification algorithms exploit both the ontology structure and the similarity measures to identify the category of a given proper noun. The experimental results on the ACM Computing Classification System show that our proposed classification algorithm, when used with the combined similarity measure, can significantly improve the effectiveness of proper noun classification.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T00:24:29Z (GMT). No. of bitstreams: 1 ntu-98-R95922137-1.pdf: 957252 bytes, checksum: 4ef02a8672259c6628d999029f9e83b7 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Acknowledgements
Chinese Abstract (摘要)
ABSTRACT
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Motivation
1.2 Contributions
1.3 Thesis Outline
Chapter 2: Related Work
2.1 Hierarchical Classification
2.2 Combination of Similarity Measures
Chapter 3: Methodology
3.1 Problem Definition, Methodology Outline, and an Example
3.2 Similarity Measures
3.2.1 Content-Based Similarity Measure
3.2.2 Social-Based Similarity Measure
3.3 Hierarchical Classification
3.3.1 Baseline Method
3.3.2 Tree-Based Method
3.3.3 Path-Based Method
3.3.4 Smoothing Method of Hierarchical Classification
Chapter 4: Experiments and Results
4.1 Evaluation Method
4.2 Experimental Dataset
4.3 Testing Data and Environment Setup
4.4 Result of Social Score Method
4.5 Result of Different Classification Algorithm
4.6 Result of Different Similarity Measure
4.7 Results of Smoothing Strategy
4.8 Discussion
Chapter 5: Conclusions and Future Work
References
dc.language.iso	en
dc.title	利用內文及社群資訊進行關鍵字之階層分類 (Hierarchical Classification of Keywords Using Content and Social Information)	zh_TW
dc.title	Exploiting Content and Social Information for Ontology-based Hierarchical Classification	en
dc.type	Thesis
dc.date.schoolyear	97-1
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳信希(Hsin-Hsi Chen),許永真(Yung-Jen Hsu),陳倩瑜(Chien-Yu Chen),鄭卜壬(Pu-Jen Cheng)
dc.subject.keyword	ontology,hierarchical classification,similarity measurement,category similarity	zh_TW
dc.subject.keyword	ontology,hierarchical classification,similarity measure,taxonomic similarity	en
dc.relation.page	37
dc.rights.note	Paid license (有償授權)
dc.date.accepted	2009-01-23
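dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW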
Appears in Collections:	Department of Computer Science and Information Engineering
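
The abstract names pointwise mutual information (PMI) as the basis of the content-based similarity model. As a rough illustration only, and not the thesis's own implementation, the sketch below estimates PMI(x, y) = log2(P(x, y) / (P(x) P(y))) from document-level co-occurrence counts. The toy corpus, the function name `pmi_table`, and the use of whole documents as the co-occurrence window are all assumptions introduced here.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Build a pmi(x, y) function from document-level co-occurrence.

    `documents` is a list of token lists. Probabilities are estimated as
    the fraction of documents containing a term (or both terms).
    This is a hypothetical helper, not taken from the thesis.
    """
    n_docs = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        terms = set(doc)  # count each term at most once per document
        term_counts.update(terms)
        pair_counts.update(frozenset(p) for p in combinations(sorted(terms), 2))

    def pmi(x, y):
        joint = pair_counts[frozenset((x, y))]
        if joint == 0:
            # Distinct terms that never co-occur: PMI tends to minus infinity.
            return float("-inf")
        p_xy = joint / n_docs
        p_x = term_counts[x] / n_docs
        p_y = term_counts[y] / n_docs
        return math.log2(p_xy / (p_x * p_y))

    return pmi


# Toy corpus (assumed data, for illustration only).
docs = [
    ["pagerank", "graph", "ranking"],
    ["pagerank", "graph"],
    ["ontology", "taxonomy"],
    ["graph", "ontology"],
]
pmi = pmi_table(docs)
print(pmi("pagerank", "graph"))     # positive: the terms co-occur more than chance
print(pmi("graph", "ontology"))     # negative: they co-occur less than chance
print(pmi("pagerank", "ontology"))  # -inf: they never co-occur
```

In a setting like the one the abstract describes, a score of this kind between a proper noun and a category label could serve as one similarity measure to be combined with the social measures; how the thesis actually combines them is not stated in this record.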

Files in This Item:

File	Size	Format
ntu-98-1.pdf (not currently authorized for public access)	934.82 kB	Adobe PDF

Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.