  1. NTU Theses and Dissertations Repository
  2. College of Management
  3. Department of Information Management
Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18327
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 曹承礎 |
dc.contributor.author | WEN-TAI HSIEH | en
dc.contributor.author | 謝文泰 | zh_TW
dc.date.accessioned | 2021-06-08T00:59:56Z | -
dc.date.copyright | 2015-03-13 |
dc.date.issued | 2014 |
dc.date.submitted | 2015-01-14 |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18327 | -
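dc.description.abstract | With the rise of social applications, crowd-created information has surged, directly affecting how all users receive information and make decisions. To make good use of this online textual information and analyze it, we need a concept space that keeps growing as application domains change and new vocabulary appears, serving as the basic knowledge base for semantic computing. Automatic ontology construction is an important research topic related to this study, and prior work follows three directions: automatic construction from unstructured text, enrichment of existing ontologies, and construction from semi-structured corpora. We developed an automatic concept space construction framework that can use social collaborative websites as semi-structured corpora; it comprises named entity detection, filtering, disambiguation, named entity expansion, and ranking. We describe the research steps and the evaluation methods in detail, and finally discuss the evolution and quality of automatic ontology construction. The proposed framework demonstrates that automatically generating ontologies under real-world, real-time data flows is feasible, and that it offers broader data coverage than prior studies, whether across languages, across domains, or in processing real-time data. Unlike past approaches, in which a single method served a single application, the framework adopted in this thesis can support more practical applications. | zh_TW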
dc.description.abstract | With the prevalence of Social Networking Services (SNS), real-world consumption behaviors are increasingly influenced by social networks. To utilize the information from social networks, we need a concept space that adapts to the application domain, absorbs new vocabulary, and grows continuously; automatic ontology construction has therefore become an important issue. Previous studies address ontology construction from free-format text, enrichment of given ontologies from web or corpus sources, and construction of ontologies from semi-structural corpora; among these, semi-structural corpora have been the prevailing direction. In this thesis, we develop an adaptive framework across corpora built on social collaborative editing, focusing on semi-structural text mining in particular. The framework involves detection of named entities in a document, filtering of named entities, disambiguation, named entity expansion, and ranking of the related named entities. We describe the framework in detail, propose a method for each stage, and present the metrics used in previous studies together with the one we use for evaluation. We then discuss the evolution and quality of the concept space. Our proposed framework makes it computationally feasible to process real-world corpora and generates a dynamic concept space. It can handle more diverse domains and languages, and for pragmatic real-world applications our method shows better flexibility than previous studies. | en
dc.description.provenance | Made available in DSpace on 2021-06-08T00:59:56Z (GMT). No. of bitstreams: 1. ntu-103-D95725001-1.pdf: 4167346 bytes, checksum: 41ba1bf92b2d52b79cebb3e2ecf6ad9c (MD5). Previous issue date: 2014 | en
dc.description.tableofcontents |
Oral Examination Committee Certification
Chinese Abstract
THESIS ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
1.1 Social Networks and Marketing
1.2 Knowledge Work Task
1.2.1 Analyze/Organize
1.2.2 Find
1.2.3 Physical features
1.3 Concept Space
1.4 Research Goal
Chapter 2 Related Work
2.1 Link Grammar, WordNet and Ontology
2.1.1 Link Grammar
2.1.2 WordNet
2.1.3 Ontology
2.2 Automatic Construction of Ontology with free format texts
2.3 Enriching ontology with information extraction from Web/Corpus
2.4 Building ontology with semi-structural corpora
Chapter 3 Constructing Concept Space
3.1 Problem Statement
3.2 Anchor Detection
3.3 Anchor Filtering and Disambiguation
3.4 Expansion and Ranking
3.4.1 Training Stage
3.4.2 Run-Time Stage
Chapter 4 Evaluation
4.1 Evaluation on Concept Space
4.2 Preliminary Evaluation
4.2.1 Experimental Setting
4.2.2 Experimental Setting
4.2.3 Preliminary Results
4.3 Evaluation for Expansion and Ranking
4.3.1 Experimental Setting
4.3.2 Evaluation
Chapter 5 Discussion
5.1 Evolution of Corpus: Wikipedia vs. Concept Space
5.2 Quality of Concept Space
5.2.1 Frequently Edited Entry Articles
5.2.2 Non-frequently Edited Entry Articles
Chapter 6 Conclusion
6.1 Adaptive Framework for Constructing Concept Space across Different Corpora
6.2 The Evolution of the Corpora and the Quality of the Concept Space
6.3 Contribution and Future Work
6.3.1 The Corpus Type Diversity
6.3.2 Automatic Entry Point Selection
6.3.3 The NER Precision
REFERENCES
LIST OF FIGURES
Fig. 1.1 Information analysis and retrieval framework
Fig. 2.1 Words and connectors in a dictionary
Fig. 2.2 All linking requirements are satisfied
Fig. 2.3 A simplified form of Figure. 2
Fig. 2.4 The link grammar output Linkage example 1
Fig. 2.5 The link grammar output Linkage example 2
Fig. 2.6 The Link Grammar output sample through Phrase Parser component -1
Fig. 2.7 The Link Grammar output sample through Phrase Parser component -2
Fig. 2.8 An example of Ontology using KAON component -2
Fig. 2.9 Flowchart of automatically constructing ontology via free format text mining
Fig. 2.10 Flowchart of building concept space from Wikipedia
Fig. 3.1 CRF model training
Fig. 3.2 Extracting keywords at run-time
Fig. 4.1 Average adopted tags of training data
Fig. 4.2 Average recommended tags of training data
Fig. 4.3 Tag adoption rate of two users
Fig. 4.4 Average adoption rate of documents with same frequency tag
Fig. 4.5 Average adoption rate of documents with same frequency tag pair
Fig. 4.6 Average adoption rate of collection of documents with same frequency tag pair after removing low inter-similarity collections
Fig. 4.7 Relevant search experimental procedures
Fig. 4.8 Average relevant search result of top 10 tags
Fig. 4.9 Precision, recall of relevant search, keyword search for top 10
Fig. 4.10 F-measure of relevant search and keyword search for top 10
Fig. 4.11 The example of matrix map in patent analysis
Fig. 4.12 Steps in patent analysis scenario
Fig. 4.13 Tagging interface of DCT system
Fig. 4.14 The attributes merging concept
Fig. 4.15 Comparisons of Interpolated Precision-Recall (C2E GT F2F)
Fig. 4.16 Comparisons of Interpolated Precision-Recall (C2E MA A2F)
Fig. 5.1 Screenshot of the beginning edition of the term “太陽花學運” in Wikipedia
Fig. 5.2 Screenshot of the latest edition of the term “太陽花學運” in Wikipedia
Fig. 5.3 Summary of the edition history of the term “太陽花學運” in Wikipedia
Fig. 5.4 Concept space of the term “太陽花學運” in Wikipedia, March 18, 2014
Fig. 5.5 Concept space of the term “太陽花學運” in Wikipedia, April 1, 2014
Fig. 5.6 Concept space of the term “太陽花學運” in Wikipedia, April 10, 2014
Fig. 5.7 Search results of three student leaders during the movement and comparison with concept space relatedness
Fig. 5.8 Search results of symbolic concepts during the movement and comparison with concept space relatedness
Fig. 5.9 Watch list of edited entry articles on Wikipedia
Fig. 5.10 Editing changes and statistics of “Samsung Galaxy S5”
Fig. 5.11 Editing changes and statistics of “iPhone 5S”
Fig. 5.12 Editing changes and statistics of “HTC One M8”
Fig. 5.13 Editing changes of “HTC One M8” after five days
Fig. 5.14 Editing changes of “聖元國際” after four days
LIST OF TABLES
Table 2.1 The words and linking requirements in a dictionary
Table 2.2 WordNet Synset sample of “Add”
Table 4.1 Participants in NTCIR-10 CLLD
Table 4.2 Use frequency of other tags except shared tag pair
Table 4.3 Performance of relevant search and keyword search
Table 4.4 Performance of relevant search and keyword search
Table 4.5 Participants in NTCIR-10 CLLD
Table 4.6 CJK2E F2F evaluation with Wikipedia ground-truth: LMAP, R-PREC
Table 4.7 CJK2E F2F evaluation with manual assessment results: LMAP, R-PREC
Table 4.8 F2F evaluation with Wikipedia ground-truth: Precision-at-N (Chinese-to-English)
Table 4.9 System performance at (a) N=5 (b) N=7 (c) N=10
dc.language.iso | en |
dc.title | 以協作資料為基礎的概念空間自動建構研究 | zh_TW
dc.title | Constructing Concept Space from Social Collaborative Editing | en
dc.type | Thesis |
dc.date.schoolyear | 103-1 |
dc.description.degree | Doctoral |
dc.contributor.oralexamcommittee | 苑守慈, 蔡益坤, 林俊叡, 陳鴻基 |
dc.subject.keyword | Concept Space, Ontology, Social Collaboration | zh_TW
dc.subject.keyword | Concept Space, Automatic Ontology Construction, Social Collaborative Editing | en
dc.relation.page | 75 |
dc.rights.note | Not authorized |
dc.date.accepted | 2015-01-15 |
dc.contributor.author-college | College of Management | zh_TW
dc.contributor.author-dept | Graduate Institute of Information Management | zh_TW
Appears in collections: Department of Information Management

Files in this item:
File | Size | Format
ntu-103-1.pdf (not authorized for public access) | 4.07 MB | Adobe PDF
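Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.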
