An Efficient Classification Framework for Micro Blog-based Government Services (運用短文件分類技術改良微網誌政府服務之研究)

Chih-Yu Lei; 雷智宇

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6802

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳建錦(Chien-Chin Chen)
dc.contributor.author	Chih-Yu Lei	en
dc.contributor.author	雷智宇	zh_TW
dc.date.accessioned	2021-05-17T09:18:25Z	-
dc.date.available	2013-07-19
dc.date.available	2021-05-17T09:18:25Z	-
dc.date.copyright	2012-07-19
dc.date.issued	2012
dc.date.submitted	2012-07-16
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6802	-
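dc.description.abstract	The rise of Web 2.0 has given governments and citizens a direct channel of communication. Among recent Web 2.0 technologies, micro-blogging services such as Twitter have become one of the most popular applications, and many government services use micro-blogs to collect a wide range of opinions from the public on various issues. In general, citizens are satisfied with being able to voice their opinions through this medium, which in turn raises their trust in government. However, because the number of micro-blog posts has grown exponentially in recent years, text mining techniques are needed to analyze these large volumes of short texts efficiently. In this study, we propose an efficient classification framework for micro-blog-based government services. To address the data sparseness problem of micro-blogs, an external knowledge base and the temporal information carried by the micro-blogs themselves are used to modify the prior and conditional probabilities of the original Naive Bayes classification model. Experiments on the 311NYC dataset show that the proposed framework classifies citizens' opinions about government services accurately and significantly improves on the Naive Bayes model.	zh_TW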
dc.description.abstract	The prevalence of Web 2.0 techniques enables governments and citizens to communicate directly. Among current Web 2.0 applications, micro-blogging services such as Twitter are among the most popular, and many government services now use micro-blogs to collect opinions from the public on a range of issues. Generally, citizens are satisfied with this medium for expressing their opinions; however, because the number of micro-blog posts is increasing exponentially, text mining is needed to analyze the opinions efficiently. In this paper, we propose an efficient classification framework for micro-blog-based government services. To address the text sparseness problem of micro-blogs, an external knowledge base and the temporal information of micro-blogs are used to modify the prior and conditional probabilities of the Naive Bayes classification model. Experiments on the 311NYC dataset show that the proposed framework classifies citizens' opinions about government services accurately and achieves a significant improvement over the Naive Bayes model.	en
dc.description.provenance	Made available in DSpace on 2021-05-17T09:18:25Z (GMT). No. of bitstreams: 1 ntu-101-R99725001-1.pdf: 690155 bytes, checksum: 42cc14dd8cc4b94565b68adf2d25e436 (MD5) Previous issue date: 2012	en
dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); Thesis Abstract; Table of Contents; List of Figures; List of Tables; 1. Introduction; 2. Related Works; 2.1 Government Services and Web2.0 Techniques; 2.2 Short-text Classification; 2.3 Using WordNet to Improve Classification; 3. Methodology; 3.1 The Naive Bayes Classification Model; 3.2 Using an External Knowledge Base; 3.3 Temporal Information about Tweets; 3.4 Smoothing Techniques for Temporal Information; 3.5 User Content Enrichment; 3.6 Using the WordNet Synonym Dictionary; 4. Performance Evaluation; 4.1 Dataset Collection and Evaluation Metrics; 4.2 Performance Evaluations; 5. Conclusion and Future Works; References
dc.language.iso	en
dc.title	運用短文件分類技術改良微網誌政府服務之研究	zh_TW
dc.title	An Efficient Classification Framework for Micro Blog-based Government Services	en
dc.type	Thesis
dc.date.schoolyear	100-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	戴碧如(Bi-Ru Dai),陳孟彰(Meng-Chang Chen),李正帆(Jeng-Farn Lee)
dc.subject.keyword	E-Government,Text Mining,Classification	zh_TW
dc.subject.keyword	E-Government,Government 2.0,Text Mining,Classification	en
dc.relation.page	39
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2012-07-16
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
Appears in Collections:	Department of Information Management
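
The abstract above describes modifying the prior and conditional probabilities of a Naive Bayes classifier using temporal information. This record does not include the thesis's actual formulas, so the following is only a minimal sketch of one way a time-dependent prior could work: a multinomial Naive Bayes model whose class prior is estimated separately for each time bucket. The class name `TemporalNaiveBayes`, the bucket labels, the toy complaint data, and the add-one smoothing are all illustrative assumptions, and the external knowledge base is not modelled here.

```python
import math
from collections import Counter, defaultdict


class TemporalNaiveBayes:
    """Multinomial Naive Bayes with a class prior estimated per time bucket.

    Hypothetical illustration only; not the method defined in the thesis.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # additive smoothing constant (assumed)
        self.word_counts = defaultdict(Counter)     # label -> token -> count
        self.bucket_counts = defaultdict(Counter)   # time bucket -> label -> count
        self.vocab = set()
        self.classes = set()

    def fit(self, docs):
        # docs: iterable of (tokens, time_bucket, label) triples
        for tokens, bucket, label in docs:
            self.classes.add(label)
            self.bucket_counts[bucket][label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_prior(self, label, bucket):
        # Smoothed P(label | time bucket); uniform for an unseen bucket.
        counts = self.bucket_counts[bucket]
        total = sum(counts.values())
        return math.log((counts[label] + self.alpha)
                        / (total + self.alpha * len(self.classes)))

    def log_likelihood(self, token, label):
        # Smoothed P(token | label) over the training vocabulary.
        counts = self.word_counts[label]
        total = sum(counts.values())
        return math.log((counts[token] + self.alpha)
                        / (total + self.alpha * len(self.vocab)))

    def predict(self, tokens, bucket):
        known = [t for t in tokens if t in self.vocab]  # ignore unseen tokens

        def score(label):
            return self.log_prior(label, bucket) + sum(
                self.log_likelihood(t, label) for t in known)

        # sorted() makes tie-breaking deterministic
        return max(sorted(self.classes), key=score)


# Toy data (invented for illustration): short complaints with a season bucket.
training = [
    (["heat", "broken"], "winter", "heating"),
    (["heat", "cold"], "winter", "heating"),
    (["noise", "loud"], "summer", "noise"),
    (["noise", "party"], "summer", "noise"),
]
model = TemporalNaiveBayes().fit(training)
print(model.predict(["broken"], "winter"))
print(model.predict(["unknown"], "summer"))
```

When a post contains no known words, the prediction falls back on the prior for its time bucket. This is where temporal information can help with very short, sparse texts.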

Files in This Item:

File	Size	Format
ntu-101-1.pdf	673.98 kB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.