Recommending Hashtags Using Topics over Time Multiple Channel Latent Dirichlet Allocation

Chien-Hua Lee; 李健華

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5229

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	盧信銘(Hsin-Min Lu)
dc.contributor.author	Chien-Hua Lee	en
dc.contributor.author	李健華	zh_TW
dc.date.accessioned	2021-05-15T17:53:58Z	-
dc.date.available	2015-08-14
dc.date.available	2021-05-15T17:53:58Z	-
dc.date.copyright	2014-08-14
dc.date.issued	2014
dc.date.submitted	2014-07-29
dc.identifier.citation	[1] Anderson, C., and Hiralall, M. (2011). Recommender systems for e-shops. [2] Balabanović, M., and Shoham, Y. (1997). Fab: content-based, collaborative recommendation. Communications of the ACM, 40(3), 66-72. [3] Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022. [4] Breese, J. S., Heckerman, D., and Kadie, C. (1998, July). Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (pp. 43-52). Morgan Kaufmann Publishers Inc. [5] Chen, Y. H., and George, E. I. (1999, January). A Bayesian model for collaborative filtering. In Proceedings of the 7th International Workshop on Artificial Intelligence and Statistics. San Francisco: Morgan Kaufmann Publishers, [http://uncertainty99.microsoft.com/proceedings.htm]. [6] Erosheva, E., Fienberg, S., and Lafferty, J. (2004). Mixed-membership models of scientific publications. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl 1), 5220-5227. [7] Geman, S., and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, (6), 721-741. [8] Gilks, W. R., Richardson, S., and Spiegelhalter, D. J. (1996). Markov Chain Monte Carlo in Practice. New York: Chapman Hall/CRC, 486. [9] Godin, F., Slavkovikj, V., De Neve, W., Schrauwen, B., and Van de Walle, R. (2013, May). Using topic models for Twitter hashtag recommendation. In Proceedings of the 22nd international conference on World Wide Web companion (pp. 593-596). International World Wide Web Conferences Steering Committee. [10] Goldberg, D., Nichols, D., Oki, B. M., and Terry, D. (1992). Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12), 61-70. [11] Griffiths, T. L., and Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl 1), 5228-5235. [12] Herlocker, J. L., Konstan, J. A., Terveen, L. G., and Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS), 22(1), 5-53. [13] Hotho, A., Jäschke, R., Schmitz, C., and Stumme, G. (2006, January). Folkrank: A ranking algorithm for folksonomies. In K. D. Althoff (Ed.), LWA (Vol. 1, pp. 111-114). [14] Koren, Y., Bell, R., and Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37. [15] Kywe, S. M., Hoang, T. A., Lim, E. P., and Zhu, F. (2012). On recommending hashtags in twitter networks. In Social Informatics (pp. 337-350). Springer Berlin Heidelberg. [16] Li, R., Wang, S., Deng, H., Wang, R., and Chang, K. C. C. (2012, August). Towards social user profiling: unified and discriminative influence model for inferring home locations. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1023-1031). ACM. [17] Liu, J. S. (2001). Monte Carlo Strategies in Scientific Computing. NY: Springer. [18] Mazzia, A., and Juett, J. (2009). Suggesting hashtags on twitter. [19] Newman, M. E. J., and Barkema, G. T. (1999). Monte Carlo methods in statistical physics (Vol. 13). Oxford: Clarendon Press. [20] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J. (1994, October). GroupLens: an open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM conference on Computer supported cooperative work (pp. 175-186). ACM. [21] Salton, G., and McGill, M. J. (1983). Introduction to modern information retrieval. [22] Sarwar, B., Karypis, G., Konstan, J., and Riedl, J. (2001, April). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (pp. 285-295). ACM. [23] Uddin, M. M., Hassan, M. T., and Karim, A. (2011, December). Personalized versus non-personalized tag recommendation: A suitability study on three social networks. In Multitopic Conference (INMIC), 2011 IEEE 14th International (pp. 56-61). IEEE. [24] Wang, X., and McCallum, A. (2006, August). Topics over time: a non-Markov continuous-time model of topical trends. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 424-433). ACM. [25] Zangerle, E., Gassler, W., and Specht, G. (2011). Recommending #-tags in twitter. In Proceedings of the Workshop on Semantic Adaptive Social Web (SASWeb 2011). CEUR Workshop Proceedings (Vol. 730, pp. 67-78).
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5229	-
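dc.description.abstract	As social networks have grown in popularity, more and more users have joined them, and the amount of information they contain has grown rapidly. To classify and search tweets quickly and effectively, Twitter users use hashtags to mark and categorize tweets. Because adding hashtags is not an automated process, the vast majority of tweets carry no hashtags; in our study only 15% of tweets use them, which greatly reduces their value. This study therefore proposes a hashtag recommendation system that, once a user has finished entering a tweet, automatically generates a set of suitable hashtags to choose from, raising hashtag coverage. Building on topic models and adding temporal clustering, this study proposes Topic over Time Multiple Channel Latent Dirichlet Allocation (TOT-MCLDA). Based on observable tweet information, the model clusters latent topics at different times and predicts suitable hashtags. Experiments on three years of Twitter data show that TOT-MCLDA outperforms recommendation systems proposed in earlier studies and significantly improves recommendation accuracy. In addition, the associations that TOT-MCLDA establishes between tweets and hashtags can serve as a basis for other related research, increasing credibility.	zh_TW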
dc.description.abstract	With the growth of social networks and their user base, the volume of available content has exploded. To classify tweets efficiently and effectively, Twitter users can use hashtags to mark and categorize their tweets. However, most tweets do not contain hashtags: in our dataset only 15% of tweets contain them, which greatly reduces the value of hashtags. Our research therefore aims to develop a hashtag recommendation system that automatically suggests hashtags based on the content of a tweet. Our model is built on the Mixed Membership Model. We extend it by incorporating a temporal clustering effect and propose the resulting model, Topics over Time Multiple Channel Latent Dirichlet Allocation (TOT-MCLDA). The insight behind the model is that the words and hashtags of a tweet are conditioned on the same latent topics, and that tweets posted in the same period are more closely related. Hence, we can use a tweet's content to recommend hashtags through its latent topics. Experimental results on a 3-year Twitter dataset demonstrate that the proposed method outperforms some state-of-the-art methods.	en
dc.description.provenance	Made available in DSpace on 2021-05-15T17:53:58Z (GMT). No. of bitstreams: 1 ntu-103-R01725004-1.pdf: 1179589 bytes, checksum: 32dcfabdb08e090b68678db1e2f5ac89 (MD5) Previous issue date: 2014	en
dc.description.tableofcontents	Acknowledgements i; Chinese Abstract ii; Abstract iii; Contents iv; List of Figures vi; List of Tables viii; Chapter 1 Introduction 1; Chapter 2 Literature Review 5; 2.1 Recommendation Systems 5; 2.1.1 Non-personalized Recommendation Systems 5; 2.1.2 Personalized Recommendation Systems 6; 2.1.2.1 Memory-based Collaborative Filtering 7; 2.1.2.2 Model-based Collaborative Filtering 8; 2.1.2.3 Content-based approach 9; 2.2 Hashtag Recommendation Systems 11; 2.2.1 Naive Bayes Method 11; 2.2.2 Similarity Approach 13; 2.2.3 Topic Model Based Method 17; 2.3 Summary 24; Chapter 3 Methodology 25; 3.1 TOT-MCLDA 25; 3.1.1 Topics over Time 25; 3.1.2 Mixed Membership Model 29; 3.1.3 Topics over Time Multiple Channel Latent Dirichlet Allocation 31; 3.2 Baseline Model 41; 3.3 Metrics 42; Chapter 4 Data Selection and Experimental Results 43; 4.1 Data Selection and Preprocess 43; 4.2 Performance Evaluation 46; 4.3 Parameter estimation 47; 4.4 Experimental results of Twitter data 48; 4.4.1 Analyses of Recommendation Lists 55; 4.4.2 Analyses of Topic Distributions 68; Chapter 5 Conclusions and Future Work 80; Reference 82; Appendix A 86
dc.language.iso	en
dc.subject	主題模型	zh_TW
dc.subject	社交網絡	zh_TW
dc.subject	推特	zh_TW
dc.subject	主題標籤	zh_TW
dc.subject	推薦系統	zh_TW
dc.subject	Topic Model	en
dc.subject	Social Network	en
dc.subject	Twitter	en
dc.subject	Hashtags	en
dc.subject	Recommendation System	en
dc.title	應用時間演化主題多管道潛藏狄利克雷分配推薦主題標籤	zh_TW
dc.title	Recommending Hashtags Using Topics over Time Multiple Channel Latent Dirichlet Allocation	en
dc.type	Thesis
dc.date.schoolyear	102-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	李昇暾(Sheng-Tun Li),曹承礎(Seng-Cho Chou)
dc.subject.keyword	社交網絡,推特,主題標籤,推薦系統,主題模型,	zh_TW
dc.subject.keyword	Social Network,Twitter,Hashtags,Recommendation System,Topic Model,	en
dc.relation.page	89
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2014-07-30
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
Appears in Collections:	Department of Information Management

Files in This Item:

File	Size	Format
ntu-103-1.pdf	1.15 MB	Adobe PDF

Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.