Chinese Celebrity Popularity Prediction on Instagram

Ting-Yi Su; 蘇庭毅

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22085

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	雷欽隆(Chin-Laung Lei)
dc.contributor.author	Ting-Yi Su	en
dc.contributor.author	蘇庭毅	zh_TW
dc.date.accessioned	2021-06-08T04:01:40Z	-
dc.date.copyright	2018-10-03
dc.date.issued	2018
dc.date.submitted	2018-08-07
dc.identifier.citation	[1] Gelli, F., Uricchio, T., Bertini, M., Del Bimbo, A., & Chang, S. F. (2015, October). Image popularity prediction in social media using sentiment and context features. In Proceedings of the 23rd ACM international conference on Multimedia (pp. 907-910). ACM. [2] Mazloom, M., Hendriks, B., & Worring, M. (2017, October). Multimodal context-aware recommender for post popularity prediction in social media. In Proceedings of the on Thematic Workshops of ACM Multimedia 2017 (pp. 236-244). ACM. [3] Khosla, A., Das Sarma, A., & Hamid, R. (2014, April). What makes an image popular?. In Proceedings of the 23rd international conference on World wide web (pp. 867-876). ACM. [4] Zohourian, A., Sajedi, H., & Yavary, A. (2018, April). Popularity prediction of images and videos on Instagram. In 2018 4th International Conference on Web Research (ICWR) (pp. 111-117). IEEE. [5] Gilbert, C. H. E. (2014). Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth International Conference on Weblogs and Social Media (ICWSM-14). Available at (20/04/16) http://comp.social.gatech.edu/papers/icwsm14.vader.hutto.pdf [6] Chen, T., Borth, D., Darrell, T., & Chang, S. F. (2014). Deepsentibank: Visual sentiment concept classification with deep convolutional neural networks. arXiv preprint arXiv:1410.8586. [7] Jou, B., Chen, T., Pappas, N., Redi, M., Topkara, M., & Chang, S. F. (2015, October). Visual affect around the world: A large-scale multilingual visual sentiment ontology. In Proceedings of the 23rd ACM international conference on Multimedia (pp. 159-168). ACM. [8] Novak, P. K., Smailović, J., Sluban, B., & Mozetič, I. (2015). Sentiment of emojis. PloS one, 10(12), e0144296. [9] Bakhshi, S., Shamma, D. A., & Gilbert, E. (2014, April). Faces engage us: Photos with faces attract more likes and comments on instagram. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 965-974). ACM. 
[10] Arriaga, O., Valdenegro-Toro, M., & Plöger, P. (2017). Real-time Convolutional Neural Networks for Emotion and Gender Classification. arXiv preprint arXiv:1710.07557. [11] Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32. [12] Chen, T., & Guestrin, C. (2016, August). Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining (pp. 785-794). ACM.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22085	-
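dc.description.abstract	In recent years, the spread of smartphones and advances in communication technology have made social media flourish. People use social media to communicate with each other, share information, and even market themselves. For those of us living in this generation, no one can avoid social media entirely. Enterprises, advertisers, and internet celebrities can raise their profile through social media and thereby earn revenue. For these groups, how to maintain or further increase their fame is therefore a question worth studying. In this study, we aim to predict from the content of a post, before the post is published, how much attention it will receive within one day. The data come from Instagram, a platform where people share and record their lives through photos with short captions; its emphasis on sharing visual content makes it very popular among teenagers. The target group is people who are already reasonably well known in real life, for whom marketing on social media is an important way to maintain exposure and even raise their profile. We use SnowNLP to extract sentiment from post captions and DeepSentiBank as the tool for image sentiment and image content classification, and add the information attached to the user account to build a predictor of post popularity. Our predictions achieve 90.45% order-of-magnitude accuracy on the test data, and compared with previous work they are closer to the actual amount of attention. In addition, we treat popularity prediction as a binary classification problem, popular versus unpopular, and obtain 88.71% accuracy on it.	zh_TW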
dc.description.abstract	In recent years, social media have become far more popular thanks to the growing use of smartphones and advances in communication technology. People use social media to communicate, share information, and even promote themselves. For people living in this era, staying away from social media entirely is nearly impossible. Enterprises, advertisers, and internet celebrities can earn revenue by increasing their popularity among social media users, so maintaining or increasing exposure on social media has become an important issue for them. In this study, we aim to predict from a post's content, before the post is published, the popularity it will have reached 24 hours after posting. Our data come from Instagram, a platform where people record and share their lives through photos. Its emphasis on image-based sharing makes it popular among teenagers. Our target users are people who are already well known in the real world, for whom promotion on social media can be a vital way to keep or increase their popularity. We use SnowNLP to extract text sentiment from a post's caption and the DeepSentiBank model to extract image emotion and image content category. Combining these features with user account information, we build models to predict popularity. The predicted popularity achieves 90.45% log-scale accuracy on the test set, and our predictions are closer to the actual popularity than those of previous work. We also treat popularity prediction as a binary classification task, popular versus unpopular, and obtain 88.71% accuracy on it.	en
dc.description.provenance	Made available in DSpace on 2021-06-08T04:01:40Z (GMT). No. of bitstreams: 1 ntu-107-R05921074-1.pdf: 2665147 bytes, checksum: 372c693a1983a6167265683d75099dcf (MD5) Previous issue date: 2018	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; ABSTRACT; CONTENTS; LIST OF FIGURES; LIST OF TABLES; Chapter 1 Introduction; Chapter 2 Relative work; Chapter 3 Background; 3.1 Instagram; 3.2 NLP tools and Text sentiment; 3.2.1 Jieba; 3.2.2 SnowNLP; 3.2.3 VADER; 3.3 Visual sentiment and Face sentiment; 3.3.1 DeepSentiBank; 3.3.2 Dlib; 3.4 Machine Learning modules; 3.4.1 Scikit-learn; Chapter 4 Dataset and Feature engineering; 4.1 Datasets; 4.2 Feature engineering; 4.2.1 Surface features; 4.2.2 Textual features; 4.2.3 Visual features; 4.2.4 Face features; Chapter 5 Models; 5.1 Overall architecture; 5.2 Used models; 5.2.1 Random forest; 5.2.2 XGBoost; 5.2.3 LightGBM; Chapter 6 Evaluation; 6.1 Evaluation metrics; 6.1.1 Log-scale accuracy; 6.1.2 Accuracy; 6.2 Popularity scale prediction; 6.3 Popular and unpopular classification; 6.4 Feature analysis; 6.4.1 Spearman correlations; 6.4.2 Performance comparison; Chapter 7 Conclusion; Bibliography; Appendix; A.1 Celebrities' accounts on Instagram
dc.language.iso	en
dc.subject	機器學習	zh_TW
dc.subject	社群媒體	zh_TW
dc.subject	圖片特徵	zh_TW
dc.subject	文字情緒	zh_TW
dc.subject	臉部情緒	zh_TW
dc.subject	熱門程度預測	zh_TW
dc.subject	Text sentiment	en
dc.subject	Face emotion	en
dc.subject	Image feature	en
dc.subject	Social Media	en
dc.subject	Popularity prediction	en
dc.subject	Machine Learning	en
dc.title	在Instagram上之華人名人發文熱門程度預測	zh_TW
dc.title	Chinese Celebrity Popularity Prediction on Instagram	en
dc.type	Thesis
dc.date.schoolyear	106-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	紀博文(Po-Wen Chi),王銘宏(Ming-Hung Wang)
dc.subject.keyword	社群媒體, 機器學習, 熱門程度預測, 文字情緒, 圖片特徵, 臉部情緒	zh_TW
dc.subject.keyword	Social Media, Popularity prediction, Machine Learning, Text sentiment, Image feature, Face emotion	en
dc.relation.page	40
dc.identifier.doi	10.6342/NTU201802155
dc.rights.note	Not authorized
dc.date.accepted	2018-08-07
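dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Electrical Engineering	zh_TW
Appears in Collections:	Department of Electrical Engineering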
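
The abstract reports two evaluation figures, a log-scale accuracy and a binary popular/unpopular accuracy, but this record does not define either. The following is a minimal Python sketch of how such metrics could be computed, under two assumptions that do not come from the thesis: a prediction counts as correct on the log scale when its order of magnitude, floor(log10(count)), matches that of the actual count, and a post is "popular" when its count reaches a chosen threshold. The function names, the threshold, and the sample numbers are all hypothetical.

```python
import math


def magnitude(count):
    """Order of magnitude of a count (assumed metric; counts below 1 are clamped to 1)."""
    return int(math.floor(math.log10(max(count, 1))))


def log_scale_accuracy(actual, predicted):
    """Share of posts whose predicted count has the same order of magnitude as the actual count."""
    hits = sum(magnitude(a) == magnitude(p) for a, p in zip(actual, predicted))
    return hits / len(actual)


def binary_accuracy(actual, predicted, threshold):
    """Share of posts placed on the correct side of a hypothetical popularity threshold."""
    hits = sum((a >= threshold) == (p >= threshold) for a, p in zip(actual, predicted))
    return hits / len(actual)


# Made-up like counts for four posts, for illustration only.
actual = [120, 4500, 98000, 15000]
predicted = [300, 9100, 120000, 8000]

print(log_scale_accuracy(actual, predicted))
print(binary_accuracy(actual, predicted, threshold=10000))
```

The thesis itself (Section 6.1) may define these metrics differently, for example with a tolerance on the log difference instead of an exact magnitude match.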

Files in this item:

File	Size	Format
ntu-107-1.pdf (not authorized for public access)	2.6 MB	Adobe PDF


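Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.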