Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74447
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳建錦 (Chien Chin Chen) | |
dc.contributor.author | Hung-Kuang Han | en |
dc.contributor.author | 韓宏光 | zh_TW |
dc.date.accessioned | 2021-06-17T08:36:18Z | - |
dc.date.available | 2020-08-20 | |
dc.date.copyright | 2019-08-20 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-08-08 | |
dc.identifier.citation | [1] BARBIERI, F., ANKE, L.E., BALLESTEROS, M., SOLER, J., and SAGGION, H., 2017. Towards the understanding of gaming audiences by modeling Twitch emotes. In Proceedings of the 3rd Workshop on Noisy User-generated Text, 11-20.
[2] CHUNG, J., GULCEHRE, C., CHO, K., and BENGIO, Y., 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. In NIPS 2014 Deep Learning and Representation Learning Workshop.
[3] CRESWELL, A., ARULKUMARAN, K., and BHARATH, A.A., 2017. On denoising autoencoders trained to minimise binary cross-entropy.
[4] DUPREZ, C., CHRISTOPHE, V.R., RIMÉ, B., CONGARD, A., and ANTOINE, P., 2015. Motives for the social sharing of an emotional experience. Journal of Social and Personal Relationships 32, 6, 757-787.
[5] ESCORCIA, V., HEILBRON, F.C., NIEBLES, J.C., and GHANEM, B., 2016. DAPs: Deep action proposals for action understanding. In European Conference on Computer Vision, Springer, 768-784.
[6] FU, C.-Y., LEE, J., BANSAL, M., and BERG, A.C., 2017. Video highlight prediction using audience chat reactions. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 972-978.
[7] GOYAL, S., 2016. Sentimental analysis of Twitter data using text mining and hybrid classification approach. International Journal of Advance Research, Ideas and Innovations in Technology 2, 5.
[8] HUA, X.-S., LU, L., and ZHANG, H.-J., 2005. A generic framework of user attention model and its application in video summarization. IEEE Transactions on Multimedia 7, 5, 907-919.
[9] JIANG, R., QU, C., WANG, J., WANG, C., and ZHENG, Y. Lightor: Video highlight detection for live streaming platforms using implicit crowdsourcing.
[10] JIAO, Y., LI, Z., HUANG, S., YANG, X., LIU, B., and ZHANG, T., 2018. Three-dimensional attention-based deep ranking model for video highlight detection. IEEE Transactions on Multimedia 20, 10, 2693-2705.
[11] KRIZHEVSKY, A., SUTSKEVER, I., and HINTON, G.E., 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 1097-1105.
[12] LIPTON, Z.C., BERKOWITZ, J., and ELKAN, C., 2015. A critical review of recurrent neural networks for sequence learning.
[13] MIKOLOV, T., CHEN, K., CORRADO, G., and DEAN, J., 2013. Efficient estimation of word representations in vector space.
[14] MOHER, M., 1993. Decoding via cross-entropy minimization. In Proceedings of GLOBECOM'93, IEEE Global Telecommunications Conference, IEEE, 809-813.
[15] NEPAL, S., SRINIVASAN, U., and REYNOLDS, G., 2001. Automatic detection of 'goal' segments in basketball videos. In Proceedings of the Ninth ACM International Conference on Multimedia, ACM, 261-269.
[16] NGO, C.-W., MA, Y.-F., and ZHANG, H.-J., 2005. Video summarization and scene detection by graph modeling. IEEE Transactions on Circuits and Systems for Video Technology 15, 2, 296-305.
[17] OTSUKA, I., NAKANE, K., DIVAKARAN, A., HATANAKA, K., and OGAWA, M., 2005. A highlight scene detection and video summarization system using audio feature for a personal video recorder. IEEE Transactions on Consumer Electronics 51, 1, 112-116.
[18] RINGER, C. and NICOLAOU, M.A., 2018. Deep unsupervised multi-view detection of video game stream highlights. In Proceedings of the 13th International Conference on the Foundations of Digital Games, ACM, 15.
[19] ROCHAN, M., YE, L., and WANG, Y., 2018. Video summarization using fully convolutional sequence networks. In Proceedings of the European Conference on Computer Vision (ECCV), 347-363.
[20] RONG, X., 2014. word2vec parameter learning explained.
[21] RUI, Y., GUPTA, A., and ACERO, A., 2000. Automatically extracting highlights for TV baseball programs. In Proceedings of the Eighth ACM International Conference on Multimedia, ACM, 105-115.
[22] SCHUSTER, M. and PALIWAL, K.K., 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45, 11, 2673-2681.
[23] SIGARI, M.-H., SOLTANIAN-ZADEH, H., and POURREZA, H.-R., 2015. Fast highlight detection and scoring for broadcast soccer video summarization using on-demand feature extraction and fuzzy inference. International Journal of Computer Graphics 6, 1, 13-36.
[24] SONG, Y., 2016. Real-time video highlights for Yahoo Esports. In NIPS Workshop, LSCVS 2016.
[25] SUN, M., FARHADI, A., and SEITZ, S., 2014. Ranking domain-specific highlights by analyzing edited videos. In European Conference on Computer Vision, Springer, 787-802.
[26] TANG, A. and BORING, S., 2012. #EpicPlay: Crowd-sourcing sports video highlights. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA, 2012), ACM, 1569-1572. DOI=http://dx.doi.org/10.1145/2207676.2208622.
[27] TJONDRONEGORO, D., CHEN, Y.-P.P., and PHAM, B., 2004. Highlights for more complete sports video summarization. IEEE MultiMedia 11, 4, 22-37.
[28] TRUONG, B.T. and VENKATESH, S., 2007. Video abstraction: A systematic review and classification. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 3, 1, 3.
[29] WEBB, G.I. and CONILIONE, P., 2005. Estimating bias and variance from data. Pre-publication manuscript, http://www.csse.monash.edu/webb/-Files/WebbConilione06.pdf.
[30] XU, C., WANG, J., WAN, K., LI, Y., and DUAN, L., 2006. Live sports event detection based on broadcast video and web-casting text. In Proceedings of the 14th ACM International Conference on Multimedia, ACM, 221-230.
[31] XU, H., DAS, A., and SAENKO, K., 2017. R-C3D: Region convolutional 3D network for temporal activity detection. In Proceedings of the IEEE International Conference on Computer Vision, 5783-5792.
[32] YANG, H., WANG, B., LIN, S., WIPF, D., GUO, M., and GUO, B., 2015. Unsupervised extraction of video highlights via robust recurrent auto-encoders. In Proceedings of the IEEE International Conference on Computer Vision, 4633-4641.
[33] YAO, T., MEI, T., and RUI, Y., 2016. Highlight detection with pairwise deep ranking for first-person video summarization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 982-990.
[34] ZHANG, B., DOU, W., and CHEN, L., 2006. Combining short and long term audio features for TV sports highlight detection. In European Conference on Information Retrieval, Springer, 472-475. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74447 | - |
dc.description.abstract | 現今直播已經成為人們學習新事物時隨手可得的管道之一。儘管直播影片吸引了大量的觀眾,但在長時間的影片內容中,只包含少部分的精彩片段,其他大部分時間則為相對無趣的內容。基於此現象,大量的人工智慧研究者開始開發深度學習模型來自動擷取直播影片的精彩片段。相關研究中,大部分都是基於影像畫面的視覺分析來擷取精彩片段,鮮少有研究使用觀眾留言。因此,在這篇論文中,我們提出一個新的深度學習模型來分析觀眾留言內容,藉由觀眾留言傳達出的訊息來擷取影片中的精彩片段。我們將模型應用在收集自數個不同Twitch直播頻道的影片庫上來評估模型表現,最終得到模型的精確度為51.3%,優於我們所比較的基線模型。 | zh_TW |
dc.description.abstract | Live streaming has become a ubiquitous channel for people to learn about new happenings. Although live streaming videos generally attract large audiences, they are long and contain only a small number of exciting moments amid relatively unexciting stretches of content. This observation has prompted artificial intelligence researchers to develop models that automatically extract highlights from live streaming videos. Most streaming highlight extraction research has been based on visual analysis of video frames; few studies have considered the messages posted by the viewing audience. In this paper, we propose a deep learning model that examines the messages posted by streaming audiences. The video segments whose messages reveal audience excitement are extracted to compose the highlights of a streaming video. We evaluate our model on videos collected from multiple Twitch streaming channels. Our highlight extraction model achieves a precision of 51.3%, which is superior to several baseline methods. | en |
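As a loose illustration of the "biGRU-DNN" architecture named in the table of contents below, the following PyTorch sketch encodes a segment's audience messages with a bidirectional GRU and maps the final states through dense layers to a highlight score. This is a hypothetical reconstruction from the record alone; the vocabulary size, embedding and hidden dimensions, sequence length, and class names are illustrative assumptions, not the thesis's actual settings.

```python
# Minimal sketch of a biGRU-DNN highlight scorer. All hyperparameters
# (vocab_size, embed_dim, hidden_dim, layer widths) are assumptions for
# illustration; the thesis's actual configuration is not given in this record.
import torch
import torch.nn as nn

class BiGRUDNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bigru = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # The "DNN" part: dense layers mapping the concatenated final
        # forward/backward GRU states to a highlight probability.
        self.dnn = nn.Sequential(
            nn.Linear(2 * hidden_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) word indices for the chat messages
        # posted during one video segment.
        x = self.embed(token_ids)
        _, h_n = self.bigru(x)                   # h_n: (2, batch, hidden_dim)
        h = torch.cat([h_n[0], h_n[1]], dim=-1)  # forward + backward states
        return torch.sigmoid(self.dnn(h)).squeeze(-1)

model = BiGRUDNN()
batch = torch.randint(1, 10000, (4, 50))  # 4 segments, 50 tokens each
print(model(batch))  # one highlight score per segment
```

Under this reading, segments whose scores exceed a chosen threshold would be stitched together to form the highlight reel, matching the extraction step described in the abstract.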
dc.description.provenance | Made available in DSpace on 2021-06-17T08:36:18Z (GMT). No. of bitstreams: 1 ntu-108-R06725053-1.pdf: 989692 bytes, checksum: 16ba3b42e588b88dc500373371f2c3fb (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | 摘要 (Chinese abstract) i
ABSTRACT ii
TABLE OF CONTENTS iii
LIST OF FIGURES iv
LIST OF TABLES v
1 INTRODUCTION 1
2 RELATED WORKS 4
3 HIGHLIGHT EXTRACTION SYSTEM 7
3.1 Model Learning 8
3.1.1 Training Data Preparation 8
3.1.2 The biGRU-DNN Model 10
3.2 Highlight Extraction 12
4 EXPERIMENT 13
4.1 Datasets and Evaluation Metrics 13
4.2 Effects of System Parameters on Model Training 15
4.3 Comparisons with other Methods 16
5 CONCLUSION 18
6 ACKNOWLEDGMENTS 18
7 REFERENCES 18 | |
dc.language.iso | en | |
dc.title | 觀眾留言應用於直播影片精彩擷取之深度學習模型 | zh_TW |
dc.title | A Deep Learning Model for Extracting Live Streaming Video Highlights using Audience Messages | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | 碩士 (Master's) | |
dc.contributor.oralexamcommittee | 張詠淳 (Yung-Chun Chang), 陳孟彰 (Meng Chang Chen) | |
dc.subject.keyword | 深度學習, 循環神經網路, 雙向GRU, 直播, 精彩剪輯, 群眾外包 | zh_TW |
dc.subject.keyword | Deep Learning, Recurrent Neural Networks, Bidirectional GRU, Live Streaming, Highlight Extraction, Crowdsourcing | en |
dc.relation.page | 22 | |
dc.identifier.doi | 10.6342/NTU201902939 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2019-08-11 | |
dc.contributor.author-college | 管理學院 (College of Management) | zh_TW |
dc.contributor.author-dept | 資訊管理學研究所 (Graduate Institute of Information Management) | zh_TW |
Appears in Collections: | Department of Information Management (資訊管理學系)
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 966.5 kB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.