Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74032
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 王鈺強(Yu-Chiang Wang) | |
| dc.contributor.author | Yen-Ting Liu | en |
| dc.contributor.author | 劉彥廷 | zh_TW |
| dc.date.accessioned | 2021-06-17T08:17:26Z | - |
| dc.date.available | 2019-08-20 | |
| dc.date.copyright | 2019-08-20 | |
| dc.date.issued | 2019 | |
| dc.date.submitted | 2019-08-14 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74032 | - |
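| dc.description.abstract | Video summarization shortens a video by selecting its truly important segments, and it remains a research-worthy topic in computer vision. In this thesis, we propose a new framework that aims to handle videos with diverse content: a video summarization model with diversified attention. | zh_TW |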
| dc.description.abstract | Video summarization is a challenging task in computer vision that aims to identify highlight frames or shots in lengthy video inputs. In this thesis, we propose an attention-based model for video summarization that can handle complex video data. We present a novel deep learning framework, multi-head multi-layer video self-attention (M2VSA), which identifies informative regions across spatial and temporal video features and jointly exploits context diversity over space and time for summarization. By enforcing visual concept consistency in our framework, both video recovery and summarization are preserved. More importantly, the model can be trained in both supervised and unsupervised settings. Finally, quantitative and qualitative experimental results demonstrate the effectiveness of our model and its superiority over state-of-the-art approaches. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-17T08:17:26Z (GMT). No. of bitstreams: 1 ntu-108-R06942114-1.pdf: 7705935 bytes, checksum: 6ad12a123c4a31dda04f5952cafca982 (MD5) Previous issue date: 2019 | en |
| dc.description.tableofcontents | Abstract; List of Figures; List of Tables; 1 Introduction; 2 Related Work; 3 Method; 3.1 M2VSA for Video Spatial-Temporal Attention; 3.2 Unlabeled Video Summarization with Visual Concept Consistency; 3.3 From Supervised Towards Unsupervised Learning for Video Summarization; 4 Experiment; 4.1 Datasets; 4.2 Protocols and Implementation Details; 4.3 Quantitative Comparisons; 4.3.1 Comparison with supervised approaches; 4.3.2 Comparisons with unsupervised approaches; 4.3.3 Analysis for semi-supervised settings; 4.3.4 Performance comparisons and ablation studies; 4.4 Qualitative Results; 5 Conclusion; Reference | |
| dc.language.iso | en | |
| dc.subject | 電腦視覺 | zh_TW |
| dc.subject | 視頻摘要 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | Computer Vision | en |
| dc.subject | Deep Learning | en |
| dc.subject | Video Summarization | en |
| dc.title | 藉由視覺注意力來處理視頻摘要 | zh_TW |
| dc.title | Transforming Visual Attention into Video Summarization | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 107-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 邱維辰(Wei-Chen Chiu),林彥宇(Yen-Yu Lin) | |
| dc.subject.keyword | 視頻摘要, 深度學習, 電腦視覺 | zh_TW |
| dc.subject.keyword | Computer Vision, Deep Learning, Video Summarization | en |
| dc.relation.page | 32 | |
| dc.identifier.doi | 10.6342/NTU201901661 | |
| dc.rights.note | Paid licensing | |
| dc.date.accepted | 2019-08-14 | |
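| dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Communication Engineering (電信工程學研究所) | zh_TW |
| Appears in collections: | Graduate Institute of Communication Engineering (電信工程學研究所) | |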
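Files in this document:
| File | Size | Format | |
|---|---|---|---|
| ntu-108-1.pdf (not authorized for public access) | 7.53 MB | Adobe PDF | |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.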
