Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66929
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 徐宏民 | |
dc.contributor.author | Ya-Ting Lin | en |
dc.contributor.author | 林雅婷 | zh_TW |
dc.date.accessioned | 2021-06-17T01:15:02Z | - |
dc.date.available | 2022-08-24 | |
dc.date.copyright | 2017-08-24 | |
dc.date.issued | 2017 | |
dc.date.submitted | 2017-08-14 | |
dc.identifier.citation | [1] L. Zhuang, F. Jing, and X.-Y. Zhu, “Movie review mining and summarization,” in Proceedings of the 15th ACM International Conference on Information and Knowledge Management, pp. 43–50, ACM, 2006.
[2] B. Pang, L. Lee, et al., “Opinion mining and sentiment analysis,” Foundations and Trends in Information Retrieval, vol. 2, no. 1–2, pp. 1–135, 2008.
[3] R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, C. Potts, et al., “Recursive deep models for semantic compositionality over a sentiment treebank,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), vol. 1631, p. 1642, 2013.
[4] D. Tang, B. Qin, and T. Liu, “Document modeling with gated recurrent neural network for sentiment classification,” in EMNLP, pp. 1422–1432, 2015.
[5] H. Wang, Y. Lu, and C. Zhai, “Latent aspect rating analysis on review text data: a rating regression approach,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 783–792, ACM, 2010.
[6] K. Cho, B. van Merriënboer, Ç. Gülçehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder–decoder for statistical machine translation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, pp. 1724–1734, Association for Computational Linguistics, Oct. 2014.
[7] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[8] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3156–3164, 2015.
[9] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” in International Conference on Machine Learning, pp. 2048–2057, 2015.
[10] Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo, “Image captioning with semantic attention,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4651–4659, 2016.
[11] S. Venugopalan, M. Rohrbach, J. Donahue, R. Mooney, T. Darrell, and K. Saenko, “Sequence to sequence – video to text,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 4534–4542, 2015.
[12] T.-H. K. Huang, F. Ferraro, N. Mostafazadeh, I. Misra, A. Agrawal, J. Devlin, R. Girshick, X. He, P. Kohli, D. Batra, C. L. Zitnick, D. Parikh, L. Vanderwende, M. Galley, and M. Mitchell, “Visual storytelling,” June 2016.
[13] Y. Liu, J. Fu, T. Mei, and C. W. Chen, “Let your photos talk: Generating narrative paragraph for photo stream via bidirectional attention recurrent neural networks,” in AAAI, pp. 1445–1452, 2017.
[14] C. C. Park and G. Kim, “Expressing an image stream with a sequence of natural sentences,” in Advances in Neural Information Processing Systems, pp. 73–81, 2015.
[15] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
[16] L. Bossard, M. Guillaumin, and L. Van Gool, “Food-101 – Mining discriminative components with random forests,” in European Conference on Computer Vision, 2014.
[17] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Advances in Neural Information Processing Systems (NIPS), 2015.
[18] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, “LIBLINEAR: A library for large linear classification,” Journal of Machine Learning Research, vol. 9, pp. 1871–1874, 2008.
[19] B. J. Frey and D. Dueck, “Clustering by passing messages between data points,” Science, vol. 315, no. 5814, pp. 972–976, 2007. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66929 | - |
dc.description.abstract | Given the growing importance of review information in recent years, how to provide important and informative reviews is a significant research problem. To this end, we propose the idea of structural reviews, which pairs a review with its corresponding images to provide more informative reviews, and we collect a large dataset for the task. However, both the text reviews and the image data are quite noisy, so filtering out relatively unimportant information is an indispensable step. In addition, we use cluster analysis rather than a classification method to group different foods, because no currently available dataset is suitable for training a classifier for our problem. After filtering out unimportant information, we propose a two-stage training method that uses pseudo-paired data to avoid the cross-domain problem, and we use different fusion methods to handle inputs consisting of multiple images. With our method, the generated reviews are more stable and reasonable, with a relative improvement of about 36% in food accuracy. Our method also shows a corresponding improvement under BLEU, which evaluates text quality. | zh_TW |
dc.description.abstract | Reviews have become increasingly important in recent years, so providing users with important and informative reviews is an essential problem. We therefore propose the idea of a structural review, which matches a review with corresponding images to convey more information, and we collect a large dataset for the task. However, both the images and the text reviews are noisy, so filtering out relatively useless information is an indispensable step. In addition, because no existing food dataset is suitable for training a classifier for our task, we cluster images of the same food type rather than classify them. After filtering out the noise, we propose a two-stage training method with pseudo pairs to avoid the cross-domain issue, and we apply different fusion methods to handle inputs with multiple images. With our method, the quality of the generated reviews is more stable, and food accuracy improves by about 36% in relative terms. Our method also performs better on the BLEU metric, which measures text quality. (A brief illustrative sketch of the feature-fusion and BLEU-evaluation steps follows this metadata table.) | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T01:15:02Z (GMT). No. of bitstreams: 1 ntu-106-R04944009-1.pdf: 42237744 bytes, checksum: ecb5d7567a33f22f322ebb178f76b0aa (MD5) Previous issue date: 2017 | en |
dc.description.tableofcontents | Abstract (in Chinese) iii
Abstract iv
1 Introduction 1
1.1 Importance of reviews 1
1.2 Challenges of structural reviews 2
1.3 Dataset and proposed method 3
1.4 Contributions 4
2 Related Work 5
2.1 Review related research 5
2.2 Single image captioning 5
2.3 Multiple images captioning 6
3 Cross-Media Dataset 7
3.1 Review data 7
3.2 Image data 8
3.3 Combine cross-media data 9
4 Proposed Method 10
4.1 Input images 10
4.1.1 Filtering images with people 10
4.1.2 Filtering non-food images 11
4.2 Target reviews 12
4.3 First training stage 13
4.3.1 Testing on multiple images 13
4.4 Second training stage 14
5 Experiments 16
5.1 Standard measurements 16
5.2 Supporting evaluation 18
5.3 Retrieval 19
6 Conclusions and Future Work 22
Bibliography 23 | |
dc.language.iso | en | |
dc.title | 利用跨多媒體擬似配對資料於影像群生成食物評論 | zh_TW |
dc.title | Food Review Generation for a Set of Images by Leveraging Cross-Media Pseudo Pairs | en |
dc.type | Thesis | |
dc.date.schoolyear | 105-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 陳文進,李國徵 | |
dc.subject.keyword | Review, Two-stage Training, Pseudo Pair | zh_TW |
dc.subject.keyword | Review, Two-stage Training, Pseudo Pair | en |
dc.relation.page | 25 | |
dc.identifier.doi | 10.6342/NTU201703180 | |
dc.rights.note | Paid authorization | |
dc.date.accepted | 2017-08-15 | |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
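
As a rough, non-authoritative illustration of two steps the abstract describes, the sketch below shows (1) fusing per-image CNN features into a single vector for a caption decoder, and (2) scoring a generated review against reference reviews with BLEU. This is not the thesis implementation: the feature dimension, the fusion options, and the toy sentences are all assumptions made for the example.

```python
# Minimal sketch, not the thesis code. Assumes each image is already
# encoded as a D-dim CNN feature vector (D = 1024 is an arbitrary choice).
import numpy as np
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction


def fuse_features(image_features, method="mean"):
    """Fuse an (N, D) array of per-image features into one D-dim vector."""
    feats = np.asarray(image_features)
    if method == "mean":   # average pooling over the image set
        return feats.mean(axis=0)
    if method == "max":    # element-wise max pooling
        return feats.max(axis=0)
    raise ValueError(f"unknown fusion method: {method}")


# Three input photos of one restaurant, fused into one decoder input.
features = np.random.rand(3, 1024)
fused = fuse_features(features, method="mean")

# BLEU: a tokenized hypothesis scored against tokenized reference reviews.
references = [[["the", "ramen", "broth", "is", "rich", "and", "savory"]]]
hypotheses = [["the", "ramen", "broth", "is", "rich"]]
score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```

Mean fusion weights every photo equally, while max fusion lets the most salient features dominate; the "different fusion methods" mentioned in the abstract are choices of this kind.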
Appears in Collections: | Graduate Institute of Networking and Multimedia
Files in This Item:
File | Size | Format |
---|---|---|---|
ntu-106-1.pdf (currently not authorized for public access) | 41.25 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.