Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3944
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 徐宏民 | |
dc.contributor.author | Wen-Yu Lee | en |
dc.contributor.author | 李文瑜 | zh_TW |
dc.date.accessioned | 2021-05-13T08:38:59Z | - |
dc.date.available | 2019-06-11 | |
dc.date.available | 2021-05-13T08:38:59Z | - |
dc.date.copyright | 2016-06-11 | |
dc.date.issued | 2016 | |
dc.date.submitted | 2016-04-27 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3944 | - |
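dc.description.abstract | In recent years, the rapid growth of social networks has gradually changed people's lives, and more and more people habitually share moments of their daily lives on social networking sites. According to statistics, millions of media items, such as photos, are uploaded to social networking sites every day on average, and such a large volume of media data very likely contains rich information. In this dissertation, we aim to extract useful information from large amounts of media data and use it to solve common problems in people's daily lives.
Starting from a single photo, we aim to build an information system that can find where the photo was taken, what text tags other people would likely give to describe its content, and whether there are popular activities or events near the place where it was taken. Such information has many everyday applications. For example, while traveling, we often want to know where we are, what we are looking at, or what activities or events usually take place nearby; we can simply take a photo of what is in front of us and submit it to the information system, which then quickly provides this information. A common past approach to finding where a photo was taken is to use the photo's visual features and geo-tags and compare them against existing data to infer the most likely location. However, when the photo was taken indoors or the location contains multiple buildings, the accuracy of such methods may drop. To address this, this dissertation further incorporates check-in data to assist localization and proposes an effective image regrouping technique to improve the accuracy of the inferred photo location. For finding which text tags a photo's content would typically be given, a common technique is to use an existing collection of tagged photos to predict the likely tags of untagged photos. One effective approach builds a graph from the visual similarity between photos, with photos as nodes and similarities as the weights of the edges connecting the nodes; then, taking the edge weights into account, tags are propagated from the nodes of tagged photos to the nodes of untagged photos. This approach often remains effective even when untagged photos outnumber tagged ones. Because most past methods propagate only a single tag at a time and must be run multiple times for multiple tags, and because the number of existing photos keeps growing, the efficiency of the method has become an important issue. In view of this, this dissertation proposes a multi-label propagation method based on distributed computing principles to propagate multiple tags simultaneously and efficiently. For finding popular events, past methods mostly focus on observing changes within a single media dataset, such as changes in how frequently terms are discussed, where an abnormal change indicates that an activity or event may be occurring. Considering that different datasets have different properties and may complement one another, this dissertation proposes an effective two-stage framework that combines stream-type datasets with check-in-type datasets to find the time, place, and content of popular activities or events. Since the likely location of a photo can already be found, popular activities or events near it can be found by comparing the likely photo location with the locations of those events. The proposed methods have all been shown to be effective through experimental analysis. Finally, this dissertation proposes some possible directions for future research. | zh_TW |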
dc.description.abstract | Social media have changed the world and our lives. Every day, millions of media items are uploaded to social-sharing websites. The goal of this research is to discover and summarize large amounts of media data from emerging social media into information of interest. Our basic idea is to perform multi-modal learning on given data, leveraging user-contributed data from cross-domain social media. Specifically, given a photo, we intend to discover geographical information, people's descriptions or comments, and events of interest closely related to the photo. This information can then be used for various purposes, such as serving as a real-time guide for tourists to improve the quality of tourism. Accordingly, this dissertation studies modern challenges in image location identification, image annotation, and event discovery, and presents promising ways to address them.
For image location identification, most previous works directly integrated visual features and geo-tags of the given photos. The performance of existing approaches, however, can be limited if the given photos were taken indoors and/or contain a number of buildings in close proximity. As a solution, this dissertation unifies visual features, geo-tags, and check-in data, and further presents an image cluster refinement approach for image location identification. For image annotation, label propagation is widely used to annotate photos based on similarity graphs of photos, and most previous works focused on single-label propagation. Although multi-label propagation is expected to be more efficient for annotation than performing single-label propagation several times, it may increase the computational complexity. Further, image datasets continue to grow, which increases the problem complexity. As a solution, this dissertation presents a scalable multi-label propagation approach that leverages the power of distributed computing. For event discovery, most previous works investigated a single media stream. Mining multiple media streams can potentially achieve better performance than mining one media stream alone, but can be more challenging. As a solution, this dissertation presents a two-stage framework that combines a flow-based media dataset and a check-in-based media dataset for events-of-interest discovery. Experimental results on real media datasets show the effectiveness of all of the proposed approaches. Finally, this dissertation provides some possible directions for future studies. | en |
dc.description.provenance | Made available in DSpace on 2021-05-13T08:38:59Z (GMT). No. of bitstreams: 1 ntu-105-D99944007-1.pdf: 2391682 bytes, checksum: e9470bd6333e73cec6047bd295ed37e1 (MD5) Previous issue date: 2016 | en |
dc.description.tableofcontents | Acknowledgements
Abstract (Chinese)
Abstract
List of Tables
List of Figures
Chapter 1. Introduction
1.1 Modern Challenges
1.1.1 Image Location Identification
1.1.2 Image Annotation
1.1.3 Event Discovery
1.2 Overview of the Dissertation
1.2.1 Geo-Social Media Mining for Image Location Identification
1.2.2 Graph-based Semi-Supervised Learning for Image Annotation
1.2.3 Cross-Domain Media Learning for Event Discovery
1.3 Organization of the Dissertation
Chapter 2. Geo-Social Media Mining for Image Location Identification
2.1 Introduction
2.2 City-View Image Location Identification
2.2.1 Data Collection
2.2.2 Location Identification
2.3 Macro-Cluster Regrouping
2.3.1 A Naive Approach
2.3.2 Regrouping by Proximity
2.4 Sparse Coding for Macro Clusters
2.4.1 Formulation
2.4.2 Dictionary Selection within a Macro Cluster
2.5 Experiments
2.5.1 Results with Manual-Annotated Ground Truth
2.5.2 Results with Automatic-Annotated Ground Truth
Chapter 3. Graph-based Semi-Supervised Learning for Image Annotation
3.1 Introduction
3.2 Learning System Overview
3.3 Methodologies
3.3.1 Multi-Layer Learning Structure
3.3.2 Multi-Layer Multi-Label Propagation Algorithm
3.3.3 Convergence Criterion of Label Propagation
3.3.4 Tag Refinement
3.4 Experiments
3.4.1 Experimental Setup
3.4.2 Evaluation Criteria
3.4.3 Multi-Layer Multi-Label Propagation
3.4.4 Tag Refinement
3.4.5 Convergence Criterion - Shortest Path
3.4.6 Sensitivity Test
Chapter 4. Cross-Domain Media Learning for Event Discovery
4.1 Introduction
4.2 Problem Formulation
4.3 Proposed Approach
4.3.1 Flow Normalization
4.3.2 Buzz-Score Calculation
4.3.3 Variability-Score Calculation
4.3.4 Spanning-Graph Construction
4.3.5 Weight Determination
4.3.6 Flow Propagation
4.4 Experiments
4.4.1 Experimental Setup
4.4.2 Cross-Dataset Analysis
4.4.3 Evaluation of the Proposed Approach
Chapter 5. Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
Bibliography | |
dc.language.iso | en | |
dc.title | 以複合式模型學習法探究多社群網路媒體之使用者資訊 | zh_TW |
dc.title | Multi-Modal Learning over User-Contributed Content from Cross-Domain Social Media | en |
dc.type | Thesis | |
dc.date.schoolyear | 104-2 | |
dc.description.degree | Doctoral (Ph.D.) | |
dc.contributor.oralexamcommittee | 陳祝嵩,陳信希,陳文進,葉梅珍,余能豪 | |
dc.subject.keyword | 事件挖掘,影像標註,影像地點辨識,複合式模型學習,社群網路媒體,使用者資訊 | zh_TW |
dc.subject.keyword | Event Discovery,Image Annotation,Image Location Identification,Multi-Modal Learning,Social Media,User-Contributed Content | en |
dc.relation.page | 90 | |
dc.identifier.doi | 10.6342/NTU201600215 | |
dc.rights.note | Authorization granted (open access worldwide) | |
dc.date.accepted | 2016-04-28 | |
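dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
Appears in Collections: | Graduate Institute of Networking and Multimedia |
Files in This Item:
File | Size | Format | |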
---|---|---|---|
ntu-105-1.pdf | 2.34 MB | Adobe PDF | View/Open |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.