NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/16181
Full metadata record
DC field: value [language]
dc.contributor.advisor: 徐宏民 (Winston H. Hsu)
dc.contributor.author: Guan-Long Wu [en]
dc.contributor.author: 吳冠龍 [zh_TW]
dc.date.accessioned: 2021-06-07T18:04:05Z
dc.date.copyright: 2012-08-01
dc.date.issued: 2012
dc.date.submitted: 2012-07-29
dc.identifier.citation: [1] D. Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci., pages 671–687, 2003.
[2] D. M. Chen, N.-M. Cheung, S. S. Tsai, V. Chandrasekhar, G. Takacs, R. Vedantham, R. Grzeszczuk, and B. Girod. Dynamic selection of a feature-rich query frame for mobile video retrieval. In ICIP, pages 1017–1020, 2010.
[3] C. Fellbaum. WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, 1998.
[4] E. Gabrilovich and S. Markovitch. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of The Twentieth International Joint Conference for Artificial Intelligence, pages 1606–1611, Hyderabad, India, 2007.
[5] B. Girod, V. Chandrasekhar, D. M. Chen, N.-M. Cheung, R. Grzeszczuk, Y. Reznik, G. Takacs, S. S. Tsai, and R. Vedantham. Mobile visual search. IEEE Signal Processing Magazine, 2011.
[6] C. Havasi, R. Speer, and J. Alonso. Conceptnet 3: a flexible, multilingual semantic network for common sense knowledge. In Recent Advances in Natural Language Processing, September 2007.
[7] J. He, T.-H. Lin, J. Feng, and S.-F. Chang. Mobile product search with bag of hash bits. In Proceedings of the 19th ACM international conference on Multimedia, MM ’11, pages 839–840, 2011.
[8] J. Hoffart, F. Suchanek, K. Berberich, E. Kelham, G. de Melo, G. Weikum, F. Suchanek, G. Kasneci, M. Ramanath, and A. Pease. YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia. Communications of the ACM, 52(4):56–64, 2009.
[9] R. Hong, G. Li, L. Nie, J. Tang, and T.-S. Chua. Exploring large scale data for multimedia qa: an initial study. In Proceedings of the ACM International Conference on Image and Video Retrieval, CIVR ’10, pages 74–81, 2010.
[10] H. Jégou, M. Douze, C. Schmid, and P. Pérez. Aggregating local descriptors into a compact image representation. In IEEE Conference on Computer Vision & Pattern Recognition, pages 3304–3311, Jun. 2010.
[11] M. Journée, Y. Nesterov, P. Richtárik, and R. Sepulchre. Generalized power method for sparse principal component analysis. J. Mach. Learn. Res., 11:517–553, Mar. 2010.
[12] M. Sahami and T. D. Heilman. A web-based kernel function for measuring the similarity of short text snippets. In Proceedings of the 15th international conference on World Wide Web, WWW ’06, pages 377–386, 2006.
[13] J. Song, Y. Yang, Z. Huang, H. T. Shen, and R. Hong. Multiple feature hashing for real-time large scale near-duplicate video retrieval. In Proceedings of the 19th ACM international conference on Multimedia, MM ’11, pages 423–432, 2011.
[14] P. Sorg. research-esa - an implementation of explicit semantic analysis for research. http://code.google.com/p/research-esa/.
[15] Y.-C. Su, G.-L. Wu, T.-H. Chiu, W. H. Hsu, and K.-W. Chang. Evaluating gaussian like image representations over local features. In ICME, 2012.
[16] J. Wang, S. Kumar, and S.-F. Chang. Sequential projection learning for hashing with compact codes. In International Conference on Machine Learning (ICML), Haifa, Israel, June 2010.
[17] L. Xie, A. Natsev, J. R. Kender, M. Hill, and J. R. Smith. Visual memes in social media: tracking real-world news in youtube videos. In Proceedings of the 19th ACM international conference on Multimedia, MM ’11, pages 53–62, 2011.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/16181
dc.description.abstract: Retrieving relevant videos from a large corpus is a long-standing research problem, and doing so on mobile devices brings additional technical challenges. This paper addresses two key issues for mobile applications on user-generated videos. The first is the lack of a good relevance measure, due to the unconstrained nature of online videos. The second is the strict efficiency requirement, due to the limited resources of mobile devices and the stringent bandwidth and delay constraints between the device and the video server. We propose a novel approach for each problem. First, we carry out Pseudo Label Mining based on Explicit Semantic Analysis to generate high-quality pairs of similar videos. This method connects video metadata with Wikipedia semantics and alleviates the need for expensive annotated data. Second, we propose a novel sparse projection method to address the efficiency challenge. It learns a discriminative compact representation that drastically reduces transmission cost. With fewer than 10% non-zero elements in the projection matrix, it also reduces computational and storage cost. Experimental results on 100k videos show that the proposed algorithm is competitive in MAP with the state-of-the-art semi-supervised hashing method, which is not applicable on mobile platforms. The average query time on 100k videos is only 0.592 seconds. [en]
dc.description.provenance: Made available in DSpace on 2021-06-07T18:04:05Z (GMT). No. of bitstreams: 1
ntu-101-R99944013-1.pdf: 1753069 bytes, checksum: a21d7a78b0b946d69e710ea19629e8de (MD5)
Previous issue date: 2012 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
1 Introduction 1
2 Related Work 3
3 Sparse Projection Learning (SHP) 5
  3.1 Problem Formulation 5
  3.2 Algorithm Design 6
  3.3 Application in Mobile-based Video Retrieval 7
4 Pseudo Label Mining 9
  4.1 Explicit Semantic Analysis (ESA) 9
  4.2 Web-based Kernel Function (WKF) 10
  4.3 Pseudo Label Mining Based on Context Data of Videos 11
5 Experiments and Discussion 12
  5.1 Datasets 12
  5.2 Experiment Settings 13
  5.3 Retrieval Performance Comparison on NUS-WEBV dataset using Human Annotation 15
  5.4 Precision of Possible Semantic-related Video Selection of Explicit Semantic Analysis and Web-based Kernel Function 16
  5.5 Sparse Projection Learning based on Pseudo Label Mining 17
  5.6 Efficiency 21
6 Conclusion 22
Bibliography 24
dc.language.iso: en
dc.title: 稀疏雜湊學習法與語意標註探勘應用於行動裝置上之大規模影片搜尋 [zh_TW]
dc.title: Large-scale Mobile-based Video Retrieval with Sparse Projection Learning and Pseudo Label Mining [en]
dc.type: Thesis
dc.date.schoolyear: 100-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 陳良弼, 劉庭祿, 林軒田
dc.subject.keyword: 半監督式雜湊法, 稀疏, 行動裝置, 大規模影片搜尋, 外顯語意分析 [zh_TW]
dc.subject.keyword: semi-supervised hashing, sparsity, mobile-based video retrieval, large-scale, explicit semantic analysis [en]
dc.relation.page: 26
dc.rights.note: Not authorized (未授權)
dc.date.accepted: 2012-07-30
dc.contributor.author-college: College of Electrical Engineering and Computer Science (電機資訊學院) [zh_TW]
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所) [zh_TW]
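
The abstract in this record describes a sparse projection that maps video features to a compact code with fewer than 10% non-zero entries in the projection matrix. The record does not describe how the thesis learns that projection, so the following is only a minimal sketch of the general idea, using a random sparse sign projection loosely in the spirit of reference [1]. It is not the thesis's algorithm. All names (`make_sparse_projection`, `encode`, `hamming`) and all parameter values are hypothetical and chosen for illustration.

```python
import random


def make_sparse_projection(n_bits, dim, density=0.08, seed=0):
    """Build a random sparse projection (an assumption, not the learned one).

    Each of the n_bits rows stores only its non-zero entries as
    (feature_index, weight) pairs, with weights of +1.0 or -1.0.
    """
    rng = random.Random(seed)
    nnz = max(1, int(density * dim))  # non-zero entries kept per row
    rows = []
    for _ in range(n_bits):
        indices = sorted(rng.sample(range(dim), nnz))
        rows.append([(j, rng.choice((-1.0, 1.0))) for j in indices])
    return rows


def encode(rows, x):
    """Project feature vector x and keep one sign bit per row."""
    code = 0
    for row in rows:
        # Only the stored non-zero entries are touched.
        s = sum(w * x[j] for j, w in row)
        code = (code << 1) | (1 if s >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    dim, n_bits = 200, 32
    rows = make_sparse_projection(n_bits, dim)
    rng = random.Random(42)
    query = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    other = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    # A device would send only the n_bits-bit code, not the full vector.
    print(hamming(encode(rows, query), encode(rows, other)))
```

In this sketch each output bit costs a sum over only the stored non-zero weights, and the query is reduced to a single `n_bits`-bit integer before it leaves the device. That illustrates the kind of computation, storage, and transmission savings the abstract refers to.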
Appears in collections: Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所)

Files in this item:
File: ntu-101-1.pdf (currently not authorized for public access)
Size: 1.71 MB
Format: Adobe PDF