A Semantic Framework for Object-Based and Event-Based Video Content Adaptation

Wen-Huang Cheng; 鄭文皇

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9784

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	吳家麟
dc.contributor.author	Wen-Huang Cheng	en
dc.contributor.author	鄭文皇	zh_TW
dc.date.accessioned	2021-05-20T20:41:10Z	-
dc.date.available	2008-07-24
dc.date.available	2021-05-20T20:41:10Z	-
dc.date.copyright	2008-07-24
dc.date.issued	2008
dc.date.submitted	2008-07-22
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9784	-
dc.description.abstract	在普及媒體環境中，內容調適是用以實現普遍多媒體存取的一種關鍵技術。具體而言，其藉由多媒體內容的轉換以使轉換後之多媒體內容符合相對應的使用環境。從個人化應用的角度來看，有效的內容調適可得益於對多媒體內容語意的深刻理解。因此，本論文的目標即在於提供一套具系統化之研究方法以提昇自動化多媒體調適的語意層次。在本論文中，我們提出一個通用型之調適架構以及相對應之基本設計原則。藉由導入特定領域知識，我們適度跨越存在於低階可計算特徵值與高階語意概念間之語意鴻溝，並藉此有效開發與其所屬之調適運算以求得使用者多媒體經驗之最佳化。在前述所提出的架構之上，我們的研究聚焦於視訊內容之語意調適，其中具體探討兩種用於語意模型化之方法，分別是以物件為基礎與以事件為基礎之方法。在以物件為基礎之方法中，我們建構一個可用於定位視訊中具語意性物件之視覺模型，以提昇使用者在小螢幕行動裝置上觀賞高畫質專業影片時之瀏覽經驗。另一方面，在以事件為基礎之方法中，我們同時利用視訊中之視覺與聽覺資訊，以描繪具語意性事件之多媒體特性，並應用於滿足使用者對於長時間家庭影片之實際瀏覽需要。此兩個系統可視為前述所提出調適架構之具體技術實現，並可藉此顯現自動化高階語意分析之可行性與有效性。	zh_TW
dc.description.abstract	In pervasive media environments, adaptation is a key technology for supporting universal multimedia access: it transforms multimedia content to fit the usage environment. For personalization, effective adaptation benefits greatly from taking the semantics of the content into account. This dissertation aims to provide systematic approaches that improve automatic multimedia adaptation at the semantic level. It proposes a generic adaptation framework together with its fundamental design principles. By exploiting domain-specific knowledge, we bridge the gap between low-level computational features and high-level semantic concepts, so that the associated adaptation operations can be designed to maximize the user’s multimedia experience. Based on the proposed framework, our work focuses on the semantic adaptation of video content and investigates two alternative approaches to semantics modeling: object-based and event-based. In the object-based approach, a visual model is constructed to locate semantic video objects, improving the user’s experience of browsing high-quality professional videos on devices with small displays. In the event-based approach, both visual and aural information are exploited to characterize semantic video events, which helps the user navigate hours-long home videos. The two systems can be viewed as technical realizations of the proposed adaptation framework, and they demonstrate the feasibility and effectiveness of automatic high-level semantic analysis.	en
dc.description.provenance	Made available in DSpace on 2021-05-20T20:41:10Z (GMT). No. of bitstreams: 1 ntu-97-D93944001-1.pdf: 5363958 bytes, checksum: 4530a0e8874ca164617589ee08e90633 (MD5) Previous issue date: 2008	en
dc.description.tableofcontents	Acknowledgements; Curriculum Vita; Abstract; List of Figures; List of Tables; 1 Introduction; 1.1 Motivation; 1.2 Semantic Multimedia Content Adaptation; 1.2.1 A Generic Framework; 1.2.2 From Signal to Semantic Levels; 1.2.3 Adaptive Optimization; 1.3 Problem Statement; 1.4 Summary of Contributions; 1.4.1 Framework Development for Semantic Adaptation; 1.4.2 Video Adaptation Based on Semantic Objects; 1.4.3 Video Adaptation Based on Semantic Events; 1.5 Organization of the Dissertation; 2 Basics and Literature Review; 2.1 Semantic Concept Ontology; 2.1.1 Typical Examples; 2.1.2 Relationship Building; 2.2 Semantic Concept Analysis; 2.3 Semantic Content Adaptation; 2.3.1 Adaptation Taxonomy; 2.3.2 Adaptation Strategy; 2.4 Framework Correspondence; 2.4.1 Semantic Object Based Video Adaptation; 2.4.2 Semantic Event Based Video Adaptation; 2.5 Summary; 3 Semantic Object Based Video Adaptation; 3.1 Introduction; 3.2 Related Work; 3.3 User-Interest Finding; 3.3.1 Visual Attention Modeling; 3.3.2 Video ROIs Determination; 3.4 Content Recomposition; 3.4.1 UIOs Extraction; 3.4.2 Background Repairing; 3.4.3 Media Aesthetics Based Video Objects Reintegration; 3.5 Experimental Results; 3.5.1 Recomposition Results; 3.5.2 User Studies; 3.5.3 Time Efficiency Analysis; 3.6 Summary; 4 Semantic Event Based Video Adaptation; 4.1 Introduction; 4.2 Related Work; 4.3 Wedding Event Taxonomy; 4.4 Event Features Development and Extraction; 4.4.1 Key Observations; 4.4.2 Selected Features for Event Modeling; 4.5 Wedding Modeling; 4.5.1 Wedding Event Modeling; 4.5.2 Event Transition Modeling; 4.5.3 Wedding Segmentation Using HMM; 4.6 Experimental Results; 4.6.1 Event Recognition Analysis; 4.6.2 Video Segmentation Analysis; 4.6.3 Performance Comparisons with LCRF Models; 4.6.4 Extension to the Scenario with Known Event Ordering; 4.7 Summary; 5 Conclusions and Future Work; 5.1 Conclusions; 5.2 Future Research; Bibliography
dc.language.iso	en
dc.title	以物件與事件為基礎之視訊內容調適架構	zh_TW
dc.title	A Semantic Framework for Object-Based and Event-Based Video Content Adaptation	en
dc.type	Thesis
dc.date.schoolyear	96-2
dc.description.degree	Doctoral (博士)
dc.contributor.oralexamcommittee	陳良弼,杭學鳴,李琳山,廖弘源,許永真,李素瑛
dc.subject.keyword	多媒體內容調適,語意分析,視訊物件偵測,視訊事件偵測	zh_TW
dc.subject.keyword	Multimedia Content Adaptation,Semantic Analysis,Video Object Detection,Video Event Detection	en
dc.relation.page	150
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2008-07-24
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊網路與多媒體研究所	zh_TW
Appears in Collections:	Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所)

Files in This Item:

File	Size	Format
ntu-97-1.pdf	5.24 MB	Adobe PDF	View/Open


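Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.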