Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/34311
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 吳家麟(Ja-Ling Wu) | |
| dc.contributor.author | Wei-Ta Chu | en |
| dc.contributor.author | 朱威達 | zh_TW |
| dc.date.accessioned | 2021-06-13T06:02:32Z | - |
| dc.date.available | 2008-06-27 | |
| dc.date.copyright | 2006-06-27 | |
| dc.date.issued | 2006 | |
| dc.date.submitted | 2006-06-21 | |
| dc.identifier.citation | [Adai02] Adair, R.K., “The physics of baseball,” Harper Collins, New York, 2002.
[ASQA06] ASQA, Academia Sinica Question Answering System, http://asqa.iis.sinica.edu.tw/clqa/ [Arij91] Arijon, D., “Grammar of the film language,” Sliman-James Press, 1991. [Arik03] Ariki, Y., Kumano, M., and Tsukada, K., “Highlight scene extraction in real time from baseball live video,” Proceedings of ACM International Workshop on Multimedia Information Retrieval, pp. 209-214, 2003. [Bach96] Bach, J., Fuller, C., Gupta, A., Hampapur, A., Horowitz, B., Humphrey, R., Jain, R., and Shu, C., “The virage image search engine: an open framework for image management,” Proceedings of SPIE Storage and Retrieval for Image and Video Databases, pp. 76-87, 1996. [Bach05] Bach, N.H., Shinoda, K., and Furui, S., “Robust highlight extraction using multi-stream hidden Markov models for baseball video,” Proceedings of IEEE International Conference on Image Processing, vol. 3, pp. 173-176, 2005. [Baba04] Babaguchi, N., Kawai, Y., Ogura, T., and Kitahashi, T., “Personalized abstraction of broadcasted American football video by highlight selection,” IEEE Transactions on Multimedia, vol. 6, no. 4, 2004, pp. 575-586. [Bahi05] Bahill, A.T., Baldwin, D.G., and Venkateswaren, J., “Predicting a baseball’s path,” American Scientist, vol. 93, no. 3, pp. 218-225, 2005. [Bart05] Bartsch, M.A., and Wakefield, G.H., “Audio thumbnailing of popular music using chroma-based representations,” IEEE Transactions on Multimedia, vol. 7, no. 1, pp. 96-104, 2005. [Beni05] Benitez, A.B., “Multimedia knowledge: discovery, classification, browsing, and retrieval,” PhD Thesis Graduate School of Arts and Sciences, Columbia University, 2005. [Bert05] Bertini, M., Del Bimbo, A., and Nunziati, W., “Highlights modeling and detection in sports videos,” Pattern Analysis and Applications, 2005. [Bow02] Bow, S.T., “Pattern Recognition and Image Preprocessing,” Marcel Dekker, 2002. [Brau98] Braudy, L., and Cohen, M., “Film theory and criticism: introductory readings,” Oxford University Press, 1998. 
[Burn03] Burnett, I, Van de Walle, R., Hill, K., Bormans, J., and Pereira, F., “MPEG-21: goals and achievements,” IEEE Multimedia, Oct.-Dec., pp. 60-70, 2003. [Cai03] Cai, R., Lu, L., Zhang, H.-J., Cai, L.H., “Highlight sound effects detection in audio stream,” Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 3, pp. 37-40, 2003. [Cawl00] Cawley, G.C., Support vector machine toolbox, http://theoval.sys.uea.ac.uk/svm/toolbox/, 2000. [Chan98] Chang, S.-F., Chen, W., and Sundaram, H., “Semantic visual templates – linking features to semantics,” Proceedings of IEEE International Conference on Image Processing, vol. 3, pp. 531-535, 1998. [Chan01] Chang, S.-F., Sikora, T., Purl, A., “Overview of MPEG-7 standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, pp. 688-695, 2001. [Chan05] Chang, S.-F., and Vetro, A., “Video adaptation: concepts, technologies, and open issues,” Proceedings of the IEEE, vol. 94, no. 1, pp. 148-158, 2005. [Chen98] Chen, B., Wang, H.-W., Chien, L.-F., and Lee, L.-S., “A*-admissible key-phrase spotting with sub-syllable level utterance verification,” Proceedings of IEEE International Conference on Spoken Language Processing, 1998. [Chen03] Cheng, W.-H., Chu, W.-T., and Wu, J.-L., “Semantic Context Detection based on Hierarchical Audio Models,” Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 109-115, 2003. [Chen04] Chen, H.-W., Kuo, J.-H., Chu, W.-T., and Wu, J.-L., “Action Movies Segmentation and Summarization Based on Tempo Analysis,” Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 251-258, 2004. [Chu04] Chu, W.-T., Cheng, W.-H., Wu, J.-L., and Hsu, Y.-J., “A Study of Semantic Context Detection by Using SVM and GMM Approaches,” Proceedings of the IEEE International Conference on Multimedia & Expo, vol. 3, pp. 1591-1594, 2004. 
[Chu05-1] Chu, W.-T., and Wu, J.-L., “Explicit Semantic Events Detection and Development of Realistic Applications for Broadcasting Baseball Videos,” submitted to Multimedia Tools and Applications, 2005. [Chu05-2] Chu, W.-T., Cheng, W.-H., and Wu, J.-L., “Semantic Context Detection Using Audio Event Fusion,” to appear in the EURASIP Journal on Applied Signal Processing, 2005. [Chu05-3] Chu, W.-T., Cheng, W.-H., Hsu, J. Y.-J., and Wu, J.-L., “Towards Semantic Indexing and Retrieval Using Hierarchical Audio Models,” ACM Multimedia Systems Journal, vol. 10, no. 6, pp. 570-583, 2005. [Chu05-4] Chu, W.-T., and Wu, J.-L., “Integration of Rule-based and Model-based Methods for Baseball Event Detection,” Proceedings of IEEE International Conference on Multimedia & Expo, pp. 137-140, 2005. [Chu05-5] Chu, W.-T., Cheng, W.-H., and Wu, J.-L., “Generative and Discriminative Modeling toward Semantic Context Detection in Audio Tracks,” Proceedings of the 11th International Multimedia Modelling Conference, pp. 38-45, 2005. [Chu05-6] Chu, W.-T., and Wu, J.-L., “Detection of Spirited Incidental Music in Movies,” Proceedings of Workshop on Computer Music and Audio Technology, 2005. [Chu06-1] Chu, W.-T., and Wu, J.-L., “Development of Realistic Applications Based on Explicit Event Detection in Broadcasting Baseball Videos,” Proceedings of International Multimedia Modeling Conference, pp.12-19, 2006. [Chu06-2] Chu, W.-T., Wang, C.-W., and Wu, J.-L. “Extraction of baseball trajectory and physics-based validation for single-view baseball video sequences,” accepted by IEEE International Conference on Multimedia & Expo, 2006. [CMU06] The CMU Pronouncing Dictionary, http://www.speech.cs.cmu.edu/cgi-bin/cmudict [Cove67] Cover, T.M., and Hart, P.E., “Nearest neighbor pattern classification,” IEEE Transactions on Information Theory, vol. 13, pp. 21-27, 1967. 
[Cowi01] Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., and Taylor, J.G., “Emotion recognition in human-computer interaction,” IEEE Signal Processing Magazine, Jan., 2001, pp. 32-80. [CPBL06] Chinese Professional Baseball League, http://www.cpbl.com.tw [Day05] Day, M.-Y., Lee, C.-W., Wu, S.-H., Ong, C.-S., Hsu, W.-L., “An integrated knowledge-based and machine learning approach for chinese question classification,” Proceedings of the IEEE International Conference on Natural Language Processing and Knowledge Engineering, pp. 620-625, 2005. [Dimi02] Dimitrova, N., Zhang, H.-J., Shahraray, B., Huang, T.S., and Zakhor, A., “Applications of video-content analysis and retrieval,” IEEE Multimedia, vol. 3, pp. 42-55, 2002. [Dora02] Dorai, C., and Venkatesh, S., “Media computing: computation media aesthetics,” Kluwer Academic Publisher, 2002. [Duan03] Duan, L.-Y., Xu, M., Chua, T.-S., Tian, Q., and Xu, C.-S. “A mid-level representation framework for semantic sports video analysis,” Proceedings of ACM Multimedia Conference, pp. 33-44, 2003. [Duda01] Duda, R.O., Hart, P.E., Stork, D.G., “Pattern Classification,” John Wiley & Sons, 2001. [Ekin03] Ekin, A., Tekalp, A.M., and Mehrota, R., “Automatic soccer video analysis and summarization,” IEEE Transactions on Image Processing, vol. 12, no. 7, 2003, pp. 796-807. [Fisc95] Fischer, S., Lienhart, R., and Effelsberg, W., “Automatic recognition of film genres,” Proceedings of ACM Multimedia, pp. 295-304, 1995. [Flic95] Flickner, M., Petkovic, D., Steele, D., Yanker, P., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., Gorkani, M., Hafner, J., and Lee, D., “Query by image and video content: the QBIC system,” IEEE Computer, vol. 28, no. 9, pp. 23-32, 1995. [From97] Fromkin, V., and Rodman, R., “An introduction to language,” Harcourt Brace, 6th edition, 1997. 
[Ghah06] Ghahramani, Z., Software written in Matlab, http://www.gatsby.ucl.ac.uk/~zoubin/software.html [GHMM] General Hidden Markov Model Library (GHMM), http://www.ghmm.org [Guez02] Guezic, A., “Tracking pitches for broadcast television,” IEEE Computer, vol. 35, no. 3, pp. 38-43, 2002. [Haer00] Haering, N., Qian, R.J., and Sezan, M.I., “A semantic event-detection approach and its application to detecting hunts in wildlife video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 6, 2000. [Han02] Han, M., Hua, W., Xu, W., and Gong, Y., “An integrated baseball digest system using maximum entropy method,” Proceedings of ACM Multimedia Conference, 2002, pp. 347-350. [Hanj02] Hanjalic, A., “Shot-boundary detection: unraveled and resolved?” IEEE Transactions on Circuits System and Video Technology, vol. 2, pp. 90-105, 2002. [Hsu02] Hsu, C.-W. and Lin, C.-J., “A comparison of methods for multiclass support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415-425, 2002. [Hsu06] Hsu, C.-W., Chang, C.-C., and Lin, C.-J., “A practical guide to support vector machine,” http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf, 2006. [HTK] Hidden Markov Model Toolkit (HTK), http://htk.eng.cam.ac.uk/ [Hua02] Hua, W., Han, W., and Gong, Y., “Baseball scene classification using multimedia features,” Proceedings of IEEE International Conference on Multimedia & Expo, 2002, pp. 821-824. [Huan01] Huang, X., Acero, A., and Hon, H.-W., “Spoken language processing: a guide to theory, algorithm, and system development,” Prentice Hall, 2001. [Hyva01] Hyvarinen, A., Karhunen, J., and Oja, E., “Independent Component Analysis,” John Wiley & Sons, 2001. [Jain00] Jain, A.K., Duin, R.P.W., and Mao, J., “Statistical pattern recognition: a review,” IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, 2000. 
[Khot90] Khotanzad, A., and Hong, Y.-H., “Invariant image recognition by Zernike moments,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 5, pp. 489-497, 1990. [Kitt98] Kitter, J., Hatef, M., Duin, R.D.W., and Matas, J., “On combining classifiers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 3, pp. 226-239, 1998. [Lay06] Lay, J.A., and Guan, L., “Semantic retrieval of multimedia by concept language,” IEEE Signal Processing Magazine, vol. 23, no. 2, pp. 115-123, 2006. [Lee03] Lee, H.Y., Lee, H.K., and Ha, Y.H., “Spatial color descriptor for image retrieval and video segmentation,” IEEE Transactions on Multimedia, vol. 5, no. 3, pp. 358-367, 2003. [Leon04] Leonardi, R., Migliorati, P., and Prandini, M., “Semantic indexing of soccer audio-visual sequences: a multimodal approach based on controlled Markov chains,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 5, 2004, pp. 634-643. [Li00] Li, S.Z., “Content-based classification and retrieval of audio using the nearest feature line method,” IEEE Transactions on Speech and Audio Processing, vol. 8, no. 5, pp. 619-625, 2000. [Li01] Li, Y., Zhong, T., and Tretter, D., “An overview of video abstraction techniques,” Technical Report, HPL-2001-191, Hewlett-Packard Company, 2001. [Li04] Li, B., Errico, J.H., Pan, H., and Sezan, I., “Bridging the semantic gap in sports video retrieval and summarization,” Journal of Visual Communication and Image Representation, vol. 15, 2004, pp. 393-424. [Lian04] Liang, C.-H., Kuo, J.-H., Chu, W.-T., and Wu, J.-L., “Semantic Units Detection and Summarization of Baseball Videos,” Proceedings of the IEEE International Midwest Symposium on Circuits and Systems, vol. 1, pp. 297-300, 2004. [Lian05] Liang, C.-H., Chu, W.-T., Kuo, J.-H., Wu, J.-L., and Cheng, W.-H., “Baseball Event Detection Using Game-Specific Feature Sets and Rules,” Proceedings of IEEE International Symposium on Circuits and Systems, pp. 
3829-3832, 2005. [LIBS] LIBSVM – A library for support vector machine, http://www.csie.ntu.edu.tw/~cjlin/libsvm/index.html, 2001. [Lien99] Lienhart, R., “Comparison of automatic shot boundary detection algorithms,” Proceedings of SPIE Storage and Retrieval for Still Image and Video Databases VII, vol. 3656, pp. 290-301, 1999. [Lin03] Lin, W.-H., Hauptmann, A., “Meta-classification: Combining multimodal classifiers,” Zaiane, O.R., Simoff, S., Djeraba, C. (eds.) Mining Multimedia and Complex Data, Springer, Berlin Heidelberg New York, pp. 217-231, 2003. [Liu98] Liu, Z., Huang, J., and Wang, Y., “Classification of TV programs based on audio information using hidden Markov model,” Proceedings of the IEEE Signal Processing Society Workshop on Multimedia Signal Processing, pp. 27-32, 1998. [Lu02] Lu, L., Zhang, H.-J., and Jiang, H., “Content analysis for audio classification and segmentation,” IEEE Transactions on Speech and Audio Processing, vol. 7, pp. 504-516, 2002. [Lu03] Lu, L., and Zhang, H.-J., “Automatic extraction of music snippets,” Proceedings of the ACM Multimedia Conference, pp. 140-147, 2003. [Madd06] Maddage, N.C., “Automatic structure detection for popular music,” IEEE Multimedia, vol. 13, no. 1, pp. 65-77, 2006. [Mart02-1] Martinez, J.M., Koenen, R., and Pereira, F., “MPEG-7: the generic multimedia content description standard, part 1,” IEEE Multimedia, vol. 9, no. 2, pp. 78-87, 2002. [Mart02-2] Martinez, J.M. “Standards - MPEG-7 overview of MPEG-7 description tools, part 2,” IEEE Multimedia, vol. 9, no. 3, pp. 83-93, 2002. [MLB06] Major League Baseball, http://www.mlb.com [Moha99] Mohan, R., Smith, J.R., and Li, C.-S., “Adapting multimedia content for universal access,” IEEE Transactions on Multimedia, vol. 1, no. 1, pp. 104-114, 1999. [Mona00] Monaco, J., “How to read a film: the world of movies, media, and multimedia: language, history, theory” Oxford University Press, 2000. 
[Monc03] Moncrieff, S., Venkatesh, S., and Dorai, C., “Horror film genre typing and scene labeling via audio analysis,” Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 2, pp. 193-196, 2003. [Mulh03] Mulhem, P., Kankanhalli, M.S., Yi, J., and Hassan, H., “Pivot vector space approach for audio-video mixing,” IEEE Multimedia, vol. 10, no. 2, pp. 28-40, 2003. [Murp05] Murphy, K., Hidden Markov model (HMM) Toolbox for Matlab, http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html [Naph98] Naphade, M.R., Kristjansson, T., Frey, B., Huang, T.S., “Probabilistic multimedia objects (multijects): a novel approach to video indexing and retrieval in multimedia system,” Proceedings of the IEEE International Conference on Image Processing, vol. 3, pp. 536-540, 1998. [Naph01] Naphade, M.R., “A probabilistic framework for mapping audio-visual features to high-level semantics in terms of concepts and context,” PhD dissertation, University of Illinois at Urbana-Champaign, 2001. [Naph02] Naphade, M.R., and Huang, T.S., “Extracting semantics from audiovisual content: the final frontier in multimedia retrieval,” IEEE Transactions on Neural Network, vol. 13, no. 4, pp. 793-810, 2002. [Nepa01] Nepal, S., Srinivasan, U., and Reynolds, G., “Automatic detection of goal segments in basketball videos,” Proceedings of ACM Multimedia Conference, pp. 261-269, 2001. [Ngo05] Ngo, C.-W., Ma, Y.-F., and Zhang, H.-J., “Video summarization and scene detection by graph modeling,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 2, pp. 296-305, 2005. [Pfei96] Pfeiffer, S., Lienhart, R., Fischer, S., and Effelsberg, W., “Abstracting digital movies automatically,” Journal of Visual Communication and Image Representation, vol. 4, pp. 345-353, 1996. [Plat00] Platt, J.C., Cristianini, N., and Shawe-Taylor, J., “Large margin DAGs for multiclass classification,” Advances in Neural Information Processing Systems, Cambridge, MA: MIT Press, vol. 12, pp. 
547-553, 2000. [Rabi89] Rabiner, L.R., “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989. [Rabi93] Rabiner, L., and Juang, B.-H., “Fundamentals of speech recognition,” Prentice Hall, 1993. [Reim06] Reimers, U.H. , “DVB-the family of international standards for digital video broadcasting,” Proceedings of IEEE, vol. 94, no. 1, pp. 173-182, 2006. [Rui98] Rui, Y., Huang, T.S., Ortega, M., and Mehrotra, S., “Relevance feedback: a power tool in interactive content-based image retrieval,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, no. 5, pp. 644-655, 1998. [Rui99] Rui, Y., Huang, T.S., and Chang, S.-F., “Image retrieval: current techniques, promising directions and open issues,” Journal of Visual Communication and Image Representation, vol. 10, no. 4, pp. 39-62, 1999. [Rui00] Rui, Y., Gupta, A., and Acero, A., “Automatically extracting highlights for tv baseball programs,” Proceedings of ACM Multimedia Conference, 2000, pp. 105-115. [Seth03] Sethy, A., Narayanan, S., “Split-lexicon based hierarchical recognition of speech using syllable and word level acoustic units,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 772-775, 2003. [Shih03] Shih, H.-C., and Huang, C.-L., “A semantic network modeling for understanding baseball video,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 820-823, 2003. [Smeu00] Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A., and Jain, R., “Content-based image retrieval at the end of early year,” IEEE Transactions of Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000. [Smit96] Smith, J.R., and Chang, S.-F., “Visualseek: a fully automated content-based image query system,” Proceedings of ACM Multimedia, pp. 87-98, 1996. 
[Smit03] Smith, J.R., Naphade, M., and Natsev, A., “Multimedia semantic indexing using model vectors,” Proceedings of ICME, vol. 2, pp. 445-448, 2003. [Snoe06] Snoek, C.G.M., Worring, M., and Smeulders, A.W.M., “Early versus late fusion in semantic video analysis,” Proceedings of ACM Multimedia, pp. 399-402, 2005. [Soun06] SoundIdeas Sound Effects Library, http://www.sound-ideas.com/ [Stol97] Stolfo, S., Prodromidis, A., Tselepis, S., Lee, W., Fan, D., Chan, P., “JAM: Java agents for meta-learning over distributed databases,” Proceedings of the International Conference on Knowledge Discovery and Data Mining, pp. 74-81, 1997. [Theo04] Theobalt, C., Albrecht, I., Haber, J., Magnor, M., Seidel, H.-P., “Pitching a baseball: tracking high-speed motion with multi-exposure images,” Proceedings of ACM SIGGRAPH, pp. 540-547, 2004. [Tjon04] Tjondronegoro, D., Chen, Y.-P. P., and Pham, B., “Integrating highlights for more complete sports video summarization,” IEEE Multimedia, Oct.-Dec., 2004, pp. 22-37. [TREC06] TREC Video Retrieval Evaluation, http://www-nlpir.nist.gov/projects/trecvid/ [Tsen04] Tseng, B.L., Lin, C.-Y., and Smith, J.R., “Using MPEG-7 and MPEG-21 for personalizing video,” IEEE Multimedia, vol. 11, no. 1, pp. 42-52, 2004. [Tzan02] Tzanetakis, G., and Cook, P., “Musical genre classification of audio signals,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, pp. 293-302, 2002. [Vapn98] Vapnik, V.N., “Statistical Learning Theory,” Wiley, New York, 1998. [Vetr05] Vetro, A., and Trimmerer, C., “Digital item adaptation: overview of standardization and research activities,” IEEE Transactions on Multimedia, vol. 7, no. 3, pp. 418-426, 2005. [Vide06] Videoland Sports Channel, http://sport.videoland.com.tw/ [Wact96] Wactlar, H., Kanade, T., Smith, M, and Stevens, S., “Intelligent access to digital video: the Informedia project,” IEEE Computer, vol. 29, no. 5, pp. 46-52, 1996. 
[Wang00] Wang, Y., Liu, Z., Huang, J.C., “Multimedia content analysis using both audio and visual cues,” IEEE Signal Processing Magazine, vol. 17, no. 6, pp. 12-36, 2000. [Wang04-1] Wang, L., Lew, M., and Xu, G., “Offense based temporal segmentation for event detection in soccer video,” Proceedings of ACM International Workshop on Multimedia Information Retrieval, pp. 259-266, 2004. [Wang04-2] Wang, J., Xu, C., Chng, E., Wan, K., and Tian, Q., “Automatic replay generation for soccer video broadcasting,” Proceedings of ACM Multimedia Conference, pp. 32-39, 2004. [Wang04-3] Wang, J., Xu, C., Chng, E., and Tian, Q., “Sports highlight detection from keyword Sequences using HMM,” Proceedings of IEEE International Conference on Multimedia and Expo, pp. 599-602, 2004. [Welc04] Welch, G., Bishop, G., “An introduction to the Kalman filter,” Technical Report no. TR 95-041, University of North Carolina at Capel Hill, 2004. [Well05] Welling, M. “Support vector machines,” Lecture notes in http://www.ics.uci.edu/~welling/teaching/KernelsICS273B/Kernels.html, 2005. [Xie04] Xie, L., Xu, P., Chang, S.-F., Divakaran, A., and Sun, H., “Structure analysis of soccer video with domain knowledge and hidden Markov models,” Pattern Recognition Letters, vol. 25, no. 7, 2004, pp. 767-775. [Xion03] Xiong, Z., Radhakrishnan, R., Divakaran, A., and Huang, T.S., “Audio events detection based highlights extraction from baseball, golf and soccer games in a unified framework,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 632-635, 2003. [Xu03] Xu, G., Ma, Y.-F., Zhang, H.-J., and Yang, S., “An HMM based semantic analysis framework for sports game event detection,” Proceedings of IEEE International Conference on Image Processing, vol. 1, 2003, pp. 25-28. 
[Xu04] Xu, H., and Chua, T.-S., “The fusion of audio-visual features and external knowledge for event detection in team sports video,” Proceedings of ACM International Workshop on Multimedia Information Retrieval, 2004, pp. 127-134. [Yeo95] Yeo, B.L., and Liu, B., “Rapid scene change detection on compressed video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, pp. 533-544, 1995. [Yu03] Yu, X., Xu, C., Leong, H.W., Tian, Q., Tang, Q., and Wan, K.W., “Trajectory-based ball detection and tracking with applications to semantic analysis of broadcasting soccer video,” Proceedings of ACM Multimedia Conference, 2003, pp. 11-20. [Yu05] Yu, X., and Farin, D., “Current and emerging topics in sports video processing,” Proceedings of IEEE International Conference on Multimedia & Expo, pp. 526-529, 2005. [Zett99] Zettl, H., “Sight sound motion: applied media aesthetics,” Belmont, CA: Wadsworth Publishing, 1999. [Zhan95] Zhang, H.-J., Low, C.Y., Smoliar, S.W., and Wu, J.H., “Video parsing, retrieval and browsing: an integrated and content-based solution,” Proceedings of ACM Multimedia, pp. 15-24, 1995. [Zhan98] Zhang, T. and Kuo, C.-C. J., “Hierarchical system for content-based audio classification and retrieval,” Proceedings of SPIE Multimedia Storage Archive and System, vol. 3, no. 3572, pp. 398-409, 1998. [Zhan02] Zhang, D., and Chang, S.-F., “Event detection in baseball video using superimposed caption information,” Proceedings of ACM Multimedia Conference, 2002, pp. 315-318. [Zhon04] Zhong, D., and Chang, S.-F., “Real-time view recognition and event detection for sports video,” Journal of Visual Communication and Image Representation, vol. 15, 2004, pp. 330-347. [Zilc01] Zilca, R.D., “Text-independent speaker verification using covariance modeling,” IEEE Signal Processing Letters, vol. 8, no. 4, pp. 97-99, 2001. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/34311 | - |
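| dc.description.abstract | Pushing content analysis toward the semantic level has been a rapidly developing research topic in multimedia in recent years. The results of such analysis better match users' needs and make content management and applications more efficient. Unlike conventional content-based retrieval techniques, semantic analysis of digital content combines pattern recognition and machine learning techniques with specific production rules and domain knowledge to bridge the gap between low-level features and high-level semantics.
Based on machine learning and pattern recognition techniques, many systems have combined the results of different classifiers, different features, or different media modalities to perform semantic analysis. In this dissertation, we propose a general framework for this kind of research, in which we introduce mid-level information between audiovisual features and semantic concepts to assist the analysis. We developed three different systems that detect semantic concepts in movies, baseball videos, and general sports videos. In action movies, we detect semantic concepts such as gunfights and car chases through audio information. We adopt statistical methods to describe concepts and to map between semantics at different levels. In baseball games, based on visual and speech information, we combine rule-based and model-based methods for semantic concept detection. In total, thirteen different concepts, such as single, double, home run, and strikeout, can be detected, which enables us to develop many practical applications. In general sports videos, we propose using the ball trajectory to assist content analysis. Some new types of semantic concepts, such as a pitcher's pitch type in baseball games, can thereby be described and detected. These three lines of research are all based on the general framework we propose, and thus confirm the practicality of this framework for semantic concept detection. | zh_TW |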
| dc.description.abstract | Content analysis at the semantic level is an emerging trend in multimedia research. Such analysis matches users’ needs and makes content management and utilization more effective. Unlike conventional content-based retrieval or indexing, work on semantic analysis integrates statistical pattern recognition and machine learning techniques with specific production rules or domain knowledge to bridge the semantic gap between low-level features and high-level semantics.
Building on machine learning and pattern recognition technologies, many systems have been developed that combine analytical results from different classifiers, different features, or different modalities. In this dissertation, we propose a general framework that introduces a mid-level representation between audiovisual features and semantic concepts. Two types of techniques, statistical pattern recognition and rule-based decision, are combined to narrow the semantic gap. We develop three systems that conduct semantic concept detection in action movies, in broadcast baseball games, and in general sports videos, respectively. In action movies, we detect semantic concepts, such as gunplay and car-chasing scenes, by analyzing aural information. Statistical approaches are used to model concepts and to map between different semantic granularities. In baseball games, visual and speech information are combined, and a hybrid method that includes rule-based and statistical techniques is designed for semantic concept detection. Thirteen semantic concepts, such as single, double, home run, and strikeout, are explicitly detected, and several realistic applications can therefore be built. In general sports videos, we extract the ball trajectory as a new type of metadata for describing content characteristics. Some novel semantic concepts, such as pitch types in baseball games, can therefore be modeled and detected. These studies are instances of the proposed general framework and demonstrate that automatic semantic concept detection can be realized. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-13T06:02:32Z (GMT). No. of bitstreams: 1 ntu-95-D91922016-1.pdf: 2367078 bytes, checksum: 5f4f4f05bc7e31fed34ab9c9283e76dc (MD5) Previous issue date: 2006 | en |
| dc.description.tableofcontents | Acknowledgements i; Abstract iii; Chinese Abstract iv; List of Figures x; List of Tables xiii; Chapter 1 Introduction 1; 1.1 Motivation 1; 1.2 Related Works 2; 1.2.1 Categorize by Modality 2; 1.2.2 Categorize by Level of Analysis 3; 1.2.3 Categorize by Processing Methods 4; 1.2.4 Concerns from International Standards 4; 1.3 Semantic Concept Detection 6; 1.3.1 From Feature to Knowledge 6; 1.3.2 Pattern Recognition vs. Semantic Concept Detection 8; 1.4 Problem Statement 10; 1.5 Summary of Contributions 10; 1.5.1 Audio Semantic Concept Detection in Movies 10; 1.5.2 Explicit Baseball Concept Detection 11; 1.5.3 Trajectory-Based Analysis in Baseball Videos 11; 1.6 Dissertation Organization 12; Chapter 2 A Unified Framework for Multimedia Semantic Analysis 13; 2.1 Content Analysis and Concept Language 13; 2.2 Content Chain Framework 14; 2.2.1 Framework Overview 14; 2.2.2 Deterministic Mapping Function 16; 2.2.3 Nondeterministic Mapping Function 16; 2.2.4 Generality of the Content Chain Framework 16; 2.3 Framework Correspondence 18; 2.3.1 Semantic Concept Detection in Movies 18; 2.3.2 Semantic Concept Detection in Baseball Videos 19; 2.3.3 Trajectory-based Analysis in Sports Videos 20; 2.4 Summary 21; Chapter 3 Semantic Analysis in Movies through Audio Information 23; 3.1 Introduction 23; 3.2 Hierarchical Audio Models 24; 3.2.1 Audio Event and Semantic Concept 25; 3.2.2 Hierarchical Framework 26; 3.3 Audio Feature Extraction 27; 3.3.1 Short-Time Energy 27; 3.3.2 Band Energy Ratio 28; 3.3.3 Zero-Crossing Rate 28; 3.3.4 Frequency Centroid 29; 3.3.5 Bandwidth 29; 3.3.6 Mel-Frequency Cepstral Coefficients 29; 3.4 Audio Event Modeling 30; 3.4.1 Model Size Estimation 30; 3.4.2 Model Training 31; 3.4.3 Specific and World Distribution 32; 3.4.4 Pseudo-Semantic Features 33; 3.5 Generative Modeling for Semantic Concept 35; 3.5.1 Model Training 36; 3.5.2 Semantic Concept Detection 36; 3.6 Discriminative Modeling for Semantic Concept 36; 3.6.1 Model Training 37; 3.6.2 Semantic Concept Detection 38; 3.7 Performance Evaluation 38; 3.7.1 Evaluation of Audio Event Detection 39; 3.7.1.1 Overall Performance 40; 3.7.1.2 Performance Comparison 41; 3.7.2 Evaluation of Semantic Concept Detection 42; 3.7.3 Comparison with Baseline System 44; 3.7.4 Discussion 46; 3.7.5 Semantic Indexing Based on the Proposed Framework 46; 3.8 Summary 47; Chapter 4 Semantic Analysis and Game Abstraction in Baseball Videos 49; 4.1 Introduction 49; 4.2 System Framework 51; 4.2.1 Characteristics of Baseball Games 51; 4.2.2 Overview of System Framework 52; 4.3 Shot Classification 53; 4.3.1 Procedure of Shot Classification 53; 4.3.2 Adaptive Field Color Determination 54; 4.3.3 Infield/Outfield Classification 55; 4.3.4 Pitch Shot Detection 55; 4.4 Concept Detection 56; 4.4.1 Rule-based Concept Detection 56; 4.4.1.1 Caption Feature Extraction 57; 4.4.1.2 Feature Filtering 58; 4.4.1.3 Concept Identification 59; 4.4.2 Model-based Concept Detection 61; 4.4.2.1 Shot Context Features 62; 4.4.2.2 Modeling 63; 4.4.3 Combine Visual Cues with Speech Information 63; 4.4.3.1 Overview 63; 4.4.3.2 Information Fusion 65; 4.4.4 Results of Concept Detection 67; 4.5 Extended Applications 71; 4.5.1 Automatic Game Summarization 71; 4.5.1.1 Significance Degree of Concepts 72; 4.5.1.2 Selection of Summarization 72; 4.5.1.3 Evaluation of Summarization 74; 4.5.2 Automatic Highlight Generation 75; 4.5.2.1 Significance Degree of Concepts 75; 4.5.2.2 Highlight Selection Algorithm 77; 4.5.2.3 Evaluation of Highlight 78; 4.5.3 An Integrated Baseball System 80; 4.6 Discussion and Summary 82; Chapter 5 Semantic Analysis in Sports Videos through Ball Trajectory 85; 5.1 Introduction 85; 5.2 System Overview 86; 5.3 Ball Candidate Detection 87; 5.4 Trajectory Forming Process 89; 5.4.1 Trajectory Segments Generation 90; 5.4.2 Trajectory Candidates Generation 92; 5.4.3 Physical Model-Based Trajectory Validation 93; 5.4.3.1 Physical Model of Ball Trajectory 93; 5.4.3.2 Trajectory Validation via Physical Limitation 96; 5.5 Trajectory-based Analysis in Different Sports 97; 5.5.1 Pitch Type Recognition in Baseball Videos 97; 5.5.1.1 Pitch Type Recognition 98; 5.5.1.2 Evaluation of Trajectory Extraction 101; 5.5.1.3 Evaluation of Pitch Type Recognition 102; 5.5.2 Penalty Kick Analysis in Soccer Videos 103; 5.5.2.1 Soccer Trajectory Extraction 103; 5.5.2.2 Evaluation of Soccer Trajectory Extraction 105; 5.5.3 Tactics Analysis in Tennis Videos 105; 5.5.3.1 Tennis Trajectory Extraction 105; 5.5.3.2 Evaluation of Tennis Trajectory Extraction 106; 5.6 Discussion and Summary 107; Chapter 6 Future Research and Conclusions 109; 6.1 Discussions 109; 6.1.1 Content Adaptation Architecture 109; 6.1.2 Content Adaptation Modeling 110; 6.2 Future Research 112; 6.3 Conclusions 113; Appendix A Hidden Markov Model 115; A.1 Specification 115; A.2 Inside HMM 116; A.2.1 Solution to the Evaluate Problem: The Forward Algorithm 117; A.2.2 Solution to the Decoding Problem: The Viterbi Algorithm 118; A.2.3 Solution to the Learning Problem: Baum-Welch Algorithm 119; Appendix B Support Vector Machine 120; B.1 Introduction 120; B.2 Training and Testing 121; B.3 Multiclass SVM 122; Appendix C Computational Media Aesthetics 124; C.1 Film Grammar 124; C.2 Computational Media Aesthetics (CMA) 124; C.3 Examples of CMA Applications 126; C.3.1 Formulating Film Tempo [Dora02] 126; C.3.2 Horror Film Genre Typing and Scene Labeling via Audio Analysis [Monc03] 126; C.3.3 Pivot Vector Space Approach for Audio-Video Mixing [Mulh03] 126; C.4 Semantic Indexing vs. CMA 127; References 129; Curriculum Vitae 141 | |
| dc.language.iso | en | |
| dc.subject | 事件與概念偵測 | zh_TW |
| dc.subject | 語意分析 | zh_TW |
| dc.subject | 影片分析與組織 | zh_TW |
| dc.subject | 視訊檢索 | zh_TW |
| dc.subject | video indexing | en |
| dc.subject | semantic analysis | en |
| dc.subject | event and concept detection | en |
| dc.subject | video analysis and organization | en |
| dc.title | 具語意基礎之電影與運動影片內容分析及組織 | zh_TW |
| dc.title | Semantics-based Content Analysis and Organization in Movies and Sports Videos | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 94-2 | |
| dc.description.degree | Ph.D. | |
| dc.contributor.oralexamcommittee | 李素瑛(Suh-Yin Lee),杭學鳴(Hsueh-Ming Hang),陳銘憲(Ming-Syan Chen),陳良弼(Arbee L.P. Chen),鍾國亮(Kuo-Liang Chung),許聞廉(Wen-Lian Hsu) | |
| dc.subject.keyword | 語意分析,影片分析與組織,事件與概念偵測,視訊檢索, | zh_TW |
| dc.subject.keyword | semantic analysis,video analysis and organization,event and concept detection,video indexing, | en |
| dc.relation.page | 139 | |
| dc.rights.note | Paid license (有償授權) | |
| dc.date.accepted | 2006-06-21 | |
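| dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) | zh_TW |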
| Appears in Collections: | Department of Computer Science and Information Engineering (資訊工程學系) | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-95-1.pdf (not authorized for public access) | 2.31 MB | Adobe PDF | |
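Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.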
