NTU Theses and Dissertations Repository › 電機資訊學院 › 資訊工程學系
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38888
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 陳信希
dc.contributor.author: Yu-Jen Cheng [en]
dc.contributor.author: 鄭友仁 [zh_TW]
dc.date.accessioned: 2021-06-13T16:50:54Z
dc.date.available: 2005-07-04
dc.date.copyright: 2005-07-04
dc.date.issued: 2005
dc.date.submitted: 2005-06-22
dc.identifier.citation:
[1] G. Ahanger. Techniques for automatic digital video composition. PhD thesis, College of Engineering, Boston University.
[2] C. Becchetti and L. R. Ricotti. Speech recognition, John Wiley & Sons, 1999.
[3] D. Beeferman, A. Berger, and J. Lafferty. “Statistical models of text segmentation.” Machine Learning, vol. 34, no. 1-3, pp. 177-210, 1999.
[4] A. D. Bimbo. Visual information retrieval, Morgan Kaufmann, 1999.
[5] Y. L. Chang, W. Zeng, I. Kamel, and R. Alonso. “Integrated image and speech analysis for content-based video indexing.” Proceedings of IEEE International Conference on Multimedia Computing and Systems, pp. 306-313, Hiroshima, Japan, June, 1996.
[6] H. H. Chen and S. J. Yan. “Dealing with very long Chinese sentences in a robust parsing system.” Proceedings of National Science Council, Part A: Physical Science and Engineering, vol. 19, no. 5, pp. 398-407, 1995.
[7] M. G. Christel, M. A. Smith, C. R. Taylor, and D. B. Winkler. “Evolving video skims into useful multimedia abstractions.” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 171-178, Los Angeles, United States, 1998.
[8] J. M. Corridoni, A. Del Bimbo, D. Lucarella, and H. Wenxue. “Multi-perspective navigation of movies.” Journal of Visual Languages and Computing, vol. 7, no. 4, pp. 445-466, 1996.
[9] D. DeMenthon, V. Kobla, and D. Doermann. “Video summarization by curve simplification.” Proceedings of the 6th ACM international conference on Multimedia, pp. 211-218, Bristol, Great Britain, 1998.
[10] Y. Gong and X. Liu. “Video summarization using singular value decomposition.” Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 174-180, Hilton Head Island, United States, 2000.
[11] Y. Gong and X. Liu. “Video summarization with minimal visual content redundancies.” Proceedings of 2001 IEEE International Conference on Image Processing, vol. 3, pp. 362-365, Thessaloniki, Greece, October, 2001.
[12] A. Hanjalic, R. L. Lagendijk, and J. Biemond. “Automated high-level movie segmentation for advanced video-retrieval systems.” IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, no. 4, pp. 580-588, June, 1999.
[13] A. Hanjalic and H. J. Zhang. “An integrated scheme for automated video abstraction based on unsupervised cluster-validity analysis.” IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, no. 8, pp. 1280-1289, 1999.
[14] C. N. Li and S. A. Thompson. Mandarin Chinese: A Functional Reference Grammar, Berkeley: University of California Press, 1981.
[15] Y. Li, T. Zhang and D. Tretter. “An overview of video abstracting techniques.” HP Laboratory Technical Report HPL-2001-191, 2001.
[16] R. Lienhart, S. Pfeiffer, and W. Effelsberg. “Video abstracting.” Communications of the ACM, vol. 40, no. 11, pp. 55-62, 1997.
[17] S. Lu, I. King, and M. R. Lyu. “Video summarization by video structure analysis and graph optimization.” Proceedings of 2004 IEEE International Conference on Multimedia and Expo, vol. 3, pp. 1959-1962, Taipei, Taiwan, June, 2004.
[18] S. Lu, I. King, and M. R. Lyu. “A novel video summarization framework for document preparation and archival applications.” Proceedings of 2005 IEEE Aerospace Conference, pp. 1-10, Montana, March, 2005.
[19] Y. F. Ma, L. Lu, H. J. Zhang, and M. Li. “A user attention model for video summarization.” Proceedings of the 10th ACM International Conference on Multimedia, pp. 533-542, Juan-les-Pins, France, December, 2002.
[20] K. Nagao, S. Ohira, and M. Yoneoka. “Annotation-based multimedia summarization and translation.” Proceedings of the 19th International Conference on Computational Linguistics, vol. 2, pp. 702-708, Taipei, Taiwan, August 2002.
[21] J. Nam and A. H. Tewfik. “Dynamic video summarization and visualization.” Proceedings of the 7th ACM International Conference on Multimedia, pp. 53-56, Orlando, United States, 1999.
[22] C. W. Ngo, Y. F. Ma, and H. J. Zhang. “Automatic video summarization by graph modeling.” Proceedings of the 9th IEEE International Conference on Computer Vision, vol. 1, pp. 104-109, Nice, France, October, 2003.
[23] J. Y. Pan, H. Yang and C. Faloutsos, “MMSS: multi-modal story-oriented video summarization.” Proceedings of the 4th IEEE International Conference on Data Mining, pp. 491-494, Brighton, United Kingdom, November, 2004.
[24] L. Pevzner and M. Hearst. “A critique and improvement of an evaluation metric for text segmentation.” Computational Linguistics, vol. 28, no. 1, pp. 19-36, 2002.
[25] S. Pfeiffer, R. Lienhart, S. Fischer, and W. Effelsberg. “Abstracting digital movies automatically.” Journal of Visual Communication and Image Representation, vol. 7, no. 4, pp. 345-353, 1996.
[26] M. A. Smith and T. Kanade. “Video skimming and characterization through the combination of image and language understanding.” IEEE International Workshop on Content-based Access of Image and Video Databases, pp. 61-70, Bombay, India, January, 1998.
[27] H. Sundaram, L. Xie, and S. F. Chang. “A utility framework for the automatic generation of audio-visual skims.” Proceedings of the 10th ACM International Conference on Multimedia, pp. 189-198, Juan-les-Pins, France, December, 2002.
[28] D. Swanberg, C. F. Shu, and R. Jain. “Knowledge guided parsing in video databases.” Proceedings of SPIE Storage and Retrieval for Image and Video Databases, vol. 1908, pp. 13-24, 1993.
[29] R. Tibshirani, G. Walther, and T. Hastie. “Estimating the number of clusters in a data set via the gap statistic.” Journal of the Royal Statistical Society, vol. B, no. 63, pp. 411-423, 2001.
[30] S. Uchihashi, J. Foote, A. Girgensohn, and J. Boreczky. “Video manga: generating semantically meaningful video summaries.” Proceedings of the 7th ACM International Conference on Multimedia, pp. 383-392, Orlando, United States, October, 1999.
[31] I. Yahiaoui, B. Merialdo, and B. Huet. “Automatic video summarization.” 2002 International Tyrrhenian Workshop on Digital Communications, Palazzo dei Congressi, Italy, September, 2002.
[32] M. Yeung, B. L. Yeo, and B. Liu. “Extracting story units from long programs for video browsing and navigation.” Proceedings of IEEE International Conference on Multimedia Computing and Systems, pp. 296-305, Hiroshima, Japan, June, 1996.
[33] X. Zhu and X. Wu. “Sequential association mining for video summarization.” Proceedings of 2003 International Conference on Multimedia and Expo, vol. 3, pp. 333-336, July, 2003.
[34] X. Zhu, X. Wu, A. K. Elmagarmid, Z. Feng and L. Wu. “Video data mining: semantic indexing and event detection from the association perspective.” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 5, pp. 665-677, 2005.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38888
dc.description.abstract [zh_TW]: Automatic video summarization has drawn wide attention in recent years. Related research falls into two categories: keyframe-based (static) video summarization and dynamic video summarization. With the rapid growth of computing power and media storage capacity, dynamic video summaries can now be generated quickly. However, there has been no prior work on script-based video summarization, nor experiments on summarizing multiple videos. This thesis explores multi-video summarization driven by user information needs.
We use linguistic information and scene-change information to segment the videos. An information retrieval system then finds the video segments relevant to the user's needs. Sub-shot clustering results are used to measure how visually novel a segment is, and combining these two scores lets us select segments that are both informative and visually vivid.
To achieve visual smoothness, we also consider re-ordering the segments. We analyze the visual content of each segment and, based on the director's editing rhythm and several heuristics, propose an algorithm that achieves visual smoothness. Experiments show that, of the four algorithms we propose, this one best achieves visual smoothness without losing relevant information.
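The segmentation stage described in the abstract relies in part on shot boundary detection. As an illustration only, here is a minimal sketch of a common histogram-difference approach to detecting hard cuts; the function names, the 4-bin histograms, and the threshold value are assumptions for this sketch and are not taken from the thesis.

```python
def histogram_diff(h1, h2):
    """L1 distance between two normalized color histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_shot_boundaries(histograms, threshold=0.5):
    """Return indices of frames whose histogram differs from the
    previous frame's by more than the threshold (a hard cut)."""
    boundaries = []
    for i in range(1, len(histograms)):
        if histogram_diff(histograms[i - 1], histograms[i]) > threshold:
            boundaries.append(i)
    return boundaries

# Three synthetic 4-bin histograms: frames 0-1 belong to one shot,
# frame 2 is an abrupt change (a cut).
frames = [
    [0.70, 0.10, 0.10, 0.10],
    [0.65, 0.15, 0.10, 0.10],  # small change within the same shot
    [0.10, 0.10, 0.10, 0.70],  # large change -> boundary
]
print(detect_shot_boundaries(frames))  # [2]
```

Real systems would compute the histograms from decoded frames and often use adaptive thresholds to handle gradual transitions, which this sketch does not cover.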
dc.description.abstract [en]: Automatic video summarization methods have attracted research attention for a long time. Previous work can be classified into two categories: keyframe-based video summarization and dynamic video summarization. Recently, the rapid growth of computing power and storage capacity has made it possible to generate dynamic video summaries much faster. However, there has been no previous work on generating video summaries according to specific user information needs, nor experiments in a multi-video environment. In this thesis we explore the problem of script-based video summarization, in which the information needs are contained in a user script.
We first use linguistic information and shot boundary detection results to divide the videos into segments, which are the building blocks of the summary. An information retrieval system then retrieves relevant segments, using the user script as queries and the segments' captions as documents. After sub-shot clustering, a visual importance score is computed for each segment based on the clustering results of its constituent sub-shots. The relevance score and the visual importance score are combined to select segments that are both informative and vivid.
To achieve better coherence, segment re-ordering is applied. We analyze the audio and video content to find the editing rhythm and editing heuristics, and then develop an algorithm for visual coherence. Experiments show that this algorithm achieves better coherence than the other text-based algorithms, without loss of informativeness.
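The combination of the relevance score and the visual importance score described above can be sketched as follows. The linear weight `alpha`, the greedy selection policy, and the duration budget are assumptions made for illustration; the thesis's actual combination rule and selection procedure may differ.

```python
def select_segments(segments, alpha=0.6, budget=60.0):
    """Rank segments by a weighted sum of relevance ('rel') and
    visual importance ('vis'), then greedily add the highest-scoring
    ones until the summary duration budget ('dur' seconds) is spent."""
    scored = sorted(
        segments,
        key=lambda s: alpha * s["rel"] + (1 - alpha) * s["vis"],
        reverse=True,
    )
    summary, used = [], 0.0
    for seg in scored:
        if used + seg["dur"] <= budget:
            summary.append(seg["id"])
            used += seg["dur"]
    return summary

# Hypothetical segments with relevance/visual scores and durations.
segs = [
    {"id": "A", "rel": 0.9, "vis": 0.2, "dur": 30.0},
    {"id": "B", "rel": 0.4, "vis": 0.9, "dur": 25.0},
    {"id": "C", "rel": 0.8, "vis": 0.8, "dur": 40.0},
]
print(select_segments(segs, budget=70.0))  # ['C', 'A']
```

Greedy selection under a time budget is only one reasonable policy; an exact knapsack formulation would trade optimality for more computation.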
dc.description.provenance [en]: Made available in DSpace on 2021-06-13T16:50:54Z (GMT). No. of bitstreams: 1
ntu-94-R92922101-1.pdf: 642722 bytes, checksum: 7ce63525ae10d4ea572c3a2f91ab7a82 (MD5)
Previous issue date: 2005
dc.description.tableofcontents:
Table of Contents
Chapter 1 Introduction 1
1.1. Motivation 1
1.2. Related works 1
1.3. System architecture 4
1.4. Organization of the thesis 6
Chapter 2 Video Segmentation 7
2.1. Introduction 7
2.2. Video preprocessing 8
2.3. Shot boundary detection 8
2.4. Linking elements in Mandarin Chinese 9
2.5. The combining algorithm 12
2.6. Evaluation 14
2.6.1. Golden standard 14
2.6.2. Evaluation metric 14
2.6.3. Experiment results 16
2.6.4. Discussion 19
2.7. Conclusion 21
Chapter 3 Segment Selection 23
3.1. Text-relevant summaries 23
3.2. Visual-rich summaries 24
3.2.1. Introduction 24
3.2.2. Frame sampling 24
3.2.3. Sub-shot definition 24
3.2.4. Keyframe selection 26
3.2.5. Sub-shot clustering 26
3.2.6. Clustering result analysis 27
3.2.7. Visual importance score 30
3.3. Conclusion 31
Chapter 4 Video Summary Coherence 33
4.1. Introduction 33
4.2. Text-coherent summarization 34
4.3. Visual-coherent summarization 35
4.3.1. Visual dynamics modeling 35
4.3.2. Audio dynamics modeling 35
4.3.3. Editing rhythm mining 35
4.3.4. Algorithm 37
4.4. Experiment setup 38
4.5. Evaluation 40
4.6. Discussion 43
Chapter 5 Conclusion and Future Works 45
5.1. Conclusion 45
5.2. Future works 46
dc.language.iso: en
dc.subject: 影像摘要 [zh_TW]
dc.subject: Video [en]
dc.subject: Summarization [en]
dc.title: 劇本導向影像摘要方法之研究 [zh_TW]
dc.title: Script-based Multi-video Summarization [en]
dc.type: Thesis
dc.date.schoolyear: 93-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 簡立峰, 柯淑津, 陳光華
dc.subject.keyword: 影像摘要 [zh_TW]
dc.subject.keyword: Video, Summarization [en]
dc.relation.page: 50
dc.rights.note: 有償授權 (authorized with fee)
dc.date.accepted: 2005-06-23
dc.contributor.author-college: 電機資訊學院 [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 [zh_TW]
Appears in collections: 資訊工程學系

Files in this item:
File: ntu-94-1.pdf (restricted; not authorized for public access)
Size: 627.66 kB
Format: Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
