Comic-styled Photo Album Layout Using Simulated Annealing

Li-Jung Chiu; 邱立榕

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32721

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳炳宇(Bing-Yu Chen)
dc.contributor.author	Li-Jung Chiu	en
dc.contributor.author	邱立榕	zh_TW
dc.date.accessioned	2021-06-13T04:14:09Z	-
dc.date.available	2008-07-27
dc.date.copyright	2006-07-27
dc.date.issued	2006
dc.date.submitted	2006-07-24
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32721	-
dc.description.abstract	本研究的重點是呈現一套系統, 使用的演算法可以產生具有漫畫風格的照片排版 (comic-styled photo layout), 其中包括將多張照片放在同一頁 (layouting) 、將照片依需要適時地裁切 (crop)、對話泡泡 (speech bubbles) 的放置。本系統所實作出的編輯層 (authoring layer) 可以讓使用者很簡便地自行輸入註解對話 (annotations), 排版層 (layout generation layer) 可以將照片適時擺放在不同大小的輸出版面。所提出的演算法使用了幾項技術, 包括人臉偵測 (face detection)、重點區域偵測 (region of interest detection)、對話泡泡放置區域之偵測 (speech bubble placement detection) 及退火演算法 (simulated annealing)。我們也定義了不同排版結果間的距離函式及用來做排版最佳化的目標函式。因此, 我們可以對使用者想要排版的照片們做最佳化, 包括整體性(integrity), 強調性(emphasis), 及一致性(unity)。在我們的實驗中, 在兩種不同的紙張大小裡使用三組照片資料作實驗, 裡面包括中文及英文的註解對話。另一方面, 我們也拿現在市場上能使用的其他六種相簿解決方案來做比較, 評估的量尺有使用方便度 (ease of use)、照片故事清楚度 (clarity of the photo story) 及排版結果有趣度 (interesting-ness)。	zh_TW
dc.description.abstract	This research presents a system and an algorithm for producing comic-styled photo layouts, which involves laying out the photos on a page, cropping the photos, and placing speech bubbles on the photos. The system comprises an authoring layer that lets the user key in annotations easily and a layout generation layer that automatically lays out the photos on pages of varying sizes. The algorithm employs several techniques: face detection, region of interest (ROI) detection, speech bubble placement detection, and simulated annealing. The research defines a distance between layouts and an objective function that optimizes the integrity, emphasis, and unity of the set of photos provided by the user. Three sets of data, with annotations in two languages (Chinese and English) and on two different paper sizes, served as the test bed of this research. Six other commercially available solutions were benchmarked against the presented system on ease of use, clarity of the photo story, and interesting-ness of the layout (how interesting the layout is).	en
dc.description.provenance	Made available in DSpace on 2021-06-13T04:14:09Z (GMT). No. of bitstreams: 1 ntu-95-R93725050-1.pdf: 67483060 bytes, checksum: 0cf92e713f5292c67f96f916ef8b701b (MD5) Previous issue date: 2006	en
dc.description.tableofcontents	1.0 Research Description 1; 1.1 Overview of Current State of Technology 1; 1.2 Research Objectives 2; 1.2.1 General Objective 2; 1.2.2 Specific Objectives 2; 1.2 Scope and Limitations of the Research 3; 1.3 Significance of the Research 3; 2.0 Review of Related Literature 4; 2.1 Photo Information Acquisition 4; 2.1.1 Embedded Information Extraction 4; 2.1.2 Assistive Tool for Photo Story Sharing 6; 2.1.2.1 Face-to-face 6; 2.1.2.2 Human-computer-human 11; 2.1.2.2.1 Sharing of Photo Story 11; 2.1.2.2.2 Photo Story Authoring 12; 2.2 Photo Layout Optimization 19; 2.2.1 Genetic Algorithm 19; 2.2.2 Simulated Annealing 22; 3.0 System Overview 24; 3.1 Authoring Layer 24; 3.2 Presentation Layer 25; 3.2.1 Bubble Placement 25; 3.2.2 Automatic Cropping 25; 3.2.3 Page Layout 25; 4.0 Framework and Algorithm 27; 4.1 Authoring 27; 4.2 Cropping Area Detection 28; 4.2.1 ROI Detection 28; 4.2.1.1 Polar Transformation of Features 28; 4.2.1.2 Obtaining Subspaces 28; 4.2.1.3 Obtaining the Attention Score of the Subspaces 29; 4.2.2 Bubble Placement 29; 4.3 Simulated Annealing Layout Generator 31; 4.3.1 Solution Representation 31; 4.3.2 Input 32; 4.3.3 Simulated Annealing Setup 33; 4.3.3.1 Neighborhood 33; 4.3.3.2 Objective Function 35; 4.3.3.2.1 Integrity 35; 4.3.3.2.2 Emphasis 36; 4.3.3.2.3 Unity 37; 4.3.3.2 Annealing Schedule 37; 4.3.3.2.1 Initial Temperature 37; 4.3.3.2.2 Final Temperature 38; 4.3.3.2.3 Freezing Function 38; 4.3.3.2.4 Length of Markov Chains 38; 5.0 Results 39; 6.0 Evaluation 44; 6.1 Method 44; 6.2 Results 45; 6.3 Analysis 46; 7.0 Conclusion 48; Appendix A: Bibliography 49; Appendix B: Resource Persons 52; Appendix C: Curriculum Vitae 53; Appendix D: Evaluation Form 55; Appendix E: Participant’s Demographics 56; Appendix F: Result Set Two (640 x 960) 58; Appendix G: Result Set Three (640 x 960) 62; Appendix H: Result Set One in Smaller Page Size (320 x 480) 71; Appendix I: Result Set Two in Smaller Page Size (320 x 480) 73; Appendix J: Result Set Three in Smaller Page Size (320 x 480) 75
dc.language.iso	en
dc.subject	看圖說故事	zh_TW
dc.subject	具有漫畫風格的照片排版	zh_TW
dc.subject	退火演算法	zh_TW
dc.subject	重點區域偵測	zh_TW
dc.subject	裁切照片	zh_TW
dc.subject	對話泡泡放置	zh_TW
dc.subject	comic-styled photo layout	en
dc.subject	speech bubble placement	en
dc.subject	image crop area detection	en
dc.subject	region of interest (ROI) detection	en
dc.subject	Simulated Annealing	en
dc.subject	photo story	en
dc.title	基於模擬退火法之漫畫風格相簿排版	zh_TW
dc.title	Comic-styled Photo Album Layout Using Simulated Annealing	en
dc.type	Thesis
dc.date.schoolyear	94-2
dc.description.degree	Master's
dc.contributor.coadvisor	莊永裕(Yung-Yu Chuang)
dc.contributor.oralexamcommittee	梁容輝(Rung-Huei Liang)
dc.subject.keyword	具有漫畫風格的照片排版,看圖說故事,退火演算法,重點區域偵測,裁切照片,對話泡泡放置	zh_TW
dc.subject.keyword	comic-styled photo layout,photo story,Simulated Annealing,region of interest (ROI) detection,image crop area detection,speech bubble placement	en
dc.relation.page	78
dc.rights.note	Paid authorization
dc.date.accepted	2006-07-25
dc.contributor.author-college	管理學院	zh_TW
dc.contributor.author-dept	資訊管理學研究所	zh_TW
Appears in Collections:	Department of Information Management

Files in This Item:

File	Size	Format
ntu-95-1.pdf (not authorized for public access)	65.9 MB	Adobe PDF

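Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.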