Automatic Facial Image Annotation and Retrieval by Integrating Voice Label and Visual Appearance

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52671

Title:	利用語音進行照片中人物影像的自動化標註及檢索 Automatic Facial Image Annotation and Retrieval by Integrating Voice Label and Visual Appearance
Authors:	Hong-Wun Jheng 鄭宏文
Advisor:	徐宏民
Keyword:	Photo Annotation, Speech Retrieval
Publication Year:	2015
Degree:	Master's
Abstract:	Annotation is important for managing and retrieving large photo collections, but it is generally labor-intensive and time-consuming. Speaking while taking photos, by contrast, is straightforward and nearly effortless, and annotating by voice is faster than typing. To reduce the manual cost of annotating photos, we propose a novel framework that uses the sparse spoken annotations recorded at capture time as voice labels and automatically labels every facial image in the photo collection. To accomplish this, we employ a probabilistic graphical model that integrates voice labels and visual appearance for inference. Combined with group prior estimation and gender attribute association, the framework achieves strong performance on the proposed synthesized group photo collections.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52671
Fulltext Rights:	Fee-based license
Appears in Collections:	Department of Computer Science and Information Engineering

Files in This Item:

File	Size	Format
ntu-104-1.pdf Restricted Access	1.57 MB	Adobe PDF
