Chatting Activity Recognition Using Factorial Conditional Random Fields with Iterative Classification

Chia-Chun Lian; 連家峻

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42094

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	許永真(Yung-Jen Hsu)
dc.contributor.author	Chia-Chun Lian	en
dc.contributor.author	連家峻	zh_TW
dc.date.accessioned	2021-06-15T00:46:20Z	-
dc.date.available	2008-09-02
dc.date.copyright	2008-09-02
dc.date.issued	2008
dc.date.submitted	2008-08-26
dc.identifier.citation	[1] L. Bao and S. S. Intille. Activity recognition from user-annotated acceleration data. In Proceedings of the Second International Conference on Pervasive Computing (Pervasive 2004), 2004. [2] S. Basu. Conversational Scene Analysis. PhD thesis, Massachusetts Institute of Technology (MIT), 2002. [3] M. Brand, N. Oliver, and A. Pentland. Coupled hidden Markov models for complex action recognition. In Proceedings of the 1997 IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 1997), 1997. [4] O. Brdiczka, J. Maisonnasse, and P. Reignier. Automatic detection of interaction groups. In Proceedings of the 7th International Conference on Multimodal Interfaces (ICMI 2005), 2005. [5] J. Chen, A. H. Kam, J. Zhang, N. Liu, and L. Shue. Bathroom activity monitoring based on sound. In Proceedings of the Third International Conference on Pervasive Computing (Pervasive 2005), 2005. [6] T. Choudhury and S. Basu. Modeling conversational dynamics as a mixed-memory Markov process. In Proceedings of the Advances in Neural Information Processing Systems 17 (NIPS 2004), 2004. [7] J. Gips and A. Pentland. Mapping human networks. In Proceedings of the 4th Annual IEEE International Conference on Pervasive Computing and Communications (PerCom 2006), 2006. [8] S. S. Intille, K. Larson, E. M. Tapia, J. S. Beaudin, P. Kaushik, J. Nawyn, and R. Rockinson. House n placelab data set ( http://architecture.mit.edu/house n/data/placelab/placelab.htm ), 2006. [9] S. S. Intille, K. Larson, E. M. Tapia, J. S. Beaudin, P. Kaushik, J. Nawyn, and R. Rockinson. Using a live-in laboratory for ubiquitous computing research. In Proceedings of the 4th International Conference on Pervasive Computing (Pervasive 2006), pages 349–365, 2006. [10] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML 2001), 2001. [11] L. Liao, D. Fox, and H. Kautz. Hierarchical conditional random fields for GPS-based activity recognition. In Springer Tracts in Advanced Robotics. Springer, 2007. [12] B. Limketkai, D. Fox, and L. Liao. CRF-filters: Discriminative particle filters for sequential state. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA 2007), 2007. [13] Y. Liu, J. Carbonell, P. Weigele, and V. Gopalakrishnan. Segmentation conditional random fields (SCRFs): A new approach for protein fold recognition. In Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2005), 2005. [14] B. Logan, J. Healey, M. Philipose, E. M. Tapia, and S. Intille. A long-term evaluation of sensing modalities for activity recognition. In Proceedings of the 9th International Conference on Ubiquitous Computing (UbiComp 2007), 2007. [15] J. Neville and D. Jensen. Iterative classification in relational data. In Proceedings of the AAAI 2000 Workshop Learning Statistical Models from Relational Data, 2000. [16] N. Oliver, E. Horvitz, and A. Garg. Layered representations for human activity recognition. In Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002. [17] K. Otsuka, J. Yamato, Y. Takemae, and H. Murase. Conversation scene analysis with dynamic Bayesian network based on visual head tracking. In Proceedings of the 2006 IEEE International Conference on Multimedia and Expo (ICME 2006), 2006. [18] L. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proceedings of the 1989 IEEE International Conference (IEEE 1989), 1989. [19] S. Sarawagi and W. W. Cohen. Semi-Markov conditional random fields for information extraction. In Proceedings of the Advances in Neural Information Processing Systems 17 (NIPS 2004), 2004. [20] K. Sato and Y. Sakakibara. RNA secondary structural alignment with conditional random fields. In Proceedings of the 4th European Conference on Computational Biology/Sixth Meeting of the Spanish Bioinformatics Network (ECCB/JBI 2005), 2005. [21] P. Sen and L. Getoor. Link-based classification. Technical Report CS-TR-4858, University of Maryland, February 2007. [22] F. Sha and F. Pereira. Shallow parsing with conditional random fields. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology (NAACL 2003), 2003. [23] M. Shimosaka, T. Mori, and T. Sato. Robust action recognition and segmentation with multi-task conditional random fields. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA 2007), 2007. [24] C. Sminchisescu, A. Kanaujia, and D. Metaxas. Conditional models for contextual human motion recognition. In Proceedings of the Computer Vision and Image Understanding 2006 (CVIU 2006), 2006. [25] C. Sutton and A. McCallum. Introduction to Statistical Relational Learning, chapter 4, pages 93–126. The MIT Press, 2007. [26] C. Sutton, A. McCallum, and K. Rohanimanesh. Dynamic conditional random fields: Factorized probabilistic models for labeling and segmenting sequence data. Journal of Machine Learning Research (JMLR), 8, 2007. [27] E. M. Tapia, S. S. Intille, and K. Larson. Activity recognition in the home using simple and ubiquitous sensors. In Proceedings of the Second International Conference on Pervasive Computing (Pervasive 2004), 2004. [28] S. V. N. Vishwanathan, N. N. Schraudolph, M. W. Schmidt, and K. Murphy. Accelerated training of conditional random fields with stochastic meta-descent. In Proceedings of the 23rd International Conference on Machine Learning (ICML 2006), June 2006. [29] C. Vogler and D. Metaxas. A framework for recognizing the simultaneous aspects of American Sign Language. In Proceedings of the Computer Vision and Image Understanding 2001 (CVIU 2001), 2001. [30] H. Wallach. Efficient training of conditional random fields. In Proceedings of the 6th Annual CLUK Research Colloquium (CLUK 2003), 2003. [31] T. Wu, C. Lian, and J. Y. Hsu. Joint recognition of multiple concurrent activities using factorial conditional random fields. In Proceedings of the AAAI 2007 Workshop Plan, Activity, and Intent Recognition (PAIR 2007), 2007. [32] D. Wyatt, T. Choudhury, J. Bilmes, and H. Kautz. A privacy-sensitive approach to modeling multi-person conversations. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI 2007), 2007. [33] J. S. Yedidia, W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. In Exploring Artificial Intelligence in the New Millennium, chapter 8, pages 239–269. Morgan Kaufmann, 2002. [34] J. S. Yedidia, W. T. Freeman, and Y. Weiss. Constructing free-energy approximations and generalized belief propagation algorithms. In Proceedings of the 2005 IEEE Transactions on Information Theory, 2005.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42094	-
dc.description.abstract	在社交場合進行交談行為辨識 (Chatting Activity Recognition) 對於社交網路 (Social Network) 的建立來說實在是不可或缺的一環，而且在各式各樣的社交行為當中，交談行為更是一種非常明顯的指標，但是要在一個公共場所進行交談行為辨識最大的困難點在於：有多人同時進行著多重的行為，這意味著有很多對話會在同一時間點同步進行，這將嚴重混淆多重交談行為的辨識。為了將這種同步交談行為的對話動態情形加以模型化，我提出使用「階乘式條件隨機場模型」(Factorial Conditional Random Fields) 來涵蓋多重行為狀態之間的同步關係 (Co-temporal Relationship)，並且同時減少模型的複雜度；除此之外，為了避免使用效率較低的「信念傳遞演算法」(Loopy Belief Propagation Algorithm)，我也提出使用「反覆分類演算法」(Iterative Classification Algorithm) 來進行「階乘式條件隨機場模型」的推論。我設計許多實驗來比較「階乘式條件隨機場模型」和其它動態機率模型 (Dynamic Probabilistic Models) 對於音訊資料在學習 (Learning) 與解密 (Decoding) 過程上的差異，其中包括了和「平行化條件隨機場」(Parallel Conditional Random Fields) 以及一些類「隱藏式馬可夫模型」(Hidden Markov Models) 的比較。在考慮多重同步行為的可能之下，實驗結果發現「階乘式條件隨機場模型」表現得比「平行化條件隨機場模型」以及其它類「隱藏式馬可夫模型」還要更好；我們也發現當「階乘式條件隨機場模型」搭配「反覆分類演算法」來一起使用，除了可以增加辨識的準確度之外，比起「信念傳遞演算法」來說，它還可以大幅降低學習與解密的時間。	zh_TW
dc.description.abstract	Recognizing chatting activities in social settings is an important building block for constructing a human social network. Among the various types of social interaction, chatting with others is a particularly salient indicator. However, the main challenge of chatting activity recognition in public settings is that multiple people are involved in multiple activities. That is, several conversations may take place concurrently, which makes the individual chatting activities hard to tell apart. To model the conversational dynamics of co-existing chatting behaviors, I advocate using Factorial Conditional Random Fields (FCRFs) to accommodate co-temporal relationships among multi-activity states while reducing model complexity. In addition, to avoid the inefficient Loopy Belief Propagation (LBP) algorithm, I propose using the Iterative Classification Algorithm (ICA) as the inference method for FCRFs. I designed several experiments to compare the FCRF model with other dynamic probabilistic models, such as Parallel Conditional Random Fields (PCRFs) and Hidden Markov Models (HMMs), in learning and decoding on auditory data. When multiple concurrent chatting activities are taken into account, the experimental results show that the FCRF model outperforms the PCRF model and the HMM-like models. The results also show that the FCRF model with ICA inference not only improves recognition accuracy but also takes significantly less time for learning and decoding than the same model with LBP inference.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T00:46:20Z (GMT). No. of bitstreams: 1 ntu-97-R95922047-1.pdf: 12398222 bytes, checksum: 9f9a7e51ea932a362e9a003b659612aa (MD5) Previous issue date: 2008	en
dc.description.tableofcontents	Acknowledgments iii; Abstract v; List of Figures xiii; List of Tables xv; Chapter 1 Introduction 1; 1.1 Motivation 1; 1.1.1 Human Social Network Building 1; 1.1.2 Automatic Conversation Detection 2; 1.1.3 Common Conversational Style 3; 1.2 Challenges 3; 1.2.1 Multi-Tasking Problems of Chatting Activity Recognition 4; 1.2.2 Insufficiency of Dynamic Bayesian Networks 4; 1.2.3 Inefficiency of Probabilistic Inference Method 5; 1.3 Problem Definition 5; 1.3.1 Assumption 6; 1.3.2 Input 6; 1.3.3 Output 7; 1.4 Proposed Solution 7; 1.5 Thesis Organization 9; Chapter 2 Background 11; 2.1 Related Work 11; 2.1.1 Activity Recognition Using Ubiquitous Sensors 11; 2.1.2 Activity Recognition Using Wearable Digital Sensors 12; 2.1.3 Conversation Detection Using Bottom-Up Approach 13; 2.1.4 Conversation Modeling Using Top-Down Approach 13; 2.1.5 Multiple Sequences Labeling 15; 2.2 Related Technology 16; 2.2.1 Dynamic Probabilistic Models 16; 2.2.2 Probabilistic Inference Algorithms 18; Chapter 3 Model Structures 25; 3.1 Notation Definition 25; 3.2 Parallel Conditional Random Fields 31; 3.2.1 Model Design 31; 3.2.2 Learning Process 35; 3.2.3 Inference Process 39; 3.2.4 Decoding Process 39; 3.3 Factorial Conditional Random Fields 40; 3.3.1 Model Design 40; 3.3.2 Learning Process 43; 3.3.3 Inference Process 45; 3.3.4 Decoding Process 47; Chapter 4 Auditory Feature Extraction 49; 4.1 Volume and Mutual Information 49; 4.2 Human Voice Detection 53; 4.3 Summary of Auditory Feature Values 54; Chapter 5 Experimental Design and Result 57; 5.1 Experimental Design 57; 5.1.1 Scenario and Data Collection 57; 5.1.2 Audio Recorder 58; 5.1.3 Annotation 60; 5.1.4 Model Training 60; 5.1.5 Performance Evaluation 65; 5.2 Experimental Results 68; 5.2.1 Meeting Activity 68; 5.2.2 Public Occasion 69; Chapter 6 Conclusions and Future Work 75; 6.1 Summary of Contributions 75; 6.2 Future Work 76; Bibliography 77
dc.language.iso	en
dc.title	使用階乘式條件隨機場與反覆分類法進行交談行為辨識	zh_TW
dc.title	Chatting Activity Recognition Using Factorial Conditional Random Fields with Iterative Classification	en
dc.type	Thesis
dc.date.schoolyear	96-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳文進(Wen-Chin Chen),歐昱言(Yu-Yen Ou),林守德(Shou-De Lin),陳穎平(Ying-Ping Chen)
dc.subject.keyword	交談行為辨識,動態機率模型,階乘式條件隨機場,信念傳遞演算法,反覆分類演算法	zh_TW
dc.subject.keyword	Chatting Activity Recognition, Dynamic Probabilistic Models, Factorial Conditional Random Fields, Loopy Belief Propagation Algorithm, Iterative Classification Algorithm	en
dc.relation.page	82
dc.rights.note	Paid license
dc.date.accepted	2008-08-26
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-97-1.pdf (not currently authorized for public access)	12.11 MB	Adobe PDF

Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.