Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42094
Full metadata record
DC Field, Value, Language
dc.contributor.advisor: 許永真 (Yung-Jen Hsu)
dc.contributor.author: Chia-Chun Lian [en]
dc.contributor.author: 連家峻 [zh_TW]
dc.date.accessioned: 2021-06-15T00:46:20Z
dc.date.available: 2008-09-02
dc.date.copyright: 2008-09-02
dc.date.issued: 2008
dc.date.submitted: 2008-08-26
dc.identifier.citation:
[1] L. Bao and S. S. Intille. Activity recognition from user-annotated acceleration data. In Proceedings of the Second International Conference on Pervasive Computing (Pervasive 2004), 2004.
[2] S. Basu. Conversational Scene Analysis. PhD thesis, Massachusetts Institute of Technology (MIT), 2002.
[3] M. Brand, N. Oliver, and A. Pentland. Coupled hidden markov models for complex action recognition. In Proceedings of the 1997 IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 1997), 1997.
[4] O. Brdiczka, J. Maisonnasse, and P. Reignier. Automatic detection of interaction groups. In Proceedings of the 7th International Conference on Multimodal Interfaces (ICMI 2005), 2005.
[5] J. Chen, A. H. Kam, J. Zhang, N. Liu, and L. Shue. Bathroom activity monitoring based on sound. In Proceedings of the Third International Conference on Pervasive Computing (Pervasive 2005), 2005.
[6] T. Choudhury and S. Basu. Modeling conversational dynamics as a mixed-memory markov process. In Proceedings of the Advances in Neural Information Processing Systems 17 (NIPS 2004), 2004.
[7] J. Gips and A. Pentland. Mapping human networks. In Proceedings of the 4th Annual IEEE International Conference on Pervasive Computing and Communications (PerCom 2006), 2006.
[8] S. S. Intille, K. Larson, E. M. Tapia, J. S. Beaudin, P. Kaushik, J. Nawyn, and R. Rockinson. House n placelab data set ( http://architecture.mit.edu/house n/data/placelab/placelab.htm ), 2006.
[9] S. S. Intille, K. Larson, E. M. Tapia, J. S. Beaudin, P. Kaushik, J. Nawyn, and R. Rockinson. Using a live-in laboratory for ubiquitous computing research. In Proceedings of the 4th International Conference on Pervasive Computing (Pervasive 2006), pages 349–365, 2006.
[10] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML 2001), 2001.
[11] L. Liao, D. Fox, and H. Kautz. Hierarchical conditional random fields for gps-based activity recognition. In Springer Tracts in Advanced Robotics. Springer, 2007.
[12] B. Limketkai, D. Fox, and L. Liao. Crf-filters: Discriminative particle filters for sequential state. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA 2007), 2007.
[13] Y. Liu, J. Carbonell, P. Weigele, and V. Gopalakrishnan. Segmentation conditional random fields (scrfs): A new approach for protein fold recognition. In Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2005), 2005.
[14] B. Logan, J. Healey, M. Philipose, E. M. Tapia, and S. Intille. A long-term evaluation of sensing modalities for activity recognition. In Proceedings of the 9th International Conference on Ubiquitous Computing (UbiComp 2007), 2007.
[15] J. Neville and D. Jensen. Iterative classification in relational data. In Proceedings of the AAAI 2000 Workshop Learning Statistical Models from Relational Data, 2000.
[16] N. Oliver, E. Horvitz, and A. Garg. Layered representations for human activity recognition. In Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002.
[17] K. Otsuka, J. Yamato, Y. Takemae, and H. Murase. Conversation scene analysis with dynamic bayesian network based on visual head tracking. In Proceedings of the 2006 IEEE International Conference on Multimedia and Exposition (ICME 2006), 2006.
[18] L. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 1989.
[19] S. Sarawagi and W. W. Cohen. Semi-markov conditional random fields for information extraction. In Proceedings of the Advances in Neural Information Processing Systems 17 (NIPS 2004), 2004.
[20] K. Sato and Y. Sakakibara. Rna secondary structural alignment with conditional random fields. In Proceedings of the 4th European Conference on Computational Biology/Sixth Meeting of the Spanish Bioinformatics Network (ECCB/JBI 2005), 2005.
[21] P. Sen and L. Getoor. Link-based classification. Technical Report CS-TR-4858, University of Maryland, February 2007.
[22] F. Sha and F. Pereira. Shallow parsing with conditional random fields. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology (NAACL 2003), 2003.
[23] M. Shimosaka, T. Mori, and T. Sato. Robust action recognition and segmentation with multi-task conditional random fields. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA 2007), 2007.
[24] C. Sminchisescu, A. Kanaujia, and D. Metaxas. Conditional models for contextual human motion recognition. In Proceedings of the Computer Vision and Image Understanding 2006 (CVIU 2006), 2006.
[25] C. Sutton and A. McCallum. Introduction to Statistical Relational Learning, chapter 4, pages 93–126. The MIT Press, 2007.
[26] C. Sutton, A. McCallum, and K. Rohanimanesh. Dynamic conditional random fields: Factorized probabilistic models for labeling and segmenting sequence data. Journal of Machine Learning Research (JMLR), 8, 2007.
[27] E. M. Tapia, S. S. Intille, and K. Larson. Activity recognition in the home using simple and ubiquitous sensors. In Proceedings of the Second International Conference on Pervasive Computing (Pervasive 2004), 2004.
[28] S. V. N. Vishwanathan, N. N. Schraudolph, M. W. Schmidt, and K. Murphy. Accelerated training of conditional random fields with stochastic meta-descent. In Proceedings of the 23rd International Conference on Machine Learning (ICML 2006), June 2006.
[29] C. Vogler and D. Metaxas. A framework for recognizing the simultaneous aspects of american sign language. In Proceedings of the Computer Vision and Image Understanding 2001 (CVIU 2001), 2001.
[30] H. Wallach. Efficient training of conditional random fields. In Proceedings of the 6th Annual CLUK Research Colloquium (CLUK 2003), 2003.
[31] T. Wu, C. Lian, and J. Y. Hsu. Joint recognition of multiple concurrent activities using factorial conditional random fields. In Proceedings of the AAAI 2007 Workshop Plan, Activity, and Intent Recognition (PAIR 2007), 2007.
[32] D. Wyatt, T. Choudhury, J. Bilmes, and H. Kautz. A privacy-sensitive approach to modeling multi-person conversations. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI 2007), 2007.
[33] J. S. Yedidia, W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. In Exploring Artificial Intelligence in the New Millennium, chapter 8, pages 239–269. Morgan Kaufmann, 2002.
[34] J. S. Yedidia, W. T. Freeman, and Y. Weiss. Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory, 2005.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42094
dc.description.abstract: 在社交場合進行交談行為辨識 (Chatting Activity Recognition) 對於社交網路 (Social Network) 的建立來說實在是不可或缺的一環,而且在各式各樣的社交行為當中,交談行為更是一種非常明顯的指標,但是要在一個公共場所進行交談行為辨識最大的困難點在於:有多人同時進行著多重的行為,這意味著有很多對話會在同一時間點同步進行,這將嚴重混淆多重交談行為的辨識。
為了將這種同步交談行為的對話動態情形加以模型化,我提出使用「階乘式條件隨機場模型」(Factorial Conditional Random Fields) 來涵蓋多重行為狀態之間的同步關係 (Co-temporal Relationship),並且同時減少模型的複雜度;除此之外,為了避免使用效率較低的「信念傳遞演算法」(Loopy Belief Propagation Algorithm),我也提出使用「反覆分類演算法」(Iterative Classification Algorithm) 來進行「階乘式條件隨機場模型」的推論。
我設計許多實驗來比較「階乘式條件隨機場模型」和其它動態機率模型 (Dynamic Probabilistic Models) 對於音訊資料在學習 (Learning) 與解密 (Decoding) 過程上的差異,其中包括了和「平行化條件隨機場」(Parallel Conditional Random Fields) 以及一些類「隱藏式馬可夫模型」(Hidden Markov Models) 的比較。
在考慮多重同步行為的可能之下,實驗結果發現「階乘式條件隨機場模型」表現得比「平行化條件隨機場模型」以及其它類「隱藏式馬可夫模型」還要更好;我們也發現當「階乘式條件隨機場模型」搭配「反覆分類演算法」來一起使用,除了可以增加辨識的準確度之外,比起「信念傳遞演算法」來說,它還可以大幅降低學習與解密的時間。
[zh_TW]
dc.description.abstract: Recognizing chatting activities in social settings is an important building block for constructing a human social network. Among the various types of social interaction, chatting with others is a significant indicator. The main challenge of chatting activity recognition in public settings is that multiple people are involved in multiple activities: several conversations may take place concurrently, which makes the individual chatting activities hard to tell apart. To model the conversational dynamics of co-existing chatting behaviors, I advocate using Factorial Conditional Random Fields (FCRFs) to accommodate co-temporal relationships among multi-activity states and to reduce model complexity. In addition, to avoid the inefficient Loopy Belief Propagation (LBP) algorithm, I propose using the Iterative Classification Algorithm (ICA) as the inference method for FCRFs. I designed several experiments to compare the FCRFs model with other dynamic probabilistic models, such as Parallel Conditional Random Fields (PCRFs) and Hidden Markov Models (HMMs), in learning and decoding on auditory data. When multiple concurrent chatting activities are considered, the experimental results show that the FCRFs model outperforms the PCRFs model and other HMM-like models. The results also show that the FCRFs model with ICA inference not only improves recognition accuracy but also takes significantly less time for learning and decoding than with LBP inference. [en]
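The idea of iterative classification over co-temporal label chains, as described in the abstract, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `iterative_classification`, the weights `w_temporal` and `w_cotemporal`, and the hard sign threshold (used here in place of learned CRF potentials) are all assumptions made for illustration. Each node is first labeled from its own evidence, then repeatedly relabeled using the current labels of its temporal neighbors and of the other chains at the same time step, until no label changes.

```python
from typing import List


def iterative_classification(
    local_scores: List[List[float]],
    w_temporal: float = 1.0,    # hypothetical weight for same-chain neighbors
    w_cotemporal: float = 0.5,  # hypothetical weight for other chains at time t
    max_iters: int = 20,
) -> List[List[int]]:
    """Label each (chain, time) node as 1 (chatting) or 0 (not chatting).

    local_scores[c][t] > 0 means the node's own evidence favors label 1.
    """
    n_chains = len(local_scores)
    n_steps = len(local_scores[0])

    # Bootstrap step: classify every node from its local evidence only.
    labels = [[1 if s > 0 else 0 for s in chain] for chain in local_scores]

    def vote(label: int) -> float:
        # A neighbor labeled 1 votes +1, a neighbor labeled 0 votes -1.
        return 1.0 if label == 1 else -1.0

    for _ in range(max_iters):
        changed = False
        for c in range(n_chains):
            for t in range(n_steps):
                score = local_scores[c][t]
                # Temporal neighbors within the same chain.
                if t > 0:
                    score += w_temporal * vote(labels[c][t - 1])
                if t < n_steps - 1:
                    score += w_temporal * vote(labels[c][t + 1])
                # Co-temporal neighbors: other chains at the same time step.
                for other in range(n_chains):
                    if other != c:
                        score += w_cotemporal * vote(labels[other][t])
                new_label = 1 if score > 0 else 0
                if new_label != labels[c][t]:
                    labels[c][t] = new_label
                    changed = True
        if not changed:
            break  # Converged: a full pass changed no label.
    return labels
```

In this sketch a weakly negative node surrounded by confident positive neighbors is flipped to 1 on the first pass, which is the kind of smoothing that neighbor labels provide; a real model would replace the fixed weights with parameters learned from annotated data.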
dc.description.provenance: Made available in DSpace on 2021-06-15T00:46:20Z (GMT). No. of bitstreams: 1
ntu-97-R95922047-1.pdf: 12398222 bytes, checksum: 9f9a7e51ea932a362e9a003b659612aa (MD5)
Previous issue date: 2008
[en]
dc.description.tableofcontents:
Acknowledgments iii
Abstract v
List of Figures xiii
List of Tables xv
Chapter 1 Introduction 1
1.1 Motivation 1
1.1.1 Human Social Network Building 1
1.1.2 Automatic Conversation Detection 2
1.1.3 Common Conversational Style 3
1.2 Challenges 3
1.2.1 Multi-Tasking Problems of Chatting Activity Recognition 4
1.2.2 Insufficiency of Dynamic Bayesian Networks 4
1.2.3 Inefficiency of Probabilistic Inference Method 5
1.3 Problem Definition 5
1.3.1 Assumption 6
1.3.2 Input 6
1.3.3 Output 7
1.4 Proposed Solution 7
1.5 Thesis Organization 9
Chapter 2 Background 11
2.1 Related Work 11
2.1.1 Activity Recognition Using Ubiquitous Sensors 11
2.1.2 Activity Recognition Using Wearable Digital Sensors 12
2.1.3 Conversation Detection Using Bottom-Up Approach 13
2.1.4 Conversation Modeling Using Top-Down Approach 13
2.1.5 Multiple Sequences Labeling 15
2.2 Related Technology 16
2.2.1 Dynamic Probabilistic Models 16
2.2.2 Probabilistic Inference Algorithms 18
Chapter 3 Model Structures 25
3.1 Notation Definition 25
3.2 Parallel Conditional Random Fields 31
3.2.1 Model Design 31
3.2.2 Learning Process 35
3.2.3 Inference Process 39
3.2.4 Decoding Process 39
3.3 Factorial Conditional Random Fields 40
3.3.1 Model Design 40
3.3.2 Learning Process 43
3.3.3 Inference Process 45
3.3.4 Decoding Process 47
Chapter 4 Auditory Feature Extraction 49
4.1 Volume and Mutual Information 49
4.2 Human Voice Detection 53
4.3 Summary of Auditory Feature Values 54
Chapter 5 Experimental Design and Result 57
5.1 Experimental Design 57
5.1.1 Scenario and Data Collection 57
5.1.2 Audio Recorder 58
5.1.3 Annotation 60
5.1.4 Model Training 60
5.1.5 Performance Evaluation 65
5.2 Experimental Results 68
5.2.1 Meeting Activity 68
5.2.2 Public Occasion 69
Chapter 6 Conclusions and Future Work 75
6.1 Summary of Contributions 75
6.2 Future Work 76
Bibliography 77
dc.language.iso: en
dc.title: 使用階乘式條件隨機場與反覆分類法進行交談行為辨識 [zh_TW]
dc.title: Chatting Activity Recognition Using Factorial Conditional Random Fields with Iterative Classification [en]
dc.type: Thesis
dc.date.schoolyear: 96-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 陳文進 (Wen-Chin Chen), 歐昱言 (Yu-Yen Ou), 林守德 (Shou-De Lin), 陳穎平 (Ying-Ping Chen)
dc.subject.keyword: 交談行為辨識, 動態機率模型, 階乘式條件隨機場, 信念傳遞演算法, 反覆分類演算法 [zh_TW]
dc.subject.keyword: Chatting Activity Recognition, Dynamic Probabilistic Models, Factorial Conditional Random Fields, Loopy Belief Propagation Algorithm, Iterative Classification Algorithm [en]
dc.relation.page: 82
dc.rights.note: 有償授權 (paid license)
dc.date.accepted: 2008-08-26
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
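Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
ntu-97-1.pdf: 12.11 MB, Adobe PDF (not currently authorized for public access)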