Computer Vision Techniques for Effective Pedestrian Detection (電腦視覺技術於行人偵測之研究)

Yu-Ting Chen; 陳昱廷

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41378

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	洪一平(Yi-Ping Hung)
dc.contributor.author	Yu-Ting Chen	en
dc.contributor.author	陳昱廷	zh_TW
dc.date.accessioned	2021-06-15T00:17:34Z	-
dc.date.available	2011-05-18
dc.date.copyright	2009-05-18
dc.date.issued	2009
dc.date.submitted	2009-05-13
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41378	-
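dc.description.abstract	In research on video surveillance, foreground detection and pedestrian detection are fundamental and important topics. In this dissertation, we study background model learning, holistic pedestrian detection, and part-based pedestrian detection. For background model learning, most methods are pixel-based; they provide fine-grained foreground detection results but cannot describe dynamic backgrounds well. In recent years, some studies have used block-based methods to handle dynamic backgrounds, but these cannot provide fine-grained foreground detection results. We therefore propose a hierarchical framework that integrates pixel-based and block-based methods; it overcomes the effects of dynamic backgrounds and also provides two-stage, multi-scale foreground detection results. For holistic pedestrian detection, we use heterogeneous features to describe pedestrian images and propose cascaded feed-forward classifiers to learn a holistic pedestrian detector. In our structure, meta-stages allow information from different stages to be used, which further improves detection accuracy and efficiency. When a person is occluded, a holistic pedestrian detector may fail to detect them. To address this problem, we propose a multi-class multi-instance boosting method to learn part detectors. Multi-instance learning resolves the alignment problem of training images, and feature sharing lets our method learn an efficient part detector. In addition, we propose a probabilistic model to combine the part detection results. Extensive experiments show that the method detects people effectively. Finally, we integrate the background modeling and pedestrian detection techniques proposed in this dissertation for video surveillance.	zh_TW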
dc.description.abstract	Three important research topics in visual surveillance are studied: background modeling, holistic pedestrian detection, and part-based pedestrian detection. Most previous background modeling approaches are pixel-based, while some more recent approaches use block-based representations, which are more robust to non-stationary backgrounds. We propose a method that integrates block- and pixel-based approaches into a single framework. Quantitative results show that the proposed method achieves better classification results than existing single-level approaches. In addition, we develop a holistic method for detecting pedestrians in images. In our approach, heterogeneous features are employed for weak-learner selection, and we propose a novel cascaded structure that exploits both the stage-wise classification information and the inter-stage cross-reference information. Experimental results show that our approach detects pedestrians both efficiently and accurately. We also propose a multi-class multi-instance boosting method for effective part-based pedestrian detection in images. Training examples are represented as sets of non-aligned instances, so the alignment problem caused by human appearance variation can be handled. Our method supports feature sharing in a cascaded structure for efficient detection. Experimental results demonstrate the superior performance of the proposed method. We also combine the background modeling and pedestrian detection techniques for visual surveillance applications.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T00:17:34Z (GMT). No. of bitstreams: 1 ntu-98-D92922013-1.pdf: 13017646 bytes, checksum: 41d8de83d99522eb2fb80791e913b3b0 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Abstract (Chinese); Abstract; List of Figures; List of Tables; 1 Introduction; 1.1 Motivation; 1.2 Overview of the Dissertation; 1.2.1 Hierarchical Background Modeling and Foreground Detection; 1.2.2 Holistic Pedestrian Detection; 1.2.3 Part-Based Pedestrian Detection; 1.3 Dissertation Organization; 2 Hierarchical Background Modeling and Foreground Detection; 2.1 Introduction; 2.2 Literature Review; 2.3 Coarse-Level Modeling; 2.3.1 Contrast Histogram of Gray-Level Images; 2.3.2 Contrast Histogram of Color Images; 2.3.3 Background Modeling by Contrast Histograms; 2.3.4 Coarse-Level Experiment Results; 2.4 Hierarchical Background Models; 2.4.1 A General Description of Pixel-Based Background Modeling; 2.4.2 Asymmetric Feed-Forwarding; 2.5 Experiment Results; 2.5.1 Implementation; 2.5.2 Performance Results; 2.6 Summary; 3 Holistic Pedestrian Detection; 3.1 Introduction; 3.2 Literature Review; 3.3 Real AdaBoost and Feature Pool; 3.3.1 Intensity-Based Features; 3.3.2 Gradient-Based Features; 3.3.3 Combined Feature Pool; 3.4 Cascading Feed-Forward Classifiers; 3.4.1 Adding Meta-Stages; 3.4.2 Meta-Stage Classifier; 3.4.3 General Structure with Meta-Stages; 3.4.4 Distinction of AdaBoost Stage and Meta-stage; 3.5 Experiment Results; 3.5.1 Adopted Human Dataset; 3.5.2 Implementation; 3.5.3 Performance Results of Combining Intensity- and Gradient-Based Features; 3.5.4 Performance of Inserting Meta-Stages; 3.5.5 Efficiency and Accuracy Comparison with HOG; 3.6 Visual Surveillance Application; 3.7 Summary; 4 Part-Based Pedestrian Detection; 4.1 Introduction; 4.2 Literature Review; 4.3 Real Version MILBoost; 4.3.1 A Review of AnyBoost; 4.3.2 MILBoost; 4.3.3 Real MILBoost; 4.4 Multi-Class Multi-Instance Boosting; 4.4.1 MCMIBoost: Confidence Value Evaluation; 4.4.2 Cascaded MCMIBoost Architecture; 4.4.3 Probability Combination Classifier; 4.5 Experiment Results; 4.5.1 Results on the MIT Dataset; 4.5.2 Results on the INRIA Dataset; 4.6 Visual Surveillance Application; 4.7 Summary; 5 Conclusion and Future Work; 5.1 Conclusion; 5.2 Future Work; Bibliography; Publications
dc.language.iso	en
dc.title	電腦視覺技術於行人偵測之研究	zh_TW
dc.title	Computer Vision Techniques for Effective Pedestrian Detection	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Doctoral
dc.contributor.coadvisor	陳祝嵩(Chu-Song Chen)
dc.contributor.oralexamcommittee	王傑智,傅立成,王聖智,賴尚宏,陳世旺,鍾國亮
dc.subject.keyword	階層式背景模型,差值統計圖,人員偵測,整體行人偵測,階層式前饋分類器,meta階層,AdaBoost,部件行人偵測,多重實例學習,多類多重實例boosting,特徵共用,視覺監控	zh_TW
dc.subject.keyword	Hierarchical background modeling,contrast histogram,human detection,holistic pedestrian detection,cascaded feed-forward classifiers,meta-stages,AdaBoost,part-based pedestrian detection,multi-instance learning,multi-class multi-instance boosting,feature sharing,visual surveillance	en
dc.relation.page	112
dc.rights.note	Paid authorization
dc.date.accepted	2009-05-14
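dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering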

Files in This Item:

File	Size	Format
ntu-98-1.pdf (not currently authorized for public access)	12.71 MB	Adobe PDF


Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.