多媒體內容分析系統之演算法與積體電路架構設計 (Algorithm and Integrated Circuit Architecture Design for Multimedia Content Analysis Systems)

Tse-Wei Chen; 陳則瑋

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48685

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	簡韶逸(Shao-Yi Chien)
dc.contributor.author	Tse-Wei Chen	en
dc.contributor.author	陳則瑋	zh_TW
dc.date.accessioned	2021-06-15T07:08:22Z	-
dc.date.available	2015-11-15
dc.date.copyright	2010-11-15
dc.date.issued	2010
dc.date.submitted	2010-11-01
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48685	-
dc.description.abstract	近年來由於半導體製程技術的發展，愈來愈多的應用出現在消費性電子產品之中。各種不同功能的電子產品，例如攜帶電話，數位相機，掌上型電腦，也在時代的趨勢下逐漸整合成一種具備各種功能的完整系統。同時，記憶體的容量不斷地提昇，但是其價格與生產成本卻是不斷下降。可以預見地，未來傳統的硬碟極有可能會被取代成為擁有更大容量的先進記憶元件。在擁有了極大量的資料儲存空間之後，儲存多媒體資訊就會成為一大用途，也使得自動化的多媒體分析成為重要的應用。在消費性電子產品的嵌入式系統之中，傳統的中央處理器，以及客製化的數位積體電路無法同時滿足多媒體分析演算法的彈性以及效能須求。因此，在下一代的應用之中，發展一套新的軟硬體設計方式是很重要的。針對多媒體內容分析的須求，作者提出了一套新的設計及實作方式。從演算法設計，硬體架構分析，軟硬體共同設計，以及單晶片系統實作，發展出一系列的方法，並且適合應用於各種消費性電子產品之中，例如行動式裝置。為了有效地分析多媒體的內容，「特徵截取」以及「機器學習」是不可或缺的步驟。現今有各種機器學習的演算法被應用著，包括監督式學習、非監督式學習……等。這些演算法被視為多媒體內容分析的重要組成元素。因此，作者提出了高效能的客製化硬體架構以及可重組化的多功能硬體架構來支援「機器學習」並且處理多媒體的內容分析。高效能客製化硬體架構的部份，由於K平均分群法是機器學習中非監督式學習的一個非常重要之演算法，作者針對此方法做了很多的分析與探討。論文中提出四種不同的K平均分群法架構，分別適用於不同的應用環境，並且展示了結合所提出硬體架構的一個軟硬體共同設計系統，用以執行自動化的相片檢索功能。可重組化的多功能硬體架構部份，為了支援不同的機器學習演算法，兩種不同特色的單晶片系統也在本論文中被提出。這兩種系統藉由密集平行化的串流處理器架構處理大量的影像「特徵截取」運算，也可以基於可重組化的硬體架構及高頻寬的記憶單元，支援不同「機器學習」演算法的運算，包括K平均分群法、K最近鄰居分類器、高斯模型分類器、支援向量機、類神經網路……等。簡而言之，本論文提出了兩種對於視訊以及影像的切割演算法，四種不同的K平均分群法硬體架構，一個軟硬體共同設計之相片檢索系統，以及兩個支援特徵截取以及機器學習演算法之高效能單晶片系統，作為多媒體分析系統運算的一系列解決方案。	zh_TW
dc.description.abstract	Advances in semiconductor technology have brought an increasingly versatile range of applications to Consumer Electronics (CE) products. Different kinds of CE products, such as cellular phones, digital still cameras, and portable computers, are gradually being integrated into a single system. In the near future, one CE product might combine many functions, including making phone calls, sending e-mail, and taking and storing photos. Memory technology is advancing just as quickly: the capacity of flash memory keeps increasing while its price decreases steadily, so new memory technology might replace the traditional hard disk for data storage. With the growth of the Internet and of stored data, managing multimedia content has become an important and indispensable task. The integration of different functions and the growing volume of multimedia data therefore make automatic multimedia content analysis a necessity for CE products. In embedded systems for CE products, neither a traditional CPU/RISC nor an ASIC can satisfy both the flexibility and the performance requirements of multimedia applications, so new design methodologies and solutions are needed for next-generation applications. This dissertation proposes new implementation methods and frameworks for multimedia content analysis. It adopts a systematic approach that spans algorithm design, architectural analysis, hardware architecture design, software/hardware co-design, and SoC design. The proposed methods provide a series of new solutions for next-generation consumer electronics applications (e.g. mobile devices). Effective analysis of multimedia content requires both feature extraction and machine learning algorithms. Many machine learning algorithms are widely employed in different applications, and they can be regarded as essential building blocks for multimedia content analysis. To handle the supervised and unsupervised learning algorithms in machine learning, both high-performance hardware architectures and reconfigurable hardware architectures are proposed. For the high-performance architectures, the K-Means clustering algorithm is the focus of this dissertation because of its popularity and importance, and its applications are also demonstrated. A total of four K-Means architectures are developed. For the reconfigurable hardware architectures, two System-on-a-Chip (SoC) architectures with different features are proposed. These systems can process a large amount of data in parallel and perform feature extraction with high bandwidth, and they can also handle various machine learning algorithms, such as K-Means clustering, K-Nearest Neighbor classification, Gaussian Mixture Model-based classification, Support Vector Machine, and Artificial Neural Network. In short, the contributions of this dissertation are two algorithms for video and image segmentation, one software/hardware co-design platform, four architectures for K-Means clustering, and two SoCs for multimedia content analysis. Together they can be regarded as a series of new solutions to multimedia content analysis for CE products.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T07:08:22Z (GMT). No. of bitstreams: 1 ntu-99-F95943020-1.pdf: 4413071 bytes, checksum: 01a1ada1c8f00a7a5cf93b12e370f617 (MD5) Previous issue date: 2010	en
dc.description.tableofcontents	Abstract
1 Introduction
  1.1 Multimedia Content Analysis
  1.2 Algorithm Overview
    1.2.1 Feature Extraction
    1.2.2 Machine Learning
  1.3 System Architectures
    1.3.1 Stream Processor
    1.3.2 Reconfigurable Hardware
    1.3.3 Design Challenges
  1.4 Research Contributions and Dissertation Organization
    1.4.1 Algorithm Design
    1.4.2 Architectural Analysis and Design
    1.4.3 Software/Hardware Co-Design
    1.4.4 SoC Design
2 Algorithm Design: Video Object Segmentation Based on K-Means Background Clustering and Watersheds
  2.1 Introduction
  2.2 Proposed Algorithm
    2.2.1 Background Modeling
    2.2.2 Object Mask Generation
  2.3 Experimental Results
  2.4 Summary
3 Algorithm Design: Fast Image Segmentation and Texture Feature Extraction for Image Retrieval
  3.1 Introduction
  3.2 Proposed Image Segmentation Method
    3.2.1 Color Space Transform and Histogram Generation
    3.2.2 Maximin Initialization and Parameter Estimation
    3.2.3 K-Means Clustering in HSV Color Space
    3.2.4 Post-Processing of Spatial Regions
  3.3 Proposed Texture Feature Extraction Technique
    3.3.1 Texture Feature based on Discrete Wavelet Transform
    3.3.2 Proposed Technique based on Label Wavelet Transform
    3.3.3 Retrieving Process
  3.4 Experimental Results
    3.4.1 Fast Color Image Segmentation
    3.4.2 Efficient Texture Feature Extraction
  3.5 Summary
4 Architectural Analyses of K-Means Silicon Intellectual Property for Image Segmentation
  4.1 Introduction
  4.2 Proposed Architecture
    4.2.1 K-Means Algorithm
    4.2.2 Hardware Implementation
  4.3 Experimental Results
  4.4 Summary
5 Architectural Design: Bandwidth Adaptive Hardware Architecture of K-Means Clustering for Video Analysis
  5.1 Introduction
  5.2 K-Means Algorithm and Hardware Considerations
  5.3 Proposed Hardware Architecture
    5.3.1 Control Unit and Pseudo Random Number Generator
    5.3.2 Parallel E-M Distance Calculator Set
    5.3.3 8-Layer Parallel M-S PE Set
    5.3.4 Summation Updating Engine
    5.3.5 Vector Divider
    5.3.6 Convergence Monitor and Labeling Engine
  5.4 Simulation Results
  5.5 Summary
6 Architectural Design: Flexible Hardware Architecture of Hierarchical K-Means Clustering for Large Cluster Number
  6.1 Introduction
  6.2 Hierarchical K-Means (HK-Means) Algorithm
  6.3 Hardware Architecture Overview
  6.4 Hierarchical Memory
    6.4.1 Memory Cost Analysis
    6.4.2 Memory Architecture
  6.5 Data Drain
    6.5.1 Traverse Processing Element (Traverse PE)
    6.5.2 Centroid Visiting in Binary Tree
  6.6 Iteration Engine
  6.7 Experimental Results
  6.8 Summary
7 Software/Hardware Co-Design: Photo Retrieval based on Spatial Layout with Hardware Acceleration
  7.1 Introduction
  7.2 Proposed Photo Retrieval System
    7.2.1 Image Segmentation
    7.2.2 Feature Extraction
    7.2.3 Image Matching
    7.2.4 Ranking
  7.3 Proposed K-Means Accelerator
    7.3.1 Architecture
    7.3.2 Characteristics and Applications
  7.4 Experimental Results
    7.4.1 Photo Retrieval Algorithm
    7.4.2 Robustness Evaluation
159 7.4.3 K-Means Hardware Accelerator . . . . . . . . . . . . . . 163 7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 8 SoC Design: Tera-scale Performance Machine Learning SoC (MLSoC) with Dual Stream Processor Architecture for Multimedia Content Analysis 167 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 8.2 Machine Learning SoC (MLSoC) Architecture . . . . . . . . . . 169 8.3 Image Stream Processor . . . . . . . . . . . . . . . . . . . . . . 171 8.3.1 Linear Processor . . . . . . . . . . . . . . . . . . . . . . 175 8.3.2 Order Processor . . . . . . . . . . . . . . . . . . . . . . . 177 8.4 Feature Stream Processor . . . . . . . . . . . . . . . . . . . . . . 178 8.5 VLSI Implementation . . . . . . . . . . . . . . . . . . . . . . . . 181 8.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 8.7 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 186 9 SoC Design: Multimedia Semantic Analysis SoC (SASoC) with Machine-Learning Engine 187 9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 9.2 System Architecture Design . . . . . . . . . . . . . . . . . . . . 192 9.2.1 Image Stream Processing System (ISPS) . . . . . . . . . 192 9.2.2 Feature Stream Processing System (FSPS) . . . . . . . . 194 9.2.3 Example Applications . . . . . . . . . . . . . . . . . . . 197 9.3 Chip Design Flow and Verification Strategies . . . . . . . . . . . 198 9.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 200 9.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 9.6 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 206 10 Conclusions and Future Directions 209 10.1 Principal Contributions . . . . . . . . . . . . . . . . . . . . . . . 209 10.2 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . 210 10.2.1 New Algorithms for Multimedia Content Analysis . . . . 
211 10.2.2 New System Architectures for Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 Reference 213
dc.language.iso	en
dc.subject	機器學習	zh_TW
dc.subject	數位電路	zh_TW
dc.subject	多媒體內容分析	zh_TW
dc.subject	硬體架構	zh_TW
dc.subject	digital circuit	en
dc.subject	machine learning	en
dc.subject	hardware architecture	en
dc.subject	multimedia content analysis	en
dc.title	多媒體內容分析系統之演算法與積體電路架構設計	zh_TW
dc.title	Algorithm and VLSI Architecture Design of Multimedia Content Analysis System	en
dc.type	Thesis
dc.date.schoolyear	99-1
dc.description.degree	博士 (Doctoral)
dc.contributor.oralexamcommittee	陳良基(Liang-Gee Chen),徐宏民(Winston H. Hsu),傅楸善(Chiou-Shann Fuh),杭學鳴(Hsueh-Ming Hang),賴尚宏(Shang-Hong Lai),張添烜(Tian-Sheuan Chang),廖弘源(Hong-Yuan Mark Liao)
dc.subject.keyword	數位電路,多媒體內容分析,硬體架構,機器學習	zh_TW
dc.subject.keyword	digital circuit,multimedia content analysis,hardware architecture,machine learning	en
dc.relation.page	229
dc.rights.note	有償授權 (paid license)
dc.date.accepted	2010-11-04
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	電子工程學研究所	zh_TW
Appears in Collections:	電子工程學研究所

Files in This Item:

File	Size	Format
ntu-99-1.pdf Restricted Access	4.31 MB	Adobe PDF
