Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62711
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳良基 | |
dc.contributor.author | Yu-Chi Su | en |
dc.contributor.author | 蘇郁琪 | zh_TW |
dc.date.accessioned | 2021-06-16T16:08:10Z | - |
dc.date.available | 2014-06-21 | |
dc.date.copyright | 2013-06-21 | |
dc.date.issued | 2013 | |
dc.date.submitted | 2013-05-29 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62711 | - |
dc.description.abstract | 隨著視覺技術發展日新月異,未來以電腦視覺技術為基礎之電子導盲輔具來協助人類的生活的時代指日可待。對視障朋友來說,目前最普遍使用的導盲輔具為導盲犬和白手杖。然而,導盲犬和白手杖無論在社交的場合或是擁擠的地方對盲人來說皆十分地不方便,並且其所能感知週遭物體的訊息和範圍十分有限。相較之下,以視覺技術為基礎之電子輔助導航工具則可以感測完整的動態環境資訊,並提供豐富的視覺訊息給使用者。先前相關之視覺導盲技術無論在辨識效果及使用者介面上仍存在大幅改進的空間。例如偵測室外環境超過30公尺遠之物體、在動態環境中辨識及追蹤快速移動物體、以及提供完整的障礙物偵測及可走路徑資訊以滿足盲人的實際需求。在本論文中,我們提出以電腦視覺技術為基礎之可穿戴式電子導盲辨識及導航系統。我們的系統旨在協助視障朋友可在日常生活中自主且安全地行動。在論文的第一部份,我們開發了一個利用尺度及視角不變性技術為基礎之視覺辨識系統,讓視障朋友可以輕鬆地在室外或室內尋找物體,例如馬路上的商店招牌或交通標誌等,以達到自主行動之目的。同時,我們也設計了高解析度之尺度及視角不變性之視覺辨識系統晶片,以驗證我們系統之高運算效率及辨識準確率。除了協助視障朋友辨識所欲尋找之物件外,安全且獨立的行動亦是他們生活中最重要的事情。 在論文的第二部份中,我們設計了一個即時的動態交通資訊分析系統,此系統目的在於提供視障朋友安全的導航協助。例如,本系統可自動偵測盲人前方之靜態障礙物、預測動態障礙物之行動軌跡、以及計算盲人可走之安全路徑。除了即時交通路況資訊分析外,近年來GPS (Global Positioning System)技術也逐漸被使用於各種型態的導航裝置上,可協助導航行車於不熟悉的區域。為了可以使視障朋友同樣在一個陌生的環境裡去自己想去的地方,在本論文之第三部份,我們設計了一個結合視覺辨識技術和GPS功能之進階導航系統,目的為替盲人提供更精準之行動定位,以提供可靠且安全之即時導航。
在論文的第一部分,我們提出一個尺度及視角不變之視覺辨識系統,目的在讓視障朋友行走於戶外環境時,也可以即時偵測所欲尋找之物體,例如店家之招牌或馬路上的交通號誌等。相較於先前相關技術,我們系統主要有下面三大貢獻。首先,我們提出的視覺辨識系統可支援1920x1080高畫質解析度影像,可清晰辨識戶外空間下遠處之物體。第二,為了解決實際於動態環境下辨識物體時,因所辨識之物體可能以不同角度出現於使用者眼前,或是穿戴式系統於使用者行走時易因頭部晃動造成鏡頭拍攝畫面不穩定而造成實際辨識率降低之問題,在本系統中我們提出了160度物體視角不變性之預測技術,以實現在複雜之動態環境下仍達成高度的物體辨識率。最後,我們設計了視覺語彙處理器(VVP)以加速視覺辨識系統中物件匹配之速度。相較於現今之視覺辨識系統於物件匹配階段時需大量的記憶體存取動作,我們提出的系統可降低75%的記憶體存取頻寬。為了驗證我們所提出的系統可達成高度的物體辨識率及運算效率,我們以CMOS 65奈米技術設計了一個可支援1920x1080高解析度影像處理及160度物體視角預測之視覺辨識系統晶片,對於50公尺遠之物體可高達94%以上之辨識率,且平均只需52mW之低電力消耗。 在論文的第二部分,我們提出一個即時動態交通資訊分析系統以協助偵測盲人周遭的危險障礙物。我們的系統支援完整的資訊分析,包括計算可行走空間、樓梯偵測、靜態障礙物偵測、動態物體之軌跡預測及可走路徑規畫。我們的系統可以輸出明確的行動指令給盲人,並根據盲人的行走速度動態調整運作速率。除此之外,我們也針對可穿戴式電子設備上因為使用者頭部晃動造成攝影鏡頭不穩,而影響路徑判斷的議題提出相對應的可走空間計算之演算法。我們的系統是基於使用立體視覺的3D攝影機產生的深度影像作障礙物偵測。在室內環境下,平均可達障礙物偵測率96%。即使是在室外環境,系統的平均障礙物偵測率為93%。 在本論文之第三部份,我們提出了結合視覺辨識技術和GPS功能的進階導航系統,目的為替盲人提供更精準之行動定位,協助他們也可以使用GPS去想去的地方。我們提出的系統利用GPS地圖上的已有之商店資訊或建築標誌,在實際使用者在街景上行走時遇到該類標誌時,系統可根據辨識到標誌的距離和方位來作更精準之使用者定位。根據實驗結果,相較於現今的GPS系統平均計算使用者位置上達20公尺之誤差,我們所提出的系統僅有平均0.97公尺之誤差,可提供更可靠之GPS導航功能。 未來我們將整合我們提出的尺度及視角不變之視覺辨識系統、即時動態交通資訊分析系統、及視覺辨識輔助之GPS導航系統於一個高效率運算系統晶片。我們計劃在不久的將來,我們的系統可嵌入於眼鏡上以協助視障朋友重新體驗世界,過著自主、獨立、且自信的生活。 | zh_TW |
dc.description.abstract | In recent years, vision technologies have gradually been introduced into electronic aids for elderly people and the visually impaired. The most commonly used mobility aids, guide dogs and white canes, are inconvenient and expensive, and offer only limited help in recognizing surrounding objects. Electronic vision-based aids are smarter navigation tools that can perceive rich visual information about the environment for the user. However, state-of-the-art systems are bulky and still have many limitations, such as detecting distant objects in outdoor environments, maintaining high recognition accuracy in challenging situations, and supporting the complete set of functions that blind persons actually need. In this thesis, we design a wearable vision-based recognition and navigation system for the visually impaired. The system aims to assist blind persons with everything from basic needs to advanced requirements in their daily lives. First, we develop an invariant visual recognition system that lets blind persons easily find things indoors or look for store signs, traffic signs, and important landmarks outdoors. A high-definition invariant visual recognition SoC is realized to verify the efficiency and accuracy of our recognition design. Then, to provide a safe travel aid for the blind, we propose a robust traffic information analysis system that protects them from dangers such as front obstacles, moving cars, and pedestrians. We also design a GPS-based visual navigation guide that combines recognition technology with GPS to provide more accurate positioning, so that blind persons can navigate independently even in unfamiliar environments.
In the first part of the thesis, we develop an invariant visual recognition system that aims to recognize distant objects outdoors, such as logos and traffic signs. To overcome the shortcomings of state-of-the-art works, our system introduces three prominent characteristics. First, support for full HD (1920×1080) resolution at 30 fps provides the image detail required to recognize distant objects. Second, to maintain a high recognition rate even under challenging conditions such as dramatic object viewpoint change or severe camera shake, we adopt a 160-degree viewpoint-invariance prediction technique. Lastly, a dedicated visual vocabulary processor (VVP) speeds up the matching stage: compared with existing object recognition systems that perform object matching with frequent memory accesses, the proposed VVP reduces memory bandwidth by 75%. A wearable 1920×1080, 160-degree object viewpoint recognition SoC is realized on a 6.38 mm² die in 65 nm CMOS technology. The chip is highly integrated, with emphasis on efficiency and energy saving; it achieves a 94% recognition rate at full HD resolution for a traffic light 50 m away while consuming only 52 mW on average. In the second part of the thesis, we present a robust traffic information analysis system that helps the blind detect obstacles, with distance information, for safety. The system supports a complete set of detection functions: walkable-road calculation; detection of stairs, walls, and other static obstacles; moving-object trajectory prediction; and path suggestion. It outputs explicit, simple instructions to the blind through a user-involved feedback mechanism that adapts to the user's walking speed. We also address a fundamental problem of wearable vision applications, namely the severe camera motion typical of wearable devices.
A depth-based obstacle extraction mechanism is also proposed to capture obstacles according to the various object properties revealed in the depth map. In indoor environments, the average detection rate is above 96.1%; even outdoors or in complete darkness, a 93.7% detection rate is achieved on average. In the third part of the thesis, an accurate and robust positioning system based on street-view recognition is introduced. A vision-based technique dynamically recognizes shop and building signs that appear on the GPS map. Two mechanisms, view-angle-invariant distance estimation and path refinement, are proposed for robust and accurate position estimation. By combining visual recognition results with GPS map data, the real user location can be accurately inferred. Experimental results demonstrate that the proposed system is reliable and feasible: compared with the roughly 20 m positioning error of GPS alone, the system has an average error of only 0.97 m. In the near future, we plan to integrate the whole system, including invariant visual recognition, traffic information analysis, and the GPS-based visual navigation guide, into a single chip. The system is expected to be embedded in glasses to help blind persons experience the world again. In addition, our investigations can lead to the systematic development of new computer vision technology. | en |
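The positioning idea summarized in the abstract — inferring the user's true location from a recognized sign whose map coordinates, estimated distance, and bearing are known — can be sketched as follows. This is only an illustrative flat-earth approximation; the function name, coordinates, and numbers are hypothetical and are not taken from the thesis.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def refine_position(landmark_lat, landmark_lon, distance_m, bearing_deg):
    """Estimate the user's position from one recognized landmark.

    The landmark's coordinates come from the GPS map; the distance and
    bearing FROM the user TO the landmark are assumed to come from the
    vision system. Uses an equirectangular (flat-earth) approximation,
    which is reasonable at street-level ranges of tens of metres.
    """
    bearing = math.radians(bearing_deg)
    # Offset from user to landmark in metres (east, north components).
    d_east = distance_m * math.sin(bearing)
    d_north = distance_m * math.cos(bearing)
    # The user sits at the landmark position minus that offset.
    user_lat = landmark_lat - math.degrees(d_north / EARTH_RADIUS_M)
    user_lon = landmark_lon - math.degrees(
        d_east / (EARTH_RADIUS_M * math.cos(math.radians(landmark_lat)))
    )
    return user_lat, user_lon

# Hypothetical example: a shop sign 25 m away, north-east of the user.
lat, lon = refine_position(25.0170, 121.5395, 25.0, 45.0)
print(round(lat, 6), round(lon, 6))
```

In practice several such landmark observations would be fused with the raw GPS fix (the thesis's "path refinement"), but a single observation already shows how metre-level accuracy becomes possible.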
dc.description.provenance | Made available in DSpace on 2021-06-16T16:08:10Z (GMT). No. of bitstreams: 1 ntu-102-D96921032-1.pdf: 29539703 bytes, checksum: 920882bdf1f52ca5991d913553ae3a40 (MD5) Previous issue date: 2013 | en |
dc.description.tableofcontents | Contents
Abstract xiii
1 Introduction 1
  1.1 Trend of Visual Technology 1
  1.2 Visually Impaired Aids 2
    1.2.1 Traditional Travel Aid 4
    1.2.2 Electronic Travel Aid 5
  1.3 Main Contributions 10
  1.4 Dissertation Organization 15
I Invariant Visual Recognition System 17
2 Background Knowledge of Visual Recognition 19
  2.1 Introduction 19
  2.2 Current Object Recognition Methods 21
    2.2.1 Local Feature Descriptors 21
    2.2.2 FAST Detector 23
    2.2.3 SIFT 24
    2.2.4 Feature Matching 29
  2.3 State-of-the-Art Object Recognition Platforms 30
  2.4 Challenges of Object Recognition 33
  2.5 Summary 34
3 System Overview 37
  3.1 Introduction 37
  3.2 System Flow 40
  3.3 Human-Centered Design Mechanism 41
  3.4 Visual Vocabulary Processor 45
  3.5 Summary 48
4 Architecture 49
  4.1 System Architecture 49
  4.2 Architecture of FMP 51
  4.3 Architecture of FE 54
  4.4 Architecture of HCD 56
  4.5 Architecture of VVP 59
  4.6 Summary 62
5 Chip Implementation 63
  5.1 Introduction 63
  5.2 Chip Comparison 63
  5.3 Evaluation 66
II Robust Traffic Information Analysis System 71
6 Robust Traffic Information Analysis 73
  6.1 Introduction 73
  6.2 Related Work 76
  6.3 System Flow 77
  6.4 The Depth-based Obstacle Detection Framework 80
    6.4.1 3D Sensor 80
    6.4.2 Road Calculation 81
    6.4.3 Obstacle Extraction 83
    6.4.4 Trajectory Calculation 87
  6.5 Experimental Results 88
  6.6 Summary 90
III GPS-Based Visual Navigation Guide 95
7 Intelligent GPS-Based Navigation 97
  7.1 Introduction 97
  7.2 Related Work 100
  7.3 Positioning Reallocation 100
    7.3.1 Approach Overview 100
    7.3.2 Street View Recognition 101
  7.4 Experimental Results 105
  7.5 Summary 108
8 Conclusion 109
  8.1 Principal Contributions 109
  8.2 Future Directions 113
    8.2.1 Visual Recognition with Online Learning 113
    8.2.2 Integrated Chip 113
Bibliography 115 | |
dc.language.iso | en | |
dc.title | 針對盲人輔助應用之視覺辨識及導航系統的演算法及架構設計 | zh_TW |
dc.title | Algorithm and Architecture Design of Visual Recognition and Navigation for the Visually Impaired | en |
dc.type | Thesis | |
dc.date.schoolyear | 101-2 | |
dc.description.degree | 博士 | |
dc.contributor.oralexamcommittee | 蔡宗漢,賴永康,盧奕彰,馬仕毅,陳美娟 | |
dc.subject.keyword | 數位電路,硬體架構,多媒體處理,物件辨識,即時處理,系統晶片,穿戴式設計, | zh_TW |
dc.subject.keyword | Digital circuit, hardware architecture, multimedia processing, object recognition, real-time processing, system-on-chip (SoC), wearable applications | en |
dc.relation.page | 128 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2013-05-29 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電機工程學研究所 | zh_TW |
Appears in Collections: | 電機工程學系
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-102-1.pdf (currently not authorized for public access) | 28.85 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.