Algorithm and Architecture Design of Visual Recognition and Navigation for the Visually Impaired

Yu-Chi Su; 蘇郁琪

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62711

Title:	Algorithm and Architecture Design of Visual Recognition and Navigation for the Visually Impaired (針對盲人輔助應用之視覺辨識及導航系統的演算法及架構設計)
Author:	Yu-Chi Su (蘇郁琪)
Advisor:	Liang-Gee Chen (陳良基)
Keywords:	Digital circuit, hardware architecture, multimedia processing, object recognition, real-time processing, system-on-chip (SoC), wearable applications
Year of publication:	2013
Degree:	Doctoral
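Abstract:	As vision technology advances rapidly, an era in which electronic guidance aids based on computer vision assist people in daily life is within reach. For visually impaired people, the most commonly used mobility aids today are guide dogs and white canes. However, both are inconvenient in social settings and crowded places, and the information and range over which they can sense surrounding objects are very limited. By contrast, vision-based electronic navigation aids can sense complete information about a dynamic environment and provide rich visual information to the user. Prior vision-based guidance techniques still leave considerable room for improvement in both recognition performance and user interface, for example in detecting outdoor objects more than 30 meters away, recognizing and tracking fast-moving objects in dynamic environments, and providing complete obstacle detection and walkable-path information to meet the real needs of blind users. In this thesis, we propose a wearable, computer-vision-based electronic recognition and navigation system for the blind. The system aims to help visually impaired people move independently and safely in daily life. In the first part of the thesis, we develop a visual recognition system based on scale- and viewpoint-invariant techniques, so that visually impaired people can easily find objects outdoors or indoors, such as store signs or traffic signs on the street, and thus move independently. We also design a high-resolution, scale- and viewpoint-invariant visual recognition system chip to verify the computational efficiency and recognition accuracy of the system. Beyond helping visually impaired people recognize the objects they are looking for, safe and independent mobility is also among the most important matters in their lives. In the second part, we design a real-time dynamic traffic information analysis system that provides safe navigation assistance. For example, the system automatically detects static obstacles in front of the user, predicts the trajectories of moving obstacles, and computes a safe walkable path. In addition to real-time traffic analysis, GPS (Global Positioning System) technology has in recent years been increasingly used in many kinds of navigation devices, helping drivers navigate unfamiliar areas. To let visually impaired people likewise reach their destinations in unfamiliar environments, in the third part we design an advanced navigation system that combines visual recognition with GPS, with the goal of providing more precise positioning and thus reliable, safe real-time navigation.
In the first part of the thesis, we propose a scale- and viewpoint-invariant visual recognition system that lets visually impaired people detect, in real time, the objects they are looking for while walking outdoors, such as store signs or traffic signals. Compared with prior work, the system makes three main contributions. First, it supports 1920x1080 high-definition images, so distant objects in outdoor spaces can be recognized clearly. Second, recognition rates in practice drop in dynamic environments because objects may appear at different angles, or because head movement while walking destabilizes the image captured by a wearable camera; to address this, we propose a 160-degree object-viewpoint-invariance prediction technique that maintains a high recognition rate even in complex dynamic environments. Finally, we design a visual vocabulary processor (VVP) to accelerate object matching. Whereas existing visual recognition systems require heavy memory access during the matching stage, the proposed system reduces memory access bandwidth by 75%. To verify that the proposed system achieves a high recognition rate and computational efficiency, we design a visual recognition system chip in 65 nm CMOS technology that supports 1920x1080 high-resolution image processing and 160-degree object viewpoint prediction; it achieves a recognition rate above 94% for objects 50 meters away while consuming only 52 mW on average.
In the second part, we propose a real-time dynamic traffic information analysis system to help detect dangerous obstacles around the user. The system supports complete analysis, including walkable-space computation, stair detection, static obstacle detection, trajectory prediction for moving objects, and walkable-path planning. It outputs explicit movement instructions and dynamically adjusts its operating rate according to the user's walking speed. We also propose a walkable-space computation algorithm that addresses the camera instability caused by head movement on wearable devices, which otherwise affects path decisions. Obstacle detection is based on depth images produced by a stereo-vision 3D camera. The average obstacle detection rate is 96% indoors and 93% even outdoors.
In the third part, we propose an advanced navigation system combining visual recognition and GPS, aiming to give blind users more precise positioning so that they too can use GPS to reach their destinations. The system uses the store information and building signs already present on the GPS map: when the user encounters such a sign while walking along the street, the system refines the user's position from the distance and bearing of the recognized sign. According to experimental results, whereas current GPS systems have an average error of up to 20 meters in the computed user position, the proposed system has an average error of only 0.97 meters, providing more reliable GPS navigation. In the future, we will integrate the proposed scale- and viewpoint-invariant visual recognition system, the real-time dynamic traffic information analysis system, and the vision-assisted GPS navigation system into a single high-efficiency system chip. We plan for the system to be embedded in glasses in the near future, helping visually impaired people experience the world anew and live autonomous, independent, and confident lives.
Recently, vision technologies 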
have gradually been introduced into electronic aids to assist elderly or visually impaired people. The most commonly used mobility tools, guide dogs and white canes, are inconvenient and expensive, and are of limited use in recognizing surrounding objects. Electronic vision-based aids are smarter navigation tools that can perceive rich visual information about the environment for the user. However, state-of-the-art systems are too bulky and still have many limitations, for example in detecting distant objects in outdoor environments, maintaining high recognition accuracy in challenging situations, and supporting the complete functionality needed to meet the real requirements of blind persons. In this thesis, we design a wearable vision-based recognition and navigation system for the visually impaired. The system aims to assist blind persons with both basic needs and advanced requirements in their lives. First, we develop an invariant visual recognition system that allows blind persons to easily find things indoors or to look for store signs, traffic signs, or important landmarks outdoors. A high-definition invariant visual recognition SoC is realized to verify the efficiency and accuracy of our vision recognition design. Then, to provide a safe travel aid for the blind, we propose a robust traffic information analysis system that keeps them from dangers such as obstacles ahead, moving cars, or pedestrians. We also design a GPS-based visual navigation guide that combines recognition technology with GPS to provide more accurate positioning, so that blind users can navigate independently even in an unfamiliar environment. In the first part of the thesis, we develop an invariant visual recognition system that aims to recognize distant objects such as logos or traffic signs outdoors. To overcome the shortcomings of state-of-the-art work, our system introduces three prominent characteristics. 
First, full HD resolution at 30 fps provides the high resolution required to recognize distant objects. Second, to achieve a high recognition rate even under challenging conditions such as dramatic changes in object viewpoint or severe camera shake, we adopt a 160-degree viewpoint-invariance prediction technique. Lastly, a unique visual vocabulary processor (VVP) design speeds up the matching stage. Compared with existing object recognition systems, which perform object matching with frequent memory accesses, the proposed VVP reduces memory bandwidth by 75%. A wearable 1920×1080, 160-degree object-viewpoint recognition SoC is realized on a 6.38 mm² die in 65 nm CMOS technology. We design a highly integrated chip for the system with an emphasis on efficiency and energy saving. The system achieves a 94% recognition rate at full HD resolution for a traffic light 50 m away while consuming only 52 mW on average. In the second part of the thesis, we present a robust traffic information analysis system. The system aims to assist the blind in detecting obstacles, with distance information, for safety. It supports complete detection functions such as road calculation; stair, wall, and other static obstacle detection; moving-object trajectory prediction; and path suggestion. The system outputs explicit, simple instructions to the user, with a user-involved feedback mechanism based on the user's walking speed. We also address a fundamental problem of wearable vision applications: the severe camera motion typical of wearable devices. A depth-based obstacle extraction mechanism is also proposed to capture obstacles according to various object properties revealed in the depth map. In indoor environments, the average detection rate is above 96.1%. Even outdoors or in complete darkness, a detection rate of 93.7% is achieved on average. 
In the third part of the thesis, an accurate and robust positioning system based on street-view recognition is introduced. A vision-based technique is employed to dynamically recognize shop or building signs that appear on the GPS map. Two mechanisms, view-angle-invariant distance estimation and path refinement, are proposed for robust and accurate position estimation. By combining visual recognition with GPS scale data, the real user location can be accurately inferred. Experimental results demonstrate that the proposed system is reliable and feasible. Compared with the 20 m position error of GPS, the system has an error of only 0.97 m. In the near future, we plan to integrate the whole system, including invariant vision recognition, traffic information analysis, and the GPS-based visual navigation guide, into a single chip. The system is expected to be embedded in glasses to help blind persons experience the world again. In addition, our investigations can lead to the systematic development of new computer vision technology.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62711
Full-text authorization:	Paid authorization
Appears in departments:	Department of Electrical Engineering

Files in this item:

File	Size	Format
ntu-102-1.pdf (not authorized for public access)	28.85 MB	Adobe PDF

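Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.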
