Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19312
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳良基 | |
| dc.contributor.author | Chun-Wei Yu | en |
| dc.contributor.author | 游鈞為 | zh_TW |
| dc.date.accessioned | 2021-06-08T01:53:12Z | - |
| dc.date.copyright | 2016-07-26 | |
| dc.date.issued | 2016 | |
| dc.date.submitted | 2016-07-19 | |
| dc.identifier.citation | [1] T. Lan, L. Sigal, and G. Mori, "Social roles in hierarchical models for human activity recognition," in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 1354-1361, June 2012.
[2] M. Humphries, "Google's new self-driving car has no steering wheel." http://www.geek.com/news/googles-new-self-driving-car-has-no-steering-wheel-1595053/, 2014. [Online; accessed 23-March-2016].
[3] I. del Castillo, "Japan has created a robot bear to help nurses take care of their patients." http://www.lostateminor.com/2015/03/03/japan-created-robot-bear-thatll-help-nurses-take-care-patients/, 2015. [Online; accessed 23-March-2016].
[4] B. van den Hoek, "Deep learning: Sky's the limit?" http://deeplearningskysthelimit.blogspot.tw/2016/04/part-2-alphago-under-magnifying-glass.html, 2016. [Online; accessed 23-March-2016].
[5] N. Dzyre, "10 forthcoming augmented reality & smart glasses you can buy." http://www.hongkiat.com/blog/augmented-reality-smart-glasses/, 2016. [Online; accessed 23-March-2016].
[6] K. Thomas, "Internet of things: Everything you need to know." http://www.3g.co.uk/PR/Feb2015/internet-of-things-everything-you-need-to-know.html, 2015. [Online; accessed 23-March-2016].
[7] Lytro, "Light field." https://www.lytro.com/, 2011. [Online; accessed 23-March-2016].
[8] Fitbit, "Intelligent watch." http://www.gq.com.tw/gadget/3c/photo-16414-90747.html, 2013. [Online; accessed 23-March-2016].
[9] FLUX, "3D printer." https://flux3dp.com/, 2014. [Online; accessed 23-March-2016].
[10] LERA, "The evolution of 3D TV." http://lerablog.org/technology/electronics/the-evolution-of-3d-tv/, 2012. [Online; accessed 23-March-2016].
[11] HTC, "Augmented reality." https://www.htcvive.com/tw/, 2016. [Online; accessed 23-March-2016].
[12] Google, "Glass." http://www.google.com/glass/start/, 2014. [Online; accessed 23-March-2016].
[13] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu, "Dense trajectories and motion boundary descriptors for action recognition," vol. 103, pp. 60-79, 2013.
[14] Wikipedia, "Lucas-Kanade method." https://en.wikipedia.org/wiki/Lucas%E2%80%93Kanade_method/, 1984. [Online; accessed 23-March-2016].
[15] I. Laptev, "KTH: recognition of human actions." http://www.nada.kth.se/cvap/actions/, 2005. [Online; accessed 23-March-2016].
[16] M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri, "Actions as space-time shapes," in Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, vol. 2, pp. 1395-1402, IEEE, 2005.
[17] M. S. Ryoo and J. K. Aggarwal, "UT-Interaction Dataset, ICPR contest on Semantic Description of Human Activities (SDHA)." http://cvrc.ece.utexas.edu/SDHA2010/Human_Interaction.html, 2010.
[18] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, "Learning realistic human actions from movies," in Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pp. 1-8, June 2008.
[19] J. Yuan, Z. Liu, and Y. Wu, "Discriminative subvolume search for efficient action detection," in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 2442-2449, June 2009.
[20] J. Hughes, "The Ian Bishop interview: world record holder and beyond." http://asianmoviepulse.com/2015/04/the-ian-bishop-interview-world-record-holder-and-beyond/, 2015. [Online; accessed 23-March-2016].
[21] N. Anantrasirichai, C. N. Canagarajah, D. W. Redmill, and D. R. Bull, "Dynamic programming for multi-view disparity/depth estimation," in 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, vol. 2, pp. II-II, May 2006.
[22] Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 23, no. 11, pp. 1222-1239, 2001.
[23] J. Pearl, "Fusion, propagation, and structuring in belief networks," Artificial Intelligence, vol. 29, no. 3, pp. 241-288, 1986.
[24] K.-J. Yoon and I. S. Kweon, "Adaptive support-weight approach for correspondence search," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, pp. 650-656, April 2006.
[25] T. Lan, T.-C. Chen, and S. Savarese, "A hierarchical representation for future action prediction," in Computer Vision - ECCV 2014, pp. 689-704, Springer, 2014.
[26] A. A. Efros, A. C. Berg, G. Mori, and J. Malik, "Recognizing action at a distance," in Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on, pp. 726-733 vol. 2, Oct 2003.
[27] Y. Yacoob and M. J. Black, "Parameterized modeling and recognition of activities," in Computer Vision, 1998. Sixth International Conference on, pp. 120-127, Jan 1998.
[28] P. Dollar, V. Rabaud, G. Cottrell, and S. Belongie, "Behavior recognition via sparse spatio-temporal features," in 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 65-72, Oct 2005.
[29] U. Mahbub, H. Imtiaz, and M. A. R. Ahad, "An optical flow based approach for action recognition," in Computer and Information Technology (ICCIT), 2011 14th International Conference on, pp. 646-651, Dec 2011.
[30] J. L. Barron, D. J. Fleet, S. S. Beauchemin, and T. A. Burkitt, "Performance of optical flow techniques," in Computer Vision and Pattern Recognition, 1992. Proceedings CVPR '92., 1992 IEEE Computer Society Conference on, pp. 236-242, Jun 1992.
[31] C. Tomasi and T. Kanade, "Shape and motion from image streams under orthography: a factorization method," International Journal of Computer Vision, vol. 9, no. 2, pp. 137-154, 1992.
[32] B. K. Horn and B. G. Schunck, "Determining optical flow," in 1981 Technical Symposium East, pp. 319-331, International Society for Optics and Photonics, 1981.
[33] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7-42, 2002.
[34] D. Scharstein, "Matching images by comparing their gradient fields," in Pattern Recognition, 1994. Vol. 1 - Conference A: Computer Vision & Image Processing., Proceedings of the 12th IAPR International Conference on, vol. 1, pp. 572-575, Oct 1994.
[35] D. Terzopoulos, "Regularization of inverse visual problems involving discontinuities," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, pp. 413-424, July 1986.
[36] V. Kolmogorov, "Convergent tree-reweighted message passing for energy minimization," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, pp. 1568-1583, Oct 2006.
[37] M. J. Wainwright, T. S. Jaakkola, and A. S. Willsky, "MAP estimation via agreement on trees: message-passing and linear programming," IEEE Transactions on Information Theory, vol. 51, pp. 3697-3717, Nov 2005.
[38] C.-C. Cheng, C.-T. Li, C.-K. Liang, Y.-C. Lai, and L.-G. Chen, "Architecture design of stereo matching using belief propagation," in Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on, pp. 4109-4112, IEEE, 2010.
[39] C. K. Liang, C. C. Cheng, Y. C. Lai, L. G. Chen, and H. H. Chen, "Hardware-efficient belief propagation," in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 80-87, June 2009.
[40] O. Veksler, "Fast variable window for stereo correspondence using integral images," in Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, vol. 1, pp. I-556 - I-561, June 2003.
[41] A. Fusiello, V. Roberto, and E. Trucco, "Efficient stereo with multiple windowing," in Computer Vision and Pattern Recognition, 1997. Proceedings., 1997 IEEE Computer Society Conference on, pp. 858-863, Jun 1997.
[42] S. Baker, D. Scharstein, J. Lewis, S. Roth, M. J. Black, and R. Szeliski, "A database and evaluation methodology for optical flow," International Journal of Computer Vision, vol. 92, no. 1, pp. 1-31, 2011.
[43] D. J. Fleet and A. D. Jepson, "Computation of component image velocity from local phase information," International Journal of Computer Vision, vol. 5, no. 1, pp. 77-104, 1990.
[44] M. Otte and H.-H. Nagel, "Optical flow estimation: advances and comparisons," in Computer Vision - ECCV '94, pp. 49-60, Springer, 1994.
[45] C. J. Willmott and K. Matsuura, "Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance," Climate Research, vol. 30, no. 1, p. 79, 2005.
[46] S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski, "A comparison and evaluation of multi-view stereo reconstruction algorithms," in Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, vol. 1, pp. 519-528, IEEE, 2006. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19312 | - |
| dc.description.abstract | With the rapid advance of technology, computer vision is gradually changing how we live: 3D movies, VR applications in gaming, medicine, and personnel training, and devices such as Google Glass that let us instantly retrieve information about the scenery around us. We believe that in future developments, the surveillance systems around us could predict human actions and thereby prevent crimes; VR will move beyond passive viewing toward interaction with virtual objects and environments, where action prediction would give users a much smoother experience; and the robots and self-driving cars now under development could both use external cameras to predict human behavior, reacting when a person is about to fall or a dangerous action is about to occur, so as to avoid accidents. To realize these future application scenarios, our team believes accurate and fast action prediction is an indispensable technology.
Action prediction can be regarded as early action recognition: even without observing a complete action, we can readily predict how it will unfold. Chapter 1 reviews recent developments in action recognition and prediction and concludes that accuracy and computation speed are the greatest challenges for action prediction. Current research achieves accuracy approaching one hundred percent, but in terms of computation speed even real-time requirements are hard to meet, let alone the specifications prediction demands. Among all stages, the optical flow computation in action prediction algorithms is the most time-consuming, occupying more than half of the whole system's run time. This thesis therefore proposes a "High Frame Rate Optical Flow Engine" that processes 240 HD frames per second, the highest-specification optical flow hardware accelerator to date; it not only far exceeds the requirements of the test data used in action prediction and recognition, but can also serve any other algorithm that needs optical flow. Optical flow computation mainly faces the aperture problem, so a grouping approach matches neighboring pixels to raise confidence; however, when the matching window contains inconsistent displacement vectors, errors arise. Chapter 2 reviews the many algorithms proposed in recent years to address this problem, but these methods are hard to optimize for hardware. Chapter 3 compares our proposed algorithm with the others: the difference in accuracy is small, yet the hardware implementation can be greatly optimized. Chapter 4 presents the area optimization of the hardware architecture. We propose computation reuse, data reuse, bit truncation, and lookup-table simplification; at a slight cost in accuracy, our architecture uses only 20 percent of the original hardware resources, greatly reducing manufacturing cost and power consumption so that it can be mounted on mobile devices. | zh_TW |
| dc.description.abstract | Computer vision has been developed for decades and has profoundly changed our lives. Thanks to the progress of technology, we have entered the era of big data and smart devices: many new technologies, such as 3D printers, wearable devices, and light-field refocus cameras, have been invented in recent years. We introduce the technology trends of the past five years in Chapter 1, and we believe "Action Prediction" may become one of the computing cores in many applications, such as home-care robots, self-driving cars, and automatic surveillance systems. Action prediction is irreplaceable in helping robots assist people and avoid accidents. Predicting human action can dramatically reduce the accident rate; for example, a self-driving car can promptly stop when it detects that a person is about to fall into the street. This is the motivation of our thesis, and we hope our work can push the current technology further. We then survey the research on action prediction and profile the full system in Chapter 1. We find that both accuracy and computing speed are critical to bringing this technology into the real world. However, the full action prediction system may take more than 2 minutes of processing. The most time-consuming stage is the optical flow computation, which takes at least 1 minute for a VGA image, more than fifty percent of the full system's run time. This speed is far from determining an action early enough for prediction. Our work therefore provides a "High Frame Rate Optical Flow Engine Chip" to accelerate this basic but complicated computation. We introduce the difficulties of optical flow and related work on both algorithms and architectures. The specification is explored in Chapter 4 and is the highest among related works in recent years. In Chapter 3, we show the core idea of modifying the original optical flow algorithm to be hardware friendly without losing accuracy. We test many conditions to ensure that the accuracy of the modified algorithm matches the original. In short, we use a simple filter in the most complicated stage but a complex filter in the other stages to supplement the results. We show the ideas and details of mapping the algorithm to the architecture step by step in Chapter 4, then optimize the architecture using computation reuse, weight quantization, a pipeline structure, and bit truncation. Computation reuse is the most influential optimization strategy of all, reducing the area to 15 percent of the original; it is based on the algorithm modification shown in Chapter 3. The final results are also presented in the thesis; the goal of the optimization is to build an area-efficient chip for this highly parallel architecture.
To sum up, an area-efficient, high-frame-rate optical flow engine is designed. It can be used in many wearable devices with low-power and low-cost requirements, and it is a critical and necessary core for achieving action prediction. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T01:53:12Z (GMT). No. of bitstreams: 1 ntu-105-R02943002-1.pdf: 5246122 bytes, checksum: 01eb54037a5bfdf15f4f314e2f55e2a4 (MD5) Previous issue date: 2016 | en |
| dc.description.tableofcontents | The Authorization of Oral Members for Research Dissertation i
Acknowledgement iii
Abstract in Chinese v
Abstract vii
Bibliography ix
1 Introduction 1
1.1 Introduction 1
1.2 Trend of Technologies 2
1.3 Motivation of Optical Flow Engine 6
1.4 Thesis Organization 9
2 Challenges of Optical Flow 11
2.1 Introduction 11
2.2 Scenario: Search Range Decision, High Frame Rate Application 13
2.3 Overview of Optical Flow Methods 15
2.3.1 Differential Methods 15
2.3.2 Region Based Methods 17
2.4 Conclusion 20
3 Proposed Robust Optical Flow System 21
3.1 Introduction 21
3.2 Evaluation Criteria 23
3.2.1 How to Represent Error in Each Pixel 23
3.2.2 How to Build the Cost Function 25
3.3 Introduction of Proposed Algorithm 26
3.3.1 Original Algorithm of Optical Flow 27
3.3.2 Optimization of Algorithm for Hardware Friendly 29
3.4 Parameters Decision 33
3.5 Conclusion 35
4 Architecture Design 37
4.1 Introduction 37
4.1.1 Hardware Specification 38
4.1.2 Related Works 39
4.2 Direct Implementation 40
4.2.1 Algorithm Expansion for Hardware Design 40
4.2.2 Data Flow and Data Reuse Scheme 44
4.2.3 Overall System 47
4.3 Optimization 48
4.3.1 Computing Reuse 48
4.3.2 LUT-Quantization 50
4.3.3 Pipeline Architecture 51
4.3.4 Bit Truncation 53
4.4 Conclusion 54
5 Conclusion 59
Bibliography 61
Supplement 66 | |
| dc.language.iso | en | |
| dc.title | 高幀率光流演算法引擎硬體架構設計 | zh_TW |
| dc.title | Architecture Design of High Frame Rate Optical Flow Engine | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 104-2 | |
| dc.description.degree | Master | |
| dc.contributor.oralexamcommittee | 吳安宇,簡韶逸,劉宗德 | |
| dc.subject.keyword | computer vision processing, optical flow computation, high frame rate architecture design, area-efficient optimized architecture design | zh_TW |
| dc.subject.keyword | Video processing, optical flow calculation, high frame rate architecture, area efficient architecture | en |
| dc.relation.page | 75 | |
| dc.identifier.doi | 10.6342/NTU201600712 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2016-07-19 | |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Electronics Engineering | zh_TW |
| Appears in Collections: | Graduate Institute of Electronics Engineering | |
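The abstracts above describe a region-based, window-matching approach to optical flow: a support window of neighboring pixels is compared between frames to raise confidence, at the risk of errors when the window spans inconsistent motion. A minimal illustrative sketch of such a window search is shown below; this is not the thesis's algorithm, and the window and search-range sizes are arbitrary assumptions.

```python
import numpy as np

def block_match_flow(prev, curr, y, x, win=3, search=4):
    """Estimate the motion vector at (y, x) by comparing a (2*win+1)^2
    support window in `prev` against shifted windows in `curr`, keeping
    the shift with the smallest sum of absolute differences (SAD).
    Returns (dy, dx)."""
    h, w = prev.shape
    ref = prev[y - win:y + win + 1, x - win:x + win + 1].astype(np.int32)
    best, best_uv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            # Skip candidate windows that fall outside the image.
            if yy - win < 0 or yy + win >= h or xx - win < 0 or xx + win >= w:
                continue
            cand = curr[yy - win:yy + win + 1,
                        xx - win:xx + win + 1].astype(np.int32)
            sad = np.abs(ref - cand).sum()
            if best is None or sad < best:
                best, best_uv = sad, (dy, dx)
    return best_uv

# Synthetic check: a bright square shifted by (2, 1) between frames.
prev = np.zeros((32, 32), dtype=np.uint8)
prev[10:16, 10:16] = 200
curr = np.zeros_like(prev)
curr[12:18, 11:17] = 200
print(block_match_flow(prev, curr, 12, 12))  # → (2, 1)
```

When the window covers two objects moving differently, no single shift makes the SAD small, which is exactly the failure mode the abstract attributes to grouping-based matching.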
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-105-1.pdf (restricted access) | 5.12 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.
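As a back-of-envelope check on the specification quoted in the abstracts (240 HD frames per second in hardware versus roughly one minute per VGA frame in software), the implied pixel rates can be computed directly; "HD" is assumed here to mean 1280x720 and "VGA" 640x480.

```python
# Target hardware rate: 240 HD frames per second (assumption: HD = 1280x720).
HD_W, HD_H, FPS = 1280, 720, 240
hw_rate = HD_W * HD_H * FPS        # pixels the engine must process per second

# Software baseline from the profiling in the abstract: ~1 minute per VGA frame.
sw_rate = (640 * 480) / 60         # pixels per second

print(hw_rate)            # 221184000
print(hw_rate / sw_rate)  # 43200.0 (required speedup over the software baseline)
```

The gap of more than four orders of magnitude is what motivates a dedicated hardware engine rather than further software tuning.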
