Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20966

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 劉長遠 | |
| dc.contributor.author | Han-Ting Yeh | en |
| dc.contributor.author | 葉翰挺 | zh_TW |
| dc.date.accessioned | 2021-06-08T03:12:52Z | - |
| dc.date.copyright | 2017-02-17 | |
| dc.date.issued | 2017 | |
| dc.date.submitted | 2017-02-15 | |
| dc.identifier.citation | [1] A.K. Jain, M.N. Murty, P.J. Flynn. “Data clustering: a review”. ACM Computing Surveys, 31(3):264–323, 1999. [2] D.E. Rumelhart, G.E. Hinton, R.J. Williams. “Learning internal representations by error propagation”. In Parallel Distributed Processing, Vol. 1: Foundations. MIT Press, Cambridge, MA, 1986. [3] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. “Gradient-based learning applied to document recognition”. Proceedings of the IEEE, November 1998. [4] Y. Bengio, A. Courville, P. Vincent. “Representation Learning: A Review and New Perspectives”. IEEE Trans. PAMI, special issue Learning Deep Architectures, 2013. [5] G.E. Hinton, R.R. Salakhutdinov. “Reducing the Dimensionality of Data with Neural Networks”. Science, 313:504–507, 2006. [6] P.J. Werbos. “Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences”. PhD thesis, Harvard University, 1974. [7] D.E. Rumelhart, J.L. McClelland, PDP Research Group. “Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations”. MIT Press, Cambridge, MA, 1986. [8] G.E. Hinton. “Training products of experts by minimizing contrastive divergence”. Neural Computation, 14(8):1771–1800, 2002. [9] Andrew Y. Ng. “Sparse autoencoder”. Class notes for CS294A, Computer Science Department, Stanford University, 2011. [10] G.E. Hinton, S. Osindero, Y.W. Teh. “A fast learning algorithm for deep belief nets”. Neural Computation, 18(7):1527–1554, 2006. [11] Chunfeng Song, Feng Liu, Yongzhen Huang, Liang Wang, Tieniu Tan. “Auto-encoder Based Data Clustering”. 18th Iberoamerican Congress on Pattern Recognition (CIARP, oral), 2013. [12] K. Wagstaff, C. Cardie, S. Rogers, S. Schroedl. “Constrained k-means clustering with background knowledge”. In International Conference on Machine Learning, pp. 577–584, 2001. [13] Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, Andrew Y. Ng. “Building High-Level Features using Large Scale Unsupervised Learning”. In Proceedings of the Twenty-Ninth International Conference on Machine Learning, 2012. [14] Cheng-Yuan Liou, Hsin-Chang Yang. “Handprinted character recognition based on spatial topology distance measurement”. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(9):941–945, 1996. [15] D.J. Burr. “Elastic Matching of Line Drawings”. IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 3, pp. 708–713, Nov. 1981. [16] T. Kohonen. “Self-organization and Associative Memory”, 3rd edition. Springer-Verlag, Berlin, Heidelberg, 1989. [17] Daw-Ran Liou, Chia-Ching Lin, Cheng-Yuan Liou. “Setting Shape Rules for Handprinted Character Recognition”. ACIIDS, The 4th Asian Conference on Intelligent Information and Database Systems, March 19–21, LNCS 7197, pp. 245–252, 2012. [18] Andrew Ng, Jiquan Ngiam, Chuan Yu Foo, Yifan Mai, Caroline Suen. “UFLDL Tutorial: Visualizing a Trained Autoencoder”. Stanford University, April 2013. http://deeplearning.stanford.edu/wiki/index.php/Visualizing_a_Trained_Autoencoder | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20966 | - |
| dc.description.abstract | 這篇論文提出了一個以2個模組組合成的手寫數字分群方法,第1個模組是堆疊稀疏自動編碼器,第2個模組是以空間拓撲距離測量為基礎的手寫字辨認器,這2個模組分別屬於無監督式及監督式的方法,重點是本文作者把它們變成了1個全無監督式的分群方法。 與現今流行的深度架構不同,本方法採橫向思維,以較淺但擴充神經元的數目來做,避開深度架構前層2個群沒分開,後層就分不開的問題。 本方法使用在60000個手寫數字的MNIST DataSet上分群結果為77.4%,超過了相關論文的76%。而且有現成的方法能再提升效能。 本方法的優點是一個模組化的設計,輸入的手寫數字通過自動編碼器模組抽取出數字樣本特徵後,再交給手寫字辨認模組做分群,不但組成了全無監督式的分群方法,稍加訓練可以再變成分類器使用。模組化的設計使得功能更彈性靈活,任何新的技術的出現可以用替換模組來改變功能或提升效能。 豐富的應用是另一個優點,本方法不只能用在手寫數字的分群,隨著分群過程中亦衍生出3種應用: (1) 在一堆資料中找出標準樣本,例如標準字形、圖形等。(2) 在圖像中搜尋像標準樣本的東西,例如掃描空拍圖搜尋像標準數字的地形。(3) 使用者定義搜尋,例如以圖搜圖。 此外只要變更訓練的資料庫,它也能應用在別的領域,例如在音頻上的分群及搜尋。 | zh_TW |
| dc.description.abstract | This thesis presents a handwritten digit clustering method consisting of two modules. The first module is a Stacked Sparse Autoencoder, and the second is a Handprinted Character Recognizer Based on Spatial Topology Distance Measurement. These two modules are an unsupervised and a supervised method, respectively; the key contribution is that the author combines them into a fully unsupervised clustering method. Unlike the currently popular deep architectures, this method takes a lateral approach and uses a shallower structure with a larger number of neurons. The purpose is to avoid a problem of deep architectures: if two clusters are not separated in an early layer, they cannot be separated in later layers. Applied to the 60,000 handwritten digits of the MNIST dataset, the method reaches a clustering accuracy of 77.4%, exceeding the 76% reported in the related paper. Furthermore, existing techniques are available to improve performance further. One advantage of this method is its modular design. The autoencoder module first extracts digit template features from the input handwritten digits, and these are then passed to the handwritten character recognition module for clustering. Together the modules form a fully unsupervised clustering method, and with a little training the system can also be turned into a classifier. The modular design makes the system flexible: when a new technique emerges, a module can be replaced to change the function or improve performance. A rich set of applications is another advantage. Besides clustering handwritten digits, the clustering process gives rise to three applications: (1) finding standard templates in a collection of data, such as standard characters or graphics; (2) searching images for objects that resemble a standard template, such as scanning satellite images for terrain that resembles standard digits; (3) user-defined search, such as search by image. In addition, by changing only the training data, the method can be applied in other areas, such as clustering and search in audio. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T03:12:52Z (GMT). No. of bitstreams: 1 ntu-106-R01922149-1.pdf: 6750479 bytes, checksum: 023274c9956516b18c7d6ce67eef402b (MD5) Previous issue date: 2017 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; ABSTRACT; CONTENTS; LIST OF FIGURES; LIST OF TABLES; Chapter 1 Introduction; 1.1 Motivation; 1.2 Convolutional Neural Network; 1.2.1 LeNet-5; 1.3 Stacked Sparse Autoencoder; 1.3.1 Autoencoder; 1.3.2 Related Works; 1.3.2.1 Autoencoder Based Data Clustering; 1.3.2.2 Building High-level Features using Large Scale Unsupervised Learning; 1.4 Handprinted Character Recognition Based on Spatial Topology Distance Measurement; Chapter 2 The Method; 2.1 Design Idea; 2.2 Architecture; 2.3 Algorithm; Chapter 3 Experiment Results and Analysis; 3.1 Data Set; 3.2 Experiment and Clustering Results; 3.3 Analysis; 3.3.1 Extracted Basic Features; 3.3.2 Extracted High Level Features; 3.3.3 Feature Composition; 3.3.4 Extreme Sparsity using BackPropagation; Chapter 4 Application; 4.1 Searching on Satellite Images; 4.2 User-defined Search using BackPropagation; Chapter 5 Conclusion and Future Work; 5.1 Conclusion; 5.2 Future Work; REFERENCE | |
| dc.language.iso | en | |
| dc.subject | 淺度學習 | zh_TW |
| dc.subject | 使用者定義搜尋 | zh_TW |
| dc.subject | 無監督式分群分法 | zh_TW |
| dc.subject | 稀疏自動編碼器 | zh_TW |
| dc.subject | 空間拓樸距離 | zh_TW |
| dc.subject | 手寫數字辨識 | zh_TW |
| dc.subject | Sparse Autoencoder | en |
| dc.subject | User defined search | en |
| dc.subject | Unsupervised Clustering method | en |
| dc.subject | Spatial Topology Distance | en |
| dc.subject | Handwritten Digits Recognition | en |
| dc.subject | Shallow Learning | en |
| dc.title | 一個以自動編碼為基礎及無監督式的手寫數字分群方法 | zh_TW |
| dc.title | An Autoencoder-Based and Unsupervised Method for Handwritten Digits Clustering | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 105-1 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 呂育道,周承復 | |
| dc.subject.keyword | 無監督式分群分法,使用者定義搜尋,稀疏自動編碼器,淺度學習,手寫數字辨識,空間拓樸距離 | zh_TW |
| dc.subject.keyword | Unsupervised Clustering method, User defined search, Sparse Autoencoder, Shallow Learning, Handwritten Digits Recognition, Spatial Topology Distance | en |
| dc.relation.page | 61 | |
| dc.identifier.doi | 10.6342/NTU201700622 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2017-02-15 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
| Appears in Collections: | Department of Computer Science and Information Engineering | |

Files in this item:

| File | Size | Format | |
|---|---|---|---|
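| ntu-106-1.pdf (not authorized for public access) | 6.59 MB | Adobe PDF |

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.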
