NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98201
Title: 基於多維特徵融合之繁體中文手寫相似偏旁辨識方法
Script Recognition for Traditional Chinese Characters with Similar Radicals Based on Multi-Dimensional Feature Fusion
Author: Yu-Jie Yao (姚羽倢)
Advisor: Jian-Jiun Ding (丁建均)
Keywords: Traditional Handwritten Character Recognition, Similar Radical Recognition, Multi-dimensional Feature Fusion, Recursive Feature Elimination with Cross-Validation, Limited Sample Availability, Low-resolution Image Recognition
Publication Year: 2025
Degree: Master's
Abstract:
With the rapid development of computer vision techniques, traditional Chinese handwritten character recognition has become an important yet challenging research topic. Handwritten traditional Chinese characters exhibit large variations in writing style, and characters containing visually similar radicals are especially difficult to distinguish, because their overall structures and stroke-level details are nearly identical. Previous studies mostly employed manually designed features, such as texture, gradient direction, shape, and local structure, to describe characters. More recently, deep learning approaches that use multi-layer neural networks to automatically capture local and global image features have emerged. Nevertheless, both conventional and deep learning methods, when relying on only a single feature or a small number of features, often fail to fully represent the complex and subtle stroke differences present in traditional handwritten Chinese characters. Therefore, we propose a multi-dimensional feature fusion method to improve the recognition of handwritten traditional Chinese characters with similar radicals, addressing in particular the setting in which limited data makes deep learning impractical.
In our method, we first divide each character image in the radical dataset into meaningful subregions, based on the observation that different regions within traditional Chinese characters exhibit significant structural variation, such as the local stroke morphology within radicals and the spatial relationships between adjacent components. To capture these local differences more effectively, we adopt a set of predefined region segmentation strategies that divide the original character image into semantically meaningful subregions, enabling more fine-grained and discriminative feature extraction from specific structural areas of the character.
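The abstract does not enumerate the predefined segmentation strategies, so as a minimal sketch, one plausible set of splits (halves and quadrants over a grayscale character image stored as a NumPy array; the function and region names are hypothetical) could be:

```python
import numpy as np

def split_subregions(img: np.ndarray) -> dict[str, np.ndarray]:
    """Split a character image into predefined subregions.

    The left/right, top/bottom, and quadrant splits used here are
    illustrative stand-ins; the thesis's actual segmentation
    strategies may differ.
    """
    h, w = img.shape
    return {
        "full": img,
        "left": img[:, : w // 2],         # e.g. a left-side radical
        "right": img[:, w // 2 :],        # the remaining component
        "top": img[: h // 2, :],
        "bottom": img[h // 2 :, :],
        "top_left": img[: h // 2, : w // 2],
        "top_right": img[: h // 2, w // 2 :],
        "bottom_left": img[h // 2 :, : w // 2],
        "bottom_right": img[h // 2 :, w // 2 :],
    }

# Each subregion is then fed to the feature extractors independently.
regions = split_subregions(np.zeros((64, 64)))
```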
Subsequently, we systematically extract a variety of traditional image features from these segmented subregions to provide a rich and multi-dimensional feature representation. These features include moment invariants, which provide shape descriptors robust to geometric variation; Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP), which capture texture and directional information; the Gray-Level Co-occurrence Matrix (GLCM), which characterizes grayscale texture variation; and Edge Stroke and Localized Stroke Pixel Ratios, which capture stroke thickness and local stroke distribution differences. Additionally, we employ thinning algorithms to derive stroke-structure-related features such as stroke count, junction count, and average stroke length. These stroke-level features offer deeper insight into internal stroke structure and significantly enhance the overall discriminative power of the representation.
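The exact parameterizations of HOG, LBP, and GLCM are not given in the abstract. As a rough illustration of the flavor of these descriptors, here are simplified NumPy stand-ins for two of them: an orientation histogram in the spirit of HOG, and the localized stroke pixel ratio. In practice one would use library implementations such as `skimage.feature.hog`; all names and thresholds below are assumptions.

```python
import numpy as np

def orientation_histogram(img: np.ndarray, bins: int = 9) -> np.ndarray:
    """Simplified HOG-like descriptor: a normalized histogram of
    gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)       # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

def stroke_pixel_ratio(img: np.ndarray, thresh: float = 0.5) -> float:
    """Localized stroke pixel ratio: fraction of ink pixels in a region."""
    return float((img > thresh).mean())

def region_features(img: np.ndarray) -> np.ndarray:
    """Concatenate the descriptors computed on one subregion."""
    return np.concatenate([orientation_histogram(img),
                           [stroke_pixel_ratio(img)]])

# A synthetic vertical stroke as a toy input.
img = np.zeros((32, 32))
img[10:20, 14:18] = 1.0
feat = region_features(img)       # 9 orientation bins + 1 pixel ratio
```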
After feature extraction, all multi-dimensional features are concatenated into a comprehensive feature vector. However, due to the potentially large and redundant feature dimensions, we further apply Recursive Feature Elimination with Cross-Validation (RFECV) to automatically select the most discriminative feature subset. This process effectively reduces dimensionality, improves computational efficiency, and eliminates redundant or noisy information, enabling the classification model to focus on the most discriminative and relevant features.
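The selection step described above maps directly onto scikit-learn's `RFECV`. A minimal sketch, using a synthetic stand-in for the fused feature matrix (the dimensions and the linear-SVM ranking estimator are assumptions, not the thesis's settings):

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in: 120 samples, 30 fused features, 2 radical classes;
# only the first 5 features carry class information.
y = rng.integers(0, 2, size=120)
X = rng.normal(size=(120, 30))
X[:, :5] += y[:, None] * 2.0

selector = RFECV(
    estimator=SVC(kernel="linear"),   # linear kernel exposes coef_ for ranking
    step=1,                           # eliminate one feature per iteration
    cv=StratifiedKFold(5),
    scoring="accuracy",
)
selector.fit(X, y)
mask = selector.support_              # boolean mask of retained features
X_reduced = selector.transform(X)     # reduced, more discriminative subset
```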
Furthermore, considering that extracted features may exhibit varying importance across different character classes, we conduct ablation studies to quantitatively assess the contribution and significance of each feature category, validating the effectiveness and rationality of our feature selection strategy.
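The ablation protocol can be sketched as a leave-one-group-out loop over the fused feature vector. The group boundaries and synthetic data below are purely illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=100)
# Hypothetical column ranges of the fused vector for three feature groups.
groups = {"hog": slice(0, 9), "lbp": slice(9, 19), "glcm": slice(19, 23)}
X = rng.normal(size=(100, 23))
X[:, groups["hog"]] += y[:, None]     # make only the HOG-like group informative

baseline = cross_val_score(SVC(), X, y, cv=5).mean()
ablation = {}
for name, cols in groups.items():
    keep = np.ones(X.shape[1], dtype=bool)
    keep[cols] = False                # drop one group, keep the rest
    ablation[name] = cross_val_score(SVC(), X[:, keep], y, cv=5).mean()
```

A large drop relative to `baseline` when a group is removed indicates that the group carries discriminative information; in this toy setup only the informative group should matter.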
Finally, we adopt a Support Vector Machine (SVM) as the final classifier to handle the resulting high-dimensional fused features. To evaluate our method comprehensively, we compare its results with those of current deep-learning-based recognition models. Experimental analyses demonstrate that the proposed method achieves more stable and competitive recognition performance, especially under conditions of limited sample availability, confirming its effectiveness and practical value.
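The final classification stage can be sketched with scikit-learn on synthetic stand-in features. The kernel, C value, and data shapes here are illustrative assumptions; the thesis's actual SVM settings are not stated in the abstract:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
y = rng.integers(0, 3, size=150)             # three similar-radical classes
means = rng.normal(scale=2.0, size=(3, 40))  # class-specific feature means
X = means[y] + rng.normal(size=(150, 40))    # selected, fused feature vectors

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
# Standardizing before the RBF SVM keeps the fused feature scales comparable.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10))
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```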
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98201
DOI: 10.6342/NTU202502490
Full-Text License: Not authorized
Electronic Full-Text Release Date: N/A
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File: ntu-113-2.pdf (1.8 MB, Adobe PDF), public access not authorized


Unless otherwise indicated, all items in this repository are protected by copyright, with all rights reserved.
