Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95986
Title: 語義對齊與特徵解離於廣義零樣本動作識別
SMARTEN: Semantic Alignment Through Feature Disentanglement For Generalized Zero-Shot Skeleton-Based Action Recognition
Authors: 李勝維
Sheng-Wei Li
Advisor: 許永真
Jane Yung-jen Hsu
Keywords: Zero-Shot Learning, Semantic Alignment, Feature Disentanglement, Skeleton-Based Action Recognition
Publication Year: 2024
Degree: Master's
Abstract: In generalized zero-shot skeleton-based action recognition, existing approaches learn a shared latent space for skeleton features and semantic embeddings through modality-specific projection networks. Action recognition datasets are asymmetric, however: skeleton sequences vary from sample to sample while the class label stays constant, which makes learning a shared latent space difficult. To address this, we introduce SMARTEN, an adversarial feature disentanglement method that separates semantic-related and semantic-unrelated latent variables in skeleton features so that the semantic-related part aligns better with semantic embeddings. Using modality-specific variational autoencoders (VAEs) with a cross-reconstruction loss, SMARTEN aligns the semantic-related skeleton features with the semantic embeddings. Our approach sets new benchmarks in zero-shot and generalized zero-shot action recognition, with significant improvements over state-of-the-art methods on NTU RGB+D 60, NTU RGB+D 120, and FineGym 99.
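The abstract names two ingredients, modality-specific VAEs and a cross-reconstruction loss, without giving formulas. The following is a minimal sketch of how such a loss could be computed for one sample, not the thesis's implementation. Everything in it is an assumption made for illustration: the single-layer linear encoders and decoders, the dimensions, the choice to decode skeleton features from the text latent concatenated with the semantic-unrelated latent, and all function and parameter names. The adversarial disentanglement term is omitted.

```python
import math
import random

def linear(weights, bias, x):
    """Affine map; weights is a list of rows, x a list of floats."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_linear(rng, n_out, n_in):
    """Small random weights and zero bias (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def reparameterize(rng, mu, logvar):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cross_reconstruction_loss(rng, params, skeleton_feat, text_emb, d_sem):
    # Skeleton encoder: the first half of the output is the mean, the second
    # half the log-variance, over (semantic-related + semantic-unrelated) dims.
    s_stats = linear(*params["enc_skel"], skeleton_feat)
    half = len(s_stats) // 2
    s_mu, s_logvar = s_stats[:half], s_stats[half:]
    z_skel = reparameterize(rng, s_mu, s_logvar)
    z_sem, z_unrel = z_skel[:d_sem], z_skel[d_sem:]

    # Text encoder: latent has only the d_sem semantic dimensions.
    t_stats = linear(*params["enc_text"], text_emb)
    t_mu, t_logvar = t_stats[:d_sem], t_stats[d_sem:]
    z_text = reparameterize(rng, t_mu, t_logvar)

    # Cross-reconstruction: each decoder receives the other modality's
    # semantic latent (assumed design; the skeleton decoder also gets z_unrel).
    skel_from_text = linear(*params["dec_skel"], z_text + z_unrel)
    text_from_skel = linear(*params["dec_text"], z_sem)
    cross = mse(skel_from_text, skeleton_feat) + mse(text_from_skel, text_emb)
    kl = kl_to_standard_normal(s_mu, s_logvar) + kl_to_standard_normal(t_mu, t_logvar)
    return cross, kl

# Toy usage with made-up dimensions and random inputs.
rng = random.Random(0)
skel_dim, text_dim, d_sem, d_unrel = 8, 6, 3, 2
params = {
    "enc_skel": init_linear(rng, 2 * (d_sem + d_unrel), skel_dim),
    "enc_text": init_linear(rng, 2 * d_sem, text_dim),
    "dec_skel": init_linear(rng, skel_dim, d_sem + d_unrel),
    "dec_text": init_linear(rng, text_dim, d_sem),
}
skeleton_feat = [rng.uniform(-1, 1) for _ in range(skel_dim)]
text_emb = [rng.uniform(-1, 1) for _ in range(text_dim)]
cross, kl = cross_reconstruction_loss(rng, params, skeleton_feat, text_emb, d_sem)
print(cross, kl)
```

In a trained model these two terms would be weighted and minimized jointly with the per-modality reconstruction losses; the weights and the exact combination used in the thesis are not stated in this record.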
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95986
DOI: 10.6342/NTU202401280
Fulltext Rights: Authorized (open access worldwide)
Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
File: ntu-113-1.pdf
Size: 3.08 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
