Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18947
Title: Action Recognition Using Convolutional Neural Network (利用捲積神經網路進行動作辨識)
Authors: Yu-Cheng Liu (劉又誠)
Advisor: Jian-Jiun Ding (丁建均)
Keyword: action recognition, deep learning, convolutional neural network, long short-term memory, 3-D convolutional kernel
Publication Year: 2016
Degree: Master's
Abstract: Multimedia plays an important role in daily life. Hundreds of thousands of videos are uploaded to the Internet, and popular topics such as basketball and baseball games attract very high click-through rates, so information retrieval techniques are becoming increasingly important.
Human action recognition can be further applied to abnormal event detection and to the analysis of human activity. The dataset used in our experiments contains human body motions and human-object interactions, such as jumping, clapping, and drinking.
In this thesis, we first train a convolutional neural network (CNN) model and use it to extract features from the training and testing videos. We then use the temporal relationships among the features of the same video clip to train a three-layer long short-term memory (LSTM) model. Finally, we take the output of the last LSTM layer at the last time step as the feature of the whole test video and use it for classification. In testing, the accuracy of our model is higher than that of some methods proposed in recent years.
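The pipeline described in the abstract (per-frame features, a three-layer LSTM, and the last layer's state at the last time step as the video descriptor) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the weights are random and untrained, the per-frame vectors are random stand-ins for CNN features, and every name and size (`init_layer`, `video_descriptor`, a feature dimension of 8, a hidden size of 4, 3 classes) is a hypothetical choice made only for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(input_size, hidden_size, rng):
    """Random (untrained) weights for one LSTM layer with 4 gates."""
    cols = input_size + hidden_size
    return {
        "W": [[[rng.uniform(-0.1, 0.1) for _ in range(cols)]
               for _ in range(hidden_size)] for _ in range(4)],
        "b": [[0.0] * hidden_size for _ in range(4)],
        "hidden_size": hidden_size,
    }


def lstm_step(layer, x, h, c):
    """One LSTM time step: returns the new hidden and cell states."""
    z = x + h  # concatenate input and previous hidden state

    def gate(k):
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(layer["W"][k], layer["b"][k])]

    i = [sigmoid(v) for v in gate(0)]    # input gate
    f = [sigmoid(v) for v in gate(1)]    # forget gate
    o = [sigmoid(v) for v in gate(2)]    # output gate
    g = [math.tanh(v) for v in gate(3)]  # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def video_descriptor(layers, frame_features):
    """Run the stacked LSTM over all frames of one video and return the
    hidden state of the last layer at the last time step."""
    states = [([0.0] * l["hidden_size"], [0.0] * l["hidden_size"])
              for l in layers]
    for x in frame_features:
        layer_input = x
        for idx, layer in enumerate(layers):
            h, c = lstm_step(layer, layer_input, *states[idx])
            states[idx] = (h, c)
            layer_input = h  # output of this layer feeds the next one
    return states[-1][0]


def classify(descriptor, class_weights):
    """Linear scores over the descriptor; the arg-max is the predicted class."""
    scores = [sum(w * v for w, v in zip(row, descriptor))
              for row in class_weights]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, scores


# Hypothetical sizes, chosen only for illustration.
rng = random.Random(0)
feature_dim, hidden_size, num_classes, num_frames = 8, 4, 3, 5

# Three stacked LSTM layers, as in the abstract.
layers = [init_layer(feature_dim, hidden_size, rng),
          init_layer(hidden_size, hidden_size, rng),
          init_layer(hidden_size, hidden_size, rng)]

# Random vectors standing in for per-frame CNN features of one video.
frame_features = [[rng.uniform(-1.0, 1.0) for _ in range(feature_dim)]
                  for _ in range(num_frames)]
class_weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                 for _ in range(num_classes)]

descriptor = video_descriptor(layers, frame_features)
label, scores = classify(descriptor, class_weights)
```

Using only the final time step of the top layer keeps the descriptor at a fixed length regardless of how many frames a video has, which is what makes a single classifier applicable to clips of different durations.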
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18947
DOI: 10.6342/NTU201602543
Fulltext Rights: Not authorized
Appears in Collections: Graduate Institute of Communication Engineering (電信工程學研究所)

Files in This Item:
File: ntu-105-1.pdf (Restricted Access)
Size: 3.88 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
