Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66127
Title: A streamlined encoder/decoder architecture for melody extraction
Authors: Tsung-Han Hsieh (謝宗翰)
Advisor: 李琳山(Lin-shan Lee)
Co-Advisor: 楊奕軒(Yi-Hsuan Yang)
Keyword: melody extraction, encoder/decoder
Publication Year: 2019
Degree: Master's
Abstract: Melody extraction from polyphonic musical audio is an important task in music signal processing. In this thesis, we propose a novel streamlined encoder/decoder network designed for the task. We make two technical contributions. First, drawing inspiration from a state-of-the-art model for semantic pixelwise segmentation, we pass the pooling indices from the pooling layers to the corresponding un-pooling layers to localize the melody in frequency. This achieves results close to the state of the art with far fewer convolutional layers and simpler convolution modules. Second, we propose a way to use the bottleneck layer of the network to estimate whether a melody line is present in each time frame, which makes it possible to obtain the final melody estimate with a simple argmax function instead of ad hoc thresholding. Our experiments on both vocal melody extraction and general melody extraction validate the effectiveness of the proposed model.
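The two ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the function names (max_pool_with_indices, max_unpool, decode_frame), the one-dimensional pooling along frequency, and the way the non-melody score is placed next to the frequency bins before the argmax are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def max_pool_with_indices(frame: List[float], size: int) -> Tuple[List[float], List[int]]:
    """Max-pool one time frame along frequency and remember where each maximum was."""
    pooled, indices = [], []
    for start in range(0, len(frame) - size + 1, size):
        window = frame[start:start + size]
        # Position of the (first) maximum inside this window.
        offset = max(range(size), key=window.__getitem__)
        pooled.append(window[offset])
        indices.append(start + offset)
    return pooled, indices


def max_unpool(pooled: List[float], indices: List[int], length: int) -> List[float]:
    """Put each pooled value back at its remembered frequency bin; zeros elsewhere."""
    restored = [0.0] * length
    for value, index in zip(pooled, indices):
        restored[index] = value
    return restored


def decode_frame(salience: List[float], non_melody_score: float) -> Optional[int]:
    """Argmax over [non-melody, bin 0, bin 1, ...]; None means no melody in this frame.

    Assumption: the non-melody score (hypothetically derived from the bottleneck
    layer) is directly comparable with the per-bin salience values.
    """
    scores = [non_melody_score] + list(salience)
    best = max(range(len(scores)), key=scores.__getitem__)
    return None if best == 0 else best - 1


# Toy frame with 8 frequency bins (values are made up for illustration).
frame = [0.1, 0.9, 0.2, 0.3, 0.0, 0.4, 0.8, 0.7]
pooled, indices = max_pool_with_indices(frame, size=2)
restored = max_unpool(pooled, indices, length=len(frame))

print(indices)                       # [1, 3, 5, 6]
print(decode_frame(restored, 0.5))   # 1 (melody at frequency bin 1)
print(decode_frame(restored, 0.95))  # None (frame judged to have no melody)
```

Because the un-pooling step reuses the stored indices, the restored peak lands on the exact frequency bin it came from, and the extra non-melody score lets a single argmax decide both voicing and pitch without a separate threshold.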
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66127
DOI: 10.6342/NTU202000419
Fulltext Rights: Paid license
Appears in Collections: Data Science Degree Program

Files in This Item:
File: ntu-108-1.pdf (Restricted Access)
Size: 2.07 MB
Format: Adobe PDF