Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62249
Full metadata record
dc.contributor.advisor: 陳炳宇
dc.contributor.author: Szu-Chiao Hao (en)
dc.contributor.author: 郝思喬 (zh_TW)
dc.date.accessioned: 2021-06-16T13:36:25Z
dc.date.available: 2025-09-22
dc.date.copyright: 2020-09-23
dc.date.issued: 2020
dc.date.submitted: 2020-06-16
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62249
dc.description.abstract: With the recent emergence of deep learning techniques, human motions such as walking, dressing, and playing basketball can be quickly distilled into learned rules. The piano is one of the world's best-known instruments, yet the skill of playing it is extremely difficult even for humans. Playing the piano involves reading sheet music, finger articulation, fingering order, and other demanding techniques that are hard to organize into a single principle, so learning piano skills with reinforcement learning is highly challenging.
This thesis focuses on how to transform, step by step, an environment known to be trainable with deep reinforcement learning into one in which a single hand plays the piano. It proposes three key techniques: how to convert a working environment into the environment we want; how to translate piano fingering into finger positions on the keyboard; and how to combine piano-key information with finger-position information so that a deep reinforcement learning network can learn from both.
In the end, the fingers successfully press single notes on the piano with the aid of the given fingering. We then analyze the accuracy of pressing black keys and white keys, examine the limitations, and discuss follow-up possibilities. (zh_TW)
dc.description.abstract: The recent emergence of deep learning has made it possible to distill human motions such as walking, dressing, and playing basketball into learned skills. The piano is one of the most popular instruments, and playing it is very hard even for a human: it involves reading sheet music, articulating the fingers, scheduling fingerings, and other difficult techniques, which makes piano skills hard to reduce to an explicit rule. Using reinforcement learning to learn piano skills is therefore a challenging problem and a worthwhile contribution to the field.
This thesis shows how to transform, step by step, a task known to be trainable with deep reinforcement learning into the task of playing single notes with a robot hand. We present three key methods: how to transform a working task into our desired task; how to interpret piano fingering and convert it into finger positions; and how to combine the information of piano keys with the information of finger positions so that a deep reinforcement learning agent can learn from both.
In the end, we successfully make the fingers press the correct note according to the given fingering. We also analyze the accuracy of pressing black piano keys versus white piano keys, characterize the limitations of our methods, and discuss directions for future work. (en)
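The fingering-to-position conversion and the combined key/finger observation described in the abstract can be illustrated with a short sketch. What follows is a minimal sketch, not the thesis's implementation: the one-octave white-key geometry, the helper names (key_position, fingering_to_positions, build_observation), and the observation layout are all assumptions made for this example.

```python
import numpy as np

# Hypothetical keyboard geometry: one octave of white keys starting at C.
# A uniform key width is an assumption made only for this sketch.
WHITE_KEY_WIDTH = 0.023  # meters
WHITE_KEYS = {"C": 0, "D": 1, "E": 2, "F": 3, "G": 4, "A": 5, "B": 6}

def key_position(note: str) -> np.ndarray:
    """Map a white-key note name to an (x, y, z) target on the keyboard."""
    return np.array([WHITE_KEYS[note] * WHITE_KEY_WIDTH, 0.0, 0.0])

def fingering_to_positions(notes, fingering):
    """Turn notes and their fingering (1 = thumb .. 5 = pinky) into
    per-finger target positions; fingers without a note get NaN."""
    targets = np.full((5, 3), np.nan)
    for note, finger in zip(notes, fingering):
        targets[finger - 1] = key_position(note)
    return targets

def build_observation(finger_positions, key_heights, targets):
    """Concatenate fingertip positions, current key heights, and the
    fingering-derived targets into one flat state vector for the agent."""
    return np.concatenate([
        finger_positions.ravel(),        # where the fingers are
        key_heights.ravel(),             # how far down each key is pressed
        np.nan_to_num(targets).ravel(),  # where the fingering says to go
    ])

# Example: press C with the thumb (finger 1).
targets = fingering_to_positions(["C"], [1])
obs = build_observation(np.zeros((5, 3)), np.zeros(7), targets)
print(obs.shape)  # (37,) = 5*3 fingertips + 7 keys + 5*3 targets
```

A reinforcement learning agent would consume obs as its state; the state and reward design actually used in the thesis is the subject of Section 5.1 (States and Reward Setting) in the table of contents below.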
dc.description.provenance: Made available in DSpace on 2021-06-16T13:36:25Z (GMT). No. of bitstreams: 1
ntu-109-R06725005-1.pdf: 12829037 bytes, checksum: 7fe56d2adb9c648f1197473d2885ed3a (MD5)
Previous issue date: 2020 (en)
dc.description.tableofcontents:
Oral Examination Committee Certification i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
Chapter 2 Related works 3
2.1 Physically-based Character Control 3
2.2 Deep Reinforcement Learning 4
2.3 Piano Fingering 4
Chapter 3 Overview 6
Chapter 4 Method 8
4.1 Piano Environment 8
4.1.1 Hand Setting 8
4.1.2 Piano Setting 9
4.1.3 Stable Proportional-derivation Controller 10
4.2 Scale Music Generator 11
4.3 Fingering-Position Method 13
4.4 Reinforcement Learning - Soft Actor Critic 15
4.5 Task Simplification 18
4.6 Hindsight Experience Replay 20
Chapter 5 Result 22
5.1 States and Reward Setting 22
5.2 Visualization and Result 24
Chapter 6 Conclusion and Future works 31
Bibliography 34
Appendix 37
Chapter A Experiment Flow 1
A HandReach Extension Experiment 1
A.1 Experiment Setting 2
A.2 Result 4
B HandReach + Piano Keys Position Experiment 5
B.1 Experiment Setting 5
B.2 Result 6
C PianoPlay Key Volume Reward Experiment 11
C.1 Experiment Setting 11
C.2 Result 12
D PianoPlay Key Heights Experiment 16
D.1 Experiment Setting 16
D.2 Result 19
dc.language.iso: en
dc.subject: 鋼琴指法 [piano fingering] (zh_TW)
dc.subject: 機器學習 [machine learning] (zh_TW)
dc.subject: 強化學習 [reinforcement learning] (zh_TW)
dc.subject: reinforcement learning (en)
dc.subject: piano fingering (en)
dc.subject: machine learning (en)
dc.title: 使用深度強化學習技術學習鋼琴技巧 [Learning Piano Skills Using Deep Reinforcement Learning Techniques] (zh_TW)
dc.title: Learning Piano Skills Using Deep Reinforcement Learning (en)
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: 碩士 [Master]
dc.contributor.oralexamcommittee: 朱宏國, 王昱舜
dc.subject.keyword: 機器學習, 強化學習, 鋼琴指法 [machine learning, reinforcement learning, piano fingering] (zh_TW)
dc.subject.keyword: machine learning, reinforcement learning, piano fingering (en)
dc.relation.page: 66
dc.identifier.doi: 10.6342/NTU201900959
dc.rights.note: 有償授權 [paid authorization]
dc.date.accepted: 2020-06-16
dc.contributor.author-college: 管理學院 [College of Management] (zh_TW)
dc.contributor.author-dept: 資訊管理學研究所 [Graduate Institute of Information Management] (zh_TW)
Appears in collections: 資訊管理學系 [Department of Information Management]

Files in this item:
File: ntu-109-1.pdf
Size: 12.53 MB
Format: Adobe PDF
Access: Restricted (not authorized for public access)


Except where specific copyright terms are indicated, all items in this system are protected by copyright, with all rights reserved.
