NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84076
Full metadata record (DC field: value)
dc.contributor.advisor: 管希聖 (Hsi-Sheng Goan)
dc.contributor.author: Jia-Hui Lin (林家暉)
dc.date.accessioned: 2023-03-19T22:04:30Z
dc.date.copyright: 2022-10-20
dc.date.issued: 2022
dc.date.submitted: 2022-09-29
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84076
dc.description.abstract: Implementing quantum gates with high precision and robustness is a prerequisite both for realizing reliable circuit depths on noisy intermediate-scale quantum (NISQ) devices and for achieving error-corrected, fault-tolerant quantum computation, since the error-correction operations performed by quantum gates should not themselves introduce more errors. Recently, emerging reinforcement learning (RL) techniques combined with deep learning (DL) have shown great promise in optimal control. A quantum control problem can be mapped into the RL framework, allowing a control policy to be generated by discovering and exploiting physical mechanisms we are not yet aware of. In this thesis, we use deep neural networks and adopt proximal policy optimization (PPO), an RL algorithm widely applied to control problems, to implement high-fidelity quantum gates. In contrast to traditional optimal control methods, this machine learning (ML) approach lets an intelligent agent achieve precise and efficient quantum gate control through an iterative learning process. This thesis attempts to pave the way toward investigating large-scale, noise-resistant quantum computation with deep reinforcement learning techniques.
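
To make the setup the abstract describes concrete, here is a minimal sketch of piecewise-constant gate control (cf. Section 5.1.1 in the table of contents below). It is an illustration under simplifying assumptions, not the thesis's actual Hamiltonian, gate set, or reward design: a single qubit is driven by a fixed number of constant-amplitude σx pulse segments, and the quantity an RL agent would receive as terminal reward is the fidelity to a target X gate. The names N_SEGMENTS, DT, and episode_fidelity are hypothetical.

```python
# Minimal sketch: piecewise-constant control of a single qubit toward an X gate.
# Each episode applies N_SEGMENTS constant-amplitude pulse segments; the
# terminal reward an RL agent would see is the gate fidelity computed here.
import numpy as np
from scipy.linalg import expm

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli X
SZ = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z

N_SEGMENTS = 20   # pulse segments per episode (hypothetical value)
DT = 0.05         # duration of each segment, hbar = 1 (hypothetical value)
U_TARGET = SX     # target gate: X

def episode_fidelity(amplitudes, detuning=0.0):
    """Evolve under H_k = (detuning/2)*SZ + (a_k/2)*SX for each segment k,
    then return the gate fidelity |Tr(U_target^dag U)|^2 / d^2."""
    U = I2
    for a_k in amplitudes:
        H_k = 0.5 * detuning * SZ + 0.5 * a_k * SX
        U = expm(-1j * H_k * DT) @ U   # segment-wise constant propagator
    return abs(np.trace(U_TARGET.conj().T @ U)) ** 2 / 2 ** 2

# Sanity check: a constant pulse with total area pi implements X up to a
# global phase, so the fidelity is ~1 at zero detuning.
pulses = np.full(N_SEGMENTS, np.pi / (N_SEGMENTS * DT))
print(f"fidelity = {episode_fidelity(pulses):.6f}")
```

In the thesis's formulation, a policy network would output the segment amplitudes one action at a time and PPO would update the network from the resulting rewards; the sketch above only evaluates one fixed pulse sequence.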
dc.description.provenance: Made available in DSpace on 2023-03-19T22:04:30Z (GMT). No. of bitstreams: 1. U0001-1012202123251300.pdf: 7244518 bytes, checksum: 861521ea11d35ad840278b33a669505a (MD5). Previous issue date: 2022.
dc.description.tableofcontents:
摘要 (Abstract, in Chinese) .......... I
Abstract .......... II
List of Figures .......... VI
List of Tables .......... IX
1 Introduction .......... 1
2 Quantum computing .......... 4
2.1 Introduction to quantum information .......... 4
2.2 Classical and quantum bits .......... 5
2.3 Classical and quantum logic gates .......... 8
2.3.1 Single-qubit gates .......... 9
2.3.2 Controlled gates .......... 10
2.4 Quantum control .......... 10
3 Deep learning .......... 12
3.1 Machine learning .......... 12
3.2 Deep learning and neural networks .......... 13
3.2.1 Artificial neural networks .......... 13
3.2.2 Feed-forward neural networks .......... 15
3.2.3 Loss function and gradient descent .......... 16
3.2.4 Backpropagation .......... 18
3.2.5 Activation functions .......... 19
3.3 Gradient descent tricks .......... 22
3.3.1 Stochastic gradient descent .......... 22
3.3.2 Gradient descent optimization algorithms .......... 25
3.4 Universal approximation theorem .......... 29
4 Reinforcement learning .......... 31
4.1 The fundamental concepts of reinforcement learning .......... 31
4.1.1 Markov decision process .......... 31
4.1.2 Value function .......... 34
4.1.3 Bellman equation .......... 35
4.1.4 Advantage function .......... 37
4.2 Estimations of value function and advantage function .......... 38
4.2.1 Monte Carlo method .......... 39
4.2.2 Temporal difference method .......... 39
4.2.3 The λ-return method .......... 40
4.2.4 Generalized advantage estimation .......... 41
4.3 Policy optimization .......... 44
4.3.1 Policy gradient .......... 44
4.3.2 Baseline in policy gradient .......... 46
4.3.3 Actor-critic method .......... 49
4.3.4 On-policy and off-policy learning .......... 50
4.4 Proximal policy optimization algorithm .......... 52
4.4.1 Importance sampling .......... 52
4.4.2 Proximal policy optimization .......... 53
4.5 Deep reinforcement learning .......... 57
5 Simulation results of quantum gates via deep reinforcement learning .......... 58
5.1 Simulation of quantum gates .......... 58
5.1.1 Piecewise constant control approach .......... 58
5.1.2 Reinforcement learning modeling .......... 59
5.2 Numerical simulation results .......... 63
5.2.1 X-gate .......... 63
5.2.2 Hadamard gate .......... 65
5.2.3 CNOT gate .......... 68
5.2.4 Silicon quantum dots .......... 71
5.3 Discussion .......... 77
6 Conclusion .......... 80
A Exponential weighted moving average .......... 82
A.1 EWMA formula .......... 82
A.2 Bias correction .......... 83
Bibliography .......... 85
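
Two items the table of contents points to are compact enough to state directly in code. First, the clipped surrogate objective of Section 4.4.2 is the heart of PPO. The sketch below is a generic rendition of the standard PPO-clip loss from Schulman et al. (2017), not the thesis's implementation; the batch layout and the clipping constant EPS = 0.2 are illustrative assumptions.

```python
# Generic PPO clipped surrogate objective (to be maximized):
#   L_clip = E[ min(r_t * A_t, clip(r_t, 1 - eps, 1 + eps) * A_t) ],
# where r_t = pi_theta(a_t|s_t) / pi_theta_old(a_t|s_t) is the importance ratio.
import numpy as np

EPS = 0.2  # clipping range suggested by Schulman et al. (2017)

def ppo_clip_objective(logp_new, logp_old, advantages):
    """Batch-averaged clipped surrogate. logp_new/logp_old are the
    log-probabilities of the taken actions under the current and the
    data-collecting policy; advantages are estimates of A_t (e.g., from
    generalized advantage estimation, Section 4.2.4)."""
    ratio = np.exp(logp_new - logp_old)            # r_t, stable in log space
    clipped = np.clip(ratio, 1.0 - EPS, 1.0 + EPS) * advantages
    return np.mean(np.minimum(ratio * advantages, clipped))

# A positive-advantage action whose ratio exceeds 1 + eps gains nothing
# beyond the clip point; that is what keeps each policy update "proximal".
print(ppo_clip_objective(np.log([1.5]), np.log([1.0]), np.array([2.0])))  # 2.4, not 3.0
```

Second, Appendix A's exponentially weighted moving average with bias correction (the same correction Adam applies to its moment estimates) has a short textbook form: with v_0 = 0, early estimates are biased toward zero, and dividing by 1 - beta^t removes exactly that bias.

```python
def ewma_bias_corrected(xs, beta=0.9):
    """Return the bias-corrected EWMA sequence v_t / (1 - beta**t), where
    v_t = beta * v_{t-1} + (1 - beta) * x_t and v_0 = 0."""
    v, out = 0.0, []
    for t, x in enumerate(xs, start=1):
        v = beta * v + (1.0 - beta) * x
        out.append(v / (1.0 - beta ** t))  # rescale the zero-initialized estimate
    return out

print(ewma_bias_corrected([1.0, 1.0, 1.0]))  # [1.0, 1.0, 1.0] once corrected
```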
dc.language.iso: en
dc.subject: Quantum gate control
dc.subject: Reinforcement learning
dc.subject: Machine learning
dc.subject: Neural networks
dc.subject: Proximal policy optimization
dc.title: Simulation of Quantum Gate Control via Proximal Policy Optimization Algorithm (以近似策略最佳化演算法模擬量子閘控制)
dc.type: Thesis
dc.date.schoolyear: 110-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 張慶瑞 (Ching-Ray Chang), 江介宏 (Jie-Hong Roland Jiang), 林俊達 (Guin-Dar Lin)
dc.subject.keyword: Reinforcement learning, Machine learning, Neural networks, Proximal policy optimization, Quantum gate control
dc.relation.page: 89
dc.identifier.doi: 10.6342/NTU202104530
dc.rights.note: Authorized (access restricted to campus)
dc.date.accepted: 2022-09-30
dc.contributor.author-college: 理學院 (College of Science)
dc.contributor.author-dept: 物理學研究所 (Graduate Institute of Physics)
dc.date.embargo-lift: 2022-10-20
Appears in Collections: Department of Physics (物理學系)

Files in this item:
File: U0001-1012202123251300.pdf (7.07 MB, Adobe PDF)
Access: restricted to NTU campus IP addresses (off-campus users can connect via the library's VPN service)


Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.
