Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84076

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 管希聖(Hsi-Sheng Goan) | |
| dc.contributor.author | Jia-Hui Lin | en |
| dc.contributor.author | 林家暉 | zh_TW |
| dc.date.accessioned | 2023-03-19T22:04:30Z | - |
| dc.date.copyright | 2022-10-20 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-09-29 | |
| dc.identifier.citation | D. Z. Albert. On quantum-mechanical automata. Physics Letters A, 98(5-6):249–252, 1983.<br>Z. An and D. Zhou. Deep reinforcement learning for quantum gate control. EPL (Europhysics Letters), 126(6):60002, 2019.<br>Z. An, H.-J. Song, Q.-K. He, and D. Zhou. Quantum optimal control of multilevel dissipative quantum systems with reinforcement learning. Physical Review A, 103(1):012404, 2021.<br>C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.<br>T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.<br>J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio. Attention-based models for speech recognition. Advances in Neural Information Processing Systems, 28, 2015.<br>P.-W. Chou, D. Maturana, and S. Scherer. Improving stochastic policy gradients in continuous control with deep reinforcement learning using the beta distribution. In International Conference on Machine Learning, pages 834–843. PMLR, 2017.<br>J. Clarke and F. K. Wilhelm. Superconducting quantum bits. Nature, 453(7198):1031–1042, 2008.<br>R. C. Deo. Machine learning in medicine. Circulation, 132(20):1920–1930, 2015.<br>D. Deutsch. Quantum theory, the Church–Turing principle and the universal quantum computer. Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences, 400(1818):97–117, 1985.<br>D. E. Deutsch. Quantum computational networks. Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences, 425(1868):73–90, 1989.<br>I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.<br>E. Greensmith, P. L. Bartlett, and J. Baxter. Variance reduction techniques for gradient estimates in reinforcement learning. Journal of Machine Learning Research, 5(9), 2004.<br>T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International Conference on Machine Learning, pages 1861–1870. PMLR, 2018.<br>A. Hentschel and B. C. Sanders. Machine learning for precise quantum measurement. Physical Review Letters, 104(6):063603, 2010.<br>G. Hinton, N. Srivastava, and K. Swersky. Neural networks for machine learning, lecture 6a: Overview of mini-batch gradient descent. 2012.<br>K. Hornik, M. Stinchcombe, and H. White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359–366, 1989.<br>C.-H. Huang, C.-H. Yang, C.-C. Chen, A. S. Dzurak, and H.-S. Goan. High-fidelity and robust two-qubit gates for quantum-dot spin qubits in silicon. Physical Review A, 99(4):042310, 2019.<br>N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen, and S. J. Glaser. Optimal control of coupled spin dynamics: design of NMR pulse sequences by gradient ascent algorithms. Journal of Magnetic Resonance, 172(2):296–305, 2005.<br>D. Kielpinski, C. Monroe, and D. J. Wineland. Architecture for a large-scale ion-trap quantum computer. Nature, 417(6890):709–711, 2002.<br>H. Kimura, S. Kobayashi, et al. An analysis of actor-critic algorithms using eligibility traces: reinforcement learning with imperfect value functions. Journal of Japanese Society for Artificial Intelligence, 15(2):267–275, 2000.<br>D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.<br>J. Kober, J. A. Bagnell, and J. Peters. Reinforcement learning in robotics: A survey. The International Journal of Robotics Research, 32(11):1238–1274, 2013.<br>V. Konda and J. Tsitsiklis. Actor-critic algorithms. Advances in Neural Information Processing Systems, 12, 1999.<br>M. Larocca, P. M. Poggi, and D. A. Wisniacki. Quantum control landscape for a two-level system near the quantum speed limit. Journal of Physics A: Mathematical and Theoretical, 51(38):385305, 2018.<br>D. Loss and D. P. DiVincenzo. Quantum computation with quantum dots. Physical Review A, 57(1):120, 1998.<br>V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.<br>V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, 2010.<br>M. Y. Niu, S. Boixo, V. N. Smelyanskiy, and H. Neven. Universal quantum control through deep reinforcement learning. npj Quantum Information, 5(1):1–8, 2019.<br>J. Patel, S. Shah, P. Thakkar, and K. Kotecha. Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Systems with Applications, 42(1):259–268, 2015.<br>N. Qian. On the momentum term in gradient descent learning algorithms. Neural Networks, 12(1):145–151, 1999.<br>S. Ruder. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747, 2016.<br>D. E. Rumelhart and D. Zipser. Feature discovery by competitive learning. Cognitive Science, 9(1):75–112, 1985.<br>J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz. Trust region policy optimization. In International Conference on Machine Learning, pages 1889–1897. PMLR, 2015.<br>J. Schulman, P. Moritz, S. Levine, M. Jordan, and P. Abbeel. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438, 2015.<br>J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.<br>D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016.<br>R. Sutton. Two problems with back propagation and other steepest descent learning procedures for networks. In Proceedings of the Eighth Annual Conference of the Cognitive Science Society, pages 823–832, 1986.<br>R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.<br>M. Veldhorst, C. Yang, J. Hwang, W. Huang, J. Dehollain, J. Muhonen, S. Simmons, A. Laucht, F. Hudson, K. M. Itoh, et al. A two-qubit logic gate in silicon. Nature, 526(7573):410–414, 2015.<br>A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis. Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience, 2018, 2018.<br>C. J. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3):279–292, 1992.<br>R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3):229–256, 1992.<br>X.-M. Zhang, Z. Wei, R. Asad, X.-C. Yang, and X. Wang. When does reinforcement learning stand out in quantum control? A comparative study on state preparation. npj Quantum Information, 5(1):1–7, 2019. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84076 | - |
| dc.description.abstract | 實現高精度和穩固的量子閘是在具有雜訊的中等規模量子設備中實現可靠的電路深度和達到具有糾錯的容錯量子計算之先決條件,由於量子閘的糾錯操作不應引入過多雜訊。近來,新興的強化學習技術結合深度學習,在最佳控制方面表現出巨大的前景。量子控制問題可以映射到強化學習框架中,以產生控制策略並發現我們所不知道的新物理機制並將其利用。在本文中,我們使用深度神經網路以及採用一種廣泛應用於控制問題的強化學習演算法,稱為近似策略最佳化,以實現高保真度量子閘。與傳統的最佳控制方法相比,此種機器學習方法利用智能體在迭代學習過程中實現精確、高效的量子閘控制。本論文試圖為利用深度強化學習技術研究大型抗噪聲量子計算鋪平道路。 | zh_TW |
| dc.description.abstract | Implementing quantum gates with high precision and robustness is a prerequisite for realizing reliable circuit depth on noisy intermediate-scale quantum (NISQ) devices and for achieving error-corrected, fault-tolerant quantum computation, as the error-correction operations performed by quantum gates should not themselves introduce more errors. Recently, emerging reinforcement learning (RL) techniques combined with deep learning (DL) have shown great promise in optimal control. Quantum control problems can be mapped into a reinforcement learning framework, which allows a control policy to be generated through the discovery and exploitation of new physical mechanisms that we are not aware of. In this thesis, we utilize deep neural networks and adopt a reinforcement learning algorithm widely applied to control problems, named proximal policy optimization (PPO), to implement high-fidelity quantum gates. In contrast to traditional optimal control methods, this machine learning (ML) approach enables an intelligent agent to realize precise and efficient quantum gate control through an iterative learning process. This thesis attempts to pave the way toward investigating large-scale, noise-resistant quantum computation with deep reinforcement learning techniques. (A minimal illustrative sketch of this control setup is given after the metadata table below.) | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T22:04:30Z (GMT). No. of bitstreams: 1 U0001-1012202123251300.pdf: 7244518 bytes, checksum: 861521ea11d35ad840278b33a669505a (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | 摘要 (Abstract in Chinese)<br>Abstract<br>List of Figures<br>List of Tables<br>1 Introduction<br>2 Quantum computing<br>2.1 Introduction to quantum information<br>2.2 Classical and quantum bits<br>2.3 Classical and quantum logic gates<br>2.3.1 Single-qubit gates<br>2.3.2 Controlled gates<br>2.4 Quantum control<br>3 Deep learning<br>3.1 Machine learning<br>3.2 Deep learning and neural networks<br>3.2.1 Artificial neural networks<br>3.2.2 Feed-forward neural networks<br>3.2.3 Loss function and gradient descent<br>3.2.4 Backpropagation<br>3.2.5 Activation functions<br>3.3 Gradient descent tricks<br>3.3.1 Stochastic gradient descent<br>3.3.2 Gradient descent optimization algorithms<br>3.4 Universal approximation theorem<br>4 Reinforcement learning<br>4.1 The fundamental concepts of reinforcement learning<br>4.1.1 Markov decision process<br>4.1.2 Value function<br>4.1.3 Bellman equation<br>4.1.4 Advantage function<br>4.2 Estimations of value function and advantage function<br>4.2.1 Monte Carlo method<br>4.2.2 Temporal difference method<br>4.2.3 The λ-return method<br>4.2.4 Generalized advantage estimation<br>4.3 Policy optimization<br>4.3.1 Policy gradient<br>4.3.2 Baseline in policy gradient<br>4.3.3 Actor-critic method<br>4.3.4 On-policy and off-policy learning<br>4.4 Proximal policy optimization algorithm<br>4.4.1 Importance sampling<br>4.4.2 Proximal policy optimization<br>4.5 Deep reinforcement learning<br>5 Simulation results of quantum gates via deep reinforcement learning<br>5.1 Simulation of quantum gates<br>5.1.1 Piecewise constant control approach<br>5.1.2 Reinforcement learning modeling<br>5.2 Numerical simulation results<br>5.2.1 X-gate<br>5.2.2 Hadamard gate<br>5.2.3 CNOT gate<br>5.2.4 Silicon quantum dots<br>5.3 Discussion<br>6 Conclusion<br>A Exponential weighted moving average<br>A.1 EWMA formula<br>A.2 Bias correction<br>Bibliography | |
| dc.language.iso | en | |
| dc.subject | 量子閘控制 | zh_TW |
| dc.subject | 強化學習 | zh_TW |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | 神經網路 | zh_TW |
| dc.subject | 近似策略最佳化 | zh_TW |
| dc.subject | Neural networks | en |
| dc.subject | Proximal policy optimization | en |
| dc.subject | Reinforcement learning | en |
| dc.subject | Quantum gate control | en |
| dc.subject | Machine learning | en |
| dc.title | 以近似策略最佳化演算法模擬量子閘控制 | zh_TW |
| dc.title | Simulation of Quantum Gate Control via Proximal Policy Optimization Algorithm | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | 碩士 (Master's) | |
| dc.contributor.oralexamcommittee | 張慶瑞(Ching-Ray Chang),江介宏(Jie-Hong Roland Jiang),林俊達(Guin-Dar Lin) | |
| dc.subject.keyword | 強化學習, 機器學習, 神經網路, 近似策略最佳化, 量子閘控制 | zh_TW |
| dc.subject.keyword | Reinforcement learning, Machine learning, Neural networks, Proximal policy optimization, Quantum gate control | en |
| dc.relation.page | 89 | |
| dc.identifier.doi | 10.6342/NTU202104530 | |
| dc.rights.note | 同意授權(限校園內公開) (Authorized; access restricted to campus) | |
| dc.date.accepted | 2022-09-30 | |
| dc.contributor.author-college | 理學院 (College of Science) | zh_TW |
| dc.contributor.author-dept | 物理學研究所 (Graduate Institute of Physics) | zh_TW |
| dc.date.embargo-lift | 2022-10-20 | - |
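
The abstract above outlines the approach at a high level: gate synthesis is cast as an RL episode in which an agent emits piecewise-constant control pulses, receives the resulting gate fidelity as the reward, and is trained with PPO's clipped surrogate objective. The following minimal Python sketch (illustrative only, not the thesis code; the drift-free single-qubit Hamiltonian, the 20-segment pulse discretization, and the clipping parameter eps = 0.2 are assumptions made here for brevity) shows these two core ingredients for an X gate:

```python
# Illustrative sketch, assuming H(t) = a_k * X on each segment (not the thesis code).
import numpy as np

# Pauli matrices
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)

def evolve(amplitudes, dt=0.1):
    """Propagate U under H(t) = a_k * X, constant on each segment of length dt."""
    U = I2.copy()
    for a in amplitudes:
        H = a * X
        w, V = np.linalg.eigh(H)  # H is Hermitian, so eigendecomposition applies
        step = V @ np.diag(np.exp(-1j * w * dt)) @ V.conj().T  # exp(-i H dt)
        U = step @ U
    return U

def gate_fidelity(U, U_target):
    """Phase-insensitive fidelity |Tr(U_target^dag U)|^2 / d^2, used as the reward."""
    d = U.shape[0]
    return abs(np.trace(U_target.conj().T @ U)) ** 2 / d ** 2

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate min(r*A, clip(r, 1-eps, 1+eps)*A), to be maximized."""
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)

# One episode: 20 equal-amplitude segments whose total pulse area is pi/2,
# so U = exp(-i (pi/2) X) = -iX, an X gate up to a global phase.
n_seg, dt = 20, 0.1
amps = np.full(n_seg, np.pi / (2 * n_seg * dt))
reward = gate_fidelity(evolve(amps, dt), X)
print(f"fidelity reward: {reward:.6f}")  # ~1.000000
```

In the thesis setting, the segment amplitudes would be sampled from a neural-network policy and updated by maximizing the clipped objective over collected episodes; here a fixed analytic pulse is used only to make the fidelity computation concrete.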
Appears in Collections: 物理學系 (Department of Physics)
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-1012202123251300.pdf (access restricted to NTU campus IP; use the VPN service for off-campus access) | 7.07 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
