Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96783

Full metadata record
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 鄭皓中 | zh_TW |
| dc.contributor.advisor | Hao-Chung Cheng | en |
| dc.contributor.author | 陳書玟 | zh_TW |
| dc.contributor.author | Shu-Wen Chen | en |
| dc.date.accessioned | 2025-02-21T16:31:59Z | - |
| dc.date.available | 2025-02-22 | - |
| dc.date.copyright | 2025-02-21 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-01-13 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96783 | - |
| dc.description.abstract | 在近年機器學習領域,自注意力機制在發掘數據關聯性和提取重要訊息方面皆有極佳表現。在提供良好計算效果的同時,注意力分數能直觀地呈現輸入序列中 Token 間的關聯性,提供模型決策的可解釋性。使其在自然語言處理、圖像辨識和語音識別等多種機器學習任務中被廣泛應用。 由於量子電腦的希爾伯特空間與自注意力機制中的高維內積計算契合,適合作為其運算平台。因此本研究提出一種結合古典自注意力機制與量子特性的模型架構,利用量子電路調控權重矩陣,生成疊加態的查詢向量(Query Vector)與鍵向量(Key Vector),並在希爾伯特空間中計算內積以量化 Token 間的關聯性生成注意力分數。該分數衡量輸入資訊的重要性,進一步加權值向量(Value Vector)完成注意力機制,能有效捕捉局部與全局聯繫。由於權重矩陣的維度不再與嵌入向量的維度相關,而是取決於量子電路中的可訓練參數,該架構在降低模型參數數量的同時保留了注意力機制的核心優勢。為驗證模型效果,我們將其應用於 Pima Indians Diabetes 二元分類任務和 Netflix 電影推薦系統。實驗結果顯示,該模型在參數降低的同時性能可媲美古典演算法,並且在注意力權重矩陣中展現出兩種編碼方式對於 Token 間的關聯性的不同表現,為模型提供了可解釋性,能幫助理解每個輸入 Token 對模型決策的重要程度,有利於資料分析。 | zh_TW |
| dc.description.abstract | In recent years, self-attention mechanisms have demonstrated exceptional performance in uncovering data relationships and extracting critical information within the field of machine learning. In addition to delivering strong computational results, attention scores intuitively represent the relationships between tokens in the input sequence, providing interpretability for model decisions. This makes self-attention widely applicable in various machine learning tasks, such as natural language processing, image recognition, and speech recognition. Given the compatibility between the Hilbert space of quantum computers and the high-dimensional inner product calculations in self-attention mechanisms, quantum computing offers a suitable platform for such computations. We propose a hybrid model architecture that integrates classical self-attention mechanisms with quantum properties. Leveraging quantum circuits to adjust weight matrices, the model generates superposition states of Query Vectors and Key Vectors. It computes their inner products in Hilbert space to quantify the relationships between tokens and generate attention scores. These scores measure the importance of input information, further weighting the Value Vectors to complete the attention mechanism, effectively capturing both local and global dependencies. As the dimensions of the weight matrices are no longer tied to the dimensions of the embedding vectors but instead depend on the trainable parameters within the quantum circuits, this architecture reduces the number of model parameters while retaining the core advantages of the attention mechanism. To evaluate the performance of the proposed model, it was applied to the binary classification task of the Pima Indians Diabetes dataset and the Netflix movie recommendation system. Experimental results indicate that the model achieves performance comparable to classical algorithms while significantly reducing the number of parameters. Additionally, the attention weight matrix reveals different patterns between the two encoding methods in capturing the relationships between tokens, providing interpretability for the model. This helps in understanding the importance of each input token in the model's decision-making process, which is beneficial for data analysis. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-21T16:31:59Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-02-21T16:31:59Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements i 摘要 ii Abstract iii Contents v List of Figures viii List of Tables x Chapter 1 Introduction 1 Chapter 2 Machine Learning 3 2.1 Machine Learning Algorithms 3 2.1.1 Support Vector Machine 4 2.1.2 Random Forest 4 2.2 Artificial Neural Network 5 2.2.1 Neurons and Layers 5 2.2.2 Learning Process 6 2.3 Transformer 7 2.3.1 Attention Mechanism 7 2.3.2 Positional Encoding 10 2.3.3 Classification 11 Chapter 3 Quantum Computation 13 3.1 Quantum State 13 3.2 Quantum Gate 15 3.3 Variational Quantum Circuit 20 3.4 Encoding Methods 21 3.4.1 Angle Encoding 21 3.4.2 Amplitude Encoding 22 3.5 Swap Test 23 3.6 Noisy Intermediate-Scale Quantum (NISQ) 24 Chapter 4 Related Works 26 4.1 Classical Self-Attention Neural Networks 26 4.2 Quantum Self-Attention Neural Networks 27 4.3 Quantum Classification 29 Chapter 5 Method 30 5.1 Hybrid Self-Attention Network 30 5.2 Embedding 32 5.2.1 Angle Encoding Method 33 5.2.2 Amplitude Encoding Method 34 5.2.3 Positional Encoding 35 5.3 Attention 37 5.3.1 Inner Product 37 5.3.2 Scaling Attention 39 Chapter 6 Result and Discussion 41 6.1 Pima Indians Diabetes Experiment 41 6.2 Netflix Prize Dataset Experiment 45 6.3 Discussion 49 6.3.1 Performance 49 6.3.2 Model Parameters 50 6.3.3 Model Explainability 52 6.3.4 Real Device 57 Chapter 7 Conclusion 59 References 61 Appendix A — Real Device Setting 67 A.1 Tasks on Real Device 67 | - |
| dc.language.iso | en | - |
| dc.subject | 量子計算 | zh_TW |
| dc.subject | 自注意力機制 | zh_TW |
| dc.subject | 量子機器學習 | zh_TW |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | Self-Attention Mechanism | en |
| dc.subject | Quantum Computation | en |
| dc.subject | Machine Learning | en |
| dc.subject | Quantum Machine Learning | en |
| dc.title | 用於二元分類的量子自注意力網路 | zh_TW |
| dc.title | Quantum Self Attention Networks for Binary classification | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 管希聖;張慶瑞 | zh_TW |
| dc.contributor.oralexamcommittee | Hsi-Sheng Goan;Ching-Ray Chang | en |
| dc.subject.keyword | 量子計算,機器學習,量子機器學習,自注意力機制 | zh_TW |
| dc.subject.keyword | Quantum Computation, Machine Learning, Quantum Machine Learning, Self-Attention Mechanism | en |
| dc.relation.page | 70 | - |
| dc.identifier.doi | 10.6342/NTU202500080 | - |
| dc.rights.note | Not authorized | - |
| dc.date.accepted | 2025-01-14 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 電信工程學研究所 | - |
| dc.date.embargo-lift | N/A | - |
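
The abstract above describes encoding tokens into quantum states, computing query–key inner products in Hilbert space, and using the resulting attention scores to weight classical value vectors, with the parameter count set by the circuit rather than the embedding dimension. Below is a minimal sketch of that idea, assuming PennyLane; the circuit shapes, parameter names, and the use of a swap test for the overlap are illustrative assumptions drawn from the techniques the table of contents names (angle/amplitude encoding, swap test), not a reproduction of the thesis's actual architecture.

```python
# Minimal sketch of a hybrid quantum self-attention score (an assumption, not
# the thesis's actual circuits): angle-encode each token, apply trainable
# rotations as the "weight matrices", estimate |<q|k>|^2 with a swap test,
# then softmax the scores and weight classical value vectors.
import numpy as np
import pennylane as qml

n_qubits = 2                                  # qubits per token register (assumption)
dev = qml.device("default.qubit", wires=2 * n_qubits + 1)
ANC = 2 * n_qubits                            # ancilla wire for the swap test

@qml.qnode(dev)
def attention_score(x_q, x_k, params_q, params_k):
    for i in range(n_qubits):
        qml.RY(x_q[i], wires=i)               # angle encoding of the "query" token
        qml.RY(x_k[i], wires=n_qubits + i)    # angle encoding of the "key" token
        # (amplitude encoding would use qml.AmplitudeEmbedding instead)
    for i in range(n_qubits):
        qml.Rot(*params_q[i], wires=i)             # trainable query rotation
        qml.Rot(*params_k[i], wires=n_qubits + i)  # trainable key rotation
    qml.Hadamard(wires=ANC)                   # swap test: <Z_anc> = |<q|k>|^2
    for i in range(n_qubits):
        qml.CSWAP(wires=[ANC, i, n_qubits + i])
    qml.Hadamard(wires=ANC)
    return qml.expval(qml.PauliZ(ANC))

def hybrid_attention(tokens, params_q, params_k, W_v):
    """tokens: (T, n_qubits) features; W_v: classical value projection."""
    T = len(tokens)
    scores = np.array([[attention_score(tokens[i], tokens[j], params_q, params_k)
                        for j in range(T)] for i in range(T)])
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row softmax
    return weights @ (tokens @ W_v)           # attention-weighted value vectors

# Toy usage with random parameters (all shapes are assumptions):
rng = np.random.default_rng(0)
out = hybrid_attention(rng.normal(size=(3, n_qubits)),
                       rng.normal(size=(n_qubits, 3)),
                       rng.normal(size=(n_qubits, 3)),
                       rng.normal(size=(n_qubits, n_qubits)))
print(out.shape)                              # (3, n_qubits)
```

Note that in this sketch the trainable "projection" has 3 angles per qubit regardless of the token embedding dimension, which mirrors the abstract's claim that the weight-matrix size depends on the circuit's trainable parameters rather than on the embedding vectors.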
Appears in Collections: 電信工程學研究所
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (not authorized for public access) | 2.39 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.