Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96783

| Title: | 用於二元分類的量子自注意力網路 Quantum Self-Attention Networks for Binary Classification |
| Author: | 陳書玟 Shu-Wen Chen |
| Advisor: | 鄭皓中 Hao-Chung Cheng |
| Keywords: | Quantum Computation, Machine Learning, Quantum Machine Learning, Self-Attention Mechanism |
| Publication Year: | 2025 |
| Degree: | Master |
| Abstract: | In recent years, self-attention mechanisms have demonstrated exceptional performance in uncovering data relationships and extracting critical information in machine learning. Beyond delivering strong computational results, attention scores intuitively represent the relationships between tokens in the input sequence, providing interpretability for model decisions. This has made self-attention widely applicable in machine learning tasks such as natural language processing, image recognition, and speech recognition. Because the Hilbert space of a quantum computer matches the high-dimensional inner-product computations at the core of self-attention, quantum computing offers a natural platform for these operations. This thesis proposes a hybrid architecture that integrates the classical self-attention mechanism with quantum properties. Quantum circuits parameterize the weight matrices, generating Query Vectors and Key Vectors as superposition states; their inner products are computed in Hilbert space to quantify the relationships between tokens and produce attention scores. These scores measure the importance of the input information and are used to weight the Value Vectors, completing the attention mechanism and effectively capturing both local and global dependencies. Because the dimensions of the weight matrices are no longer tied to the dimension of the embedding vectors but instead depend on the trainable parameters in the quantum circuits, the architecture reduces the number of model parameters while retaining the core advantages of attention. To evaluate the proposed model, it was applied to binary classification on the Pima Indians Diabetes dataset and to a Netflix movie recommendation task. Experimental results show that the model achieves performance comparable to classical algorithms while significantly reducing the number of parameters. In addition, the attention weight matrices reveal how the two encoding methods capture token relationships differently, providing interpretability: they help identify the importance of each input token to the model's decision, which is beneficial for data analysis. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96783 |
| DOI: | 10.6342/NTU202500080 |
| Full-Text Access: | Not authorized |
| Electronic Full Text Release Date: | N/A |
| Appears in Collections: | Graduate Institute of Communication Engineering |
Files in This Item:

| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (restricted access) | 2.39 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
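
The quantum self-attention idea summarized in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy, not the thesis's actual architecture: it uses a single-qubit RY rotation as the trainable circuit for each of the query and key states, a scalar data-reuploading style encoding, and the state fidelity |⟨q|k⟩|² as the attention score; the function names and parameters (`ry_state`, `quantum_attention`, `theta_q`, `theta_k`, `w_v`) are all hypothetical.

```python
import numpy as np

def ry_state(theta):
    # Hypothetical single-qubit ansatz: RY(theta) applied to |0>
    # yields the real state vector [cos(theta/2), sin(theta/2)].
    return np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])

def quantum_attention(tokens, theta_q, theta_k, w_v):
    """Toy quantum-style self-attention over scalar tokens.

    Each token x is encoded into a query state RY(theta_q * x)|0> and a
    key state RY(theta_k * x)|0>. The attention score between tokens i
    and j is the fidelity |<q_i|k_j>|^2, playing the role of the scaled
    dot product; scores are softmax-normalized and used to weight
    classical value vectors.
    """
    queries = [ry_state(theta_q * x) for x in tokens]
    keys = [ry_state(theta_k * x) for x in tokens]
    values = np.outer(tokens, w_v)  # classical value vectors, one per token

    # Hilbert-space inner products -> attention score matrix.
    scores = np.array([[np.abs(q @ k) ** 2 for k in keys] for q in queries])
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return weights @ values  # attended outputs, one row per token

out = quantum_attention([0.1, 0.5, 0.9], theta_q=1.0, theta_k=0.7,
                        w_v=np.array([1.0, -1.0]))
print(out.shape)  # (3, 2)
```

Note the parameter-count point from the abstract: the trainable parameters here are the two circuit angles, independent of the embedding dimension, whereas a classical attention head would need full query/key projection matrices.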
