Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101545

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 吳沛遠 | zh_TW |
| dc.contributor.advisor | Pei-Yuan Wu | en |
| dc.contributor.author | 林奕安 | zh_TW |
| dc.contributor.author | Yi-An Lin | en |
| dc.date.accessioned | 2026-02-11T16:15:00Z | - |
| dc.date.available | 2026-02-12 | - |
| dc.date.copyright | 2026-02-11 | - |
| dc.date.issued | 2026 | - |
| dc.date.submitted | 2026-02-02 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101545 | - |
| dc.description.abstract | 多方安全計算(Secure Multi-Party Computation, SMPC)可在不洩漏資料內容的情況下進行隱私保護推論,然而在 SMPC 設定下,針對 Transformer decoder 模型進行長序列生成的研究仍相當有限且缺乏系統性探討。本文提出 SecLoT,一個旨在提升安全 Transformer decoder 推論效率的框架,透過同時解決演算法層面與系統層面的限制,推進長序列生成在 SMPC 環境中的可行性。我們將固定矩陣大小的 causal mask self-attention 機制引入 SMPC 設定中,使自回歸解碼過程得以在不依賴安全 softmax 的情況下進行,並達成每個生成 token 幾乎固定的計算時間與通訊成本。此外,我們擴充 CrypTen 編譯器以支援更複雜的模型架構,提出無輸入範圍限制且高效率的除法與倒數平方根演算法,並設計一個具正確性保證的截斷協定,以消除既有實作中可能發生的隨機錯誤。在基於 decoder-only Transformer 的 MNIST 圖像生成實驗中,結果顯示 SecLoT 相較於以 softmax 為基礎的方法,能有效降低計算時間與通訊開銷,且此優勢隨著生成序列長度的增加而更加明顯。 | zh_TW |
| dc.description.abstract | Secure Multi-Party Computation (SMPC) enables privacy-preserving inference, yet long-sequence generation with Transformer decoder models has not been systematically explored under SMPC. In this work, we present SecLoT, a framework that advances secure Transformer decoder inference by addressing both algorithmic inefficiencies and system-level limitations. We adapt a fixed-size causal mask self-attention formulation to the SMPC setting, enabling autoregressive decoding without secure softmax and achieving nearly constant computation time and communication cost per generated token. In addition, we extend the CrypTen compiler to support more complex model architectures, propose domain-unrestricted and efficient algorithms for division and inverse square root, and introduce a correct truncation protocol that eliminates probabilistic errors in existing implementations. Experiments on decoder-only Transformer-based MNIST image generation demonstrate that SecLoT significantly reduces computation time and communication overhead compared to softmax-based baselines, particularly as sequence length increases. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2026-02-11T16:15:00Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2026-02-11T16:15:00Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | 口試委員審定書 i
Acknowledgements iii
摘要 v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
Chapter 1 Introduction 1
1.1 Contribution 3
Chapter 2 Related Works 7
2.1 PPML Developer Friendly Framework 7
2.2 Secure Transformer Inference 8
2.2.1 Nonlinear Operator Replacement 9
2.2.2 Accurate Approximation of Nonlinear Layers 10
Chapter 3 Preliminaries 15
3.1 Threat Model 15
3.2 Fixed-point Representation 16
3.3 Secret Sharing 16
3.3.1 Arithmetic Secret Sharing 17
3.3.2 Binary Secret Sharing 17
3.4 Truncation 18
3.5 Causal Masked Self-Attention 18
3.6 Functionality 20
3.6.1 A2B 20
3.6.2 B2A 21
3.6.3 Shift by k bits 21
3.6.4 AND 21
3.6.5 XOR 21
3.6.6 OR 22
3.6.7 Less Than Zero 22
3.6.8 Less Than 22
3.6.9 Multiplication 23
3.6.10 Matrix Multiplication 23
3.6.11 Exponential 23
3.6.12 Arithmetic Multiplexer 23
3.6.13 Binary Multiplexer 24
3.6.14 Decrypt 24
3.6.15 Wrap 24
3.6.16 Max 25
3.6.17 Softmax 25
Chapter 4 Methodology 27
4.1 Model Conversion 27
4.2 Wrap Around Truncation 29
4.3 Reciprocal 29
4.4 Inverse Square Root 32
4.5 Causal Mask Self-attention Protocol 34
Chapter 5 Experiments 41
5.1 System Environment 41
5.2 Truncation 42
5.3 Division speedup 43
5.4 Inverse Square Root Speedup 45
5.5 Transformer Decoder model speedup 47
Chapter 6 Conclusion 51
References 53 | - |
| dc.language.iso | en | - |
| dc.subject | 隱私維護機器學習 | - |
| dc.subject | 多方安全計算 | - |
| dc.subject | 安全兩方計算 | - |
| dc.subject | 隱私維護 Transformer | - |
| dc.subject | 安全 Transformer 推論 | - |
| dc.subject | Privacy Preserving Machine Learning | - |
| dc.subject | Secure multiparty computation | - |
| dc.subject | Two-party computation | - |
| dc.subject | Privacy Preserving Transformer | - |
| dc.subject | Secure Transformer Inference | - |
| dc.title | 基於多方安全計算實現長序列 Transformer 解碼器模型之高效率安全推論 | zh_TW |
| dc.title | SecLoT: Towards Efficient Secure Inference of Long-sequence Transformer Decoder Models with MPC | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 114-1 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 雷欽隆;李德財 | zh_TW |
| dc.contributor.oralexamcommittee | Chin-Laung Lei;Der-Tsai Lee | en |
| dc.subject.keyword | 隱私維護機器學習,多方安全計算,安全兩方計算,隱私維護 Transformer,安全 Transformer 推論 | zh_TW |
| dc.subject.keyword | Privacy Preserving Machine Learning,Secure multiparty computation,Two-party computation,Privacy Preserving Transformer,Secure Transformer Inference | en |
| dc.relation.page | 60 | - |
| dc.identifier.doi | 10.6342/NTU202600086 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2026-02-03 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 電機工程學系 | - |
| dc.date.embargo-lift | N/A | - |
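
The "fixed-size causal mask self-attention" in the abstract is what makes softmax-free decoding with near-constant per-token cost possible. One standard formulation with this property is linear attention, where the decoder folds all past keys and values into a fixed-size running state instead of a cache that grows with the sequence. The sketch below is a minimal plaintext rendering under that assumption; the feature map `phi` and the class shown here are illustrative choices of this sketch, and SecLoT itself would evaluate the same arithmetic on secret-shared fixed-point values rather than `torch` floats.

```python
import torch

def phi(x: torch.Tensor) -> torch.Tensor:
    # A positive feature map commonly used in linear attention; whether
    # SecLoT uses elu(x) + 1 is an assumption of this sketch.
    return torch.nn.functional.elu(x) + 1.0

class CausalLinearAttention:
    """Softmax-free autoregressive attention with a fixed-size running state.

    Instead of attending over a key/value cache that grows with the sequence,
    the decoder maintains S = sum_i phi(k_i) v_i^T (a d x d matrix) and
    z = sum_i phi(k_i) (a d-vector), so every generated token costs O(d^2)
    compute and, in the secure setting, a fixed amount of communication.
    """

    def __init__(self, dim: int) -> None:
        self.S = torch.zeros(dim, dim)
        self.z = torch.zeros(dim)

    def step(self, q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        fk = phi(k)
        self.S += torch.outer(fk, v)  # accumulate phi(k) v^T
        self.z += fk                  # accumulate phi(k)
        fq = phi(q)
        # A single division per token replaces the length-L secure softmax.
        return (fq @ self.S) / (fq @ self.z)

# Usage: generate attention outputs token by token.
attn = CausalLinearAttention(dim=8)
for _ in range(4):
    q, k, v = (torch.randn(8) for _ in range(3))
    out = attn.step(q, k, v)
```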
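For the division and inverse square root that the abstract calls "domain-unrestricted and efficient", this record does not spell out the algorithms, but secure fixed-point frameworks conventionally build both from Newton-Raphson recurrences, which use only the additions and multiplications that arithmetic secret sharing makes cheap. A minimal plaintext sketch of those standard recurrences follows; the initial guess `x0`, which determines the convergence domain the thesis claims to lift, is deliberately left as a plain parameter.

```python
def reciprocal_newton(b: float, x0: float, iters: int = 12) -> float:
    """Newton-Raphson reciprocal: x_{t+1} = x_t * (2 - b * x_t).

    Each step needs only additions and multiplications, so under
    arithmetic secret sharing no value is ever revealed. Convergence
    is quadratic, but only when x0 lies in (0, 2/b); choosing x0 to
    remove that restriction is the thesis's contribution, not shown here.
    """
    x = x0
    for _ in range(iters):
        x = x * (2.0 - b * x)
    return x

def inv_sqrt_newton(b: float, x0: float, iters: int = 12) -> float:
    """Newton-Raphson inverse square root: x_{t+1} = x_t * (3 - b * x_t^2) / 2."""
    x = x0
    for _ in range(iters):
        x = 0.5 * x * (3.0 - b * x * x)
    return x

print(reciprocal_newton(4.0, 0.1))  # ~0.25
print(inv_sqrt_newton(4.0, 0.1))    # ~0.5
```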
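The "probabilistic errors" that the proposed truncation protocol eliminates refer to a known failure of truncating additive shares locally: each party right-shifts its own share, which is correct up to one low-order bit unless the shares wrap around the ring, in which case the lost carry shifts the opened result by a large power of two. A toy demonstration, assuming 2-out-of-2 additive sharing over Z_{2^64} with 16 fractional bits (SecLoT's actual parameters and protocol are not shown in this record):

```python
import random

RING = 1 << 64   # additive secret sharing over Z_{2^64}
F = 16           # fractional bits of the fixed-point encoding

def local_trunc(x0: int, x1: int) -> int:
    """Each party right-shifts its own share by F bits (no interaction)."""
    return ((x0 >> F) + (x1 >> F)) % RING

x = 123 << F  # fixed-point encoding of 123

# Benign sharing: x0 + x1 = x with no wrap; local truncation is off by
# at most one low-order bit.
x0 = random.randrange(x + 1)
x1 = x - x0
print(local_trunc(x0, x1) - (x >> F))  # 0 or -1 (one-bit rounding slack)

# Wrapping sharing: x0 + x1 = x + RING; the lost carry shifts the opened
# result by 2^(64 - F), a catastrophic error that occurs with small
# probability when shares are drawn uniformly at random.
x0 = RING - (1 << F)
x1 = (x - x0) % RING
print(local_trunc(x0, x1) - (x >> F))  # 2^48, the wrap-around error
```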
| Appears in Collections: | 電機工程學系 |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-114-1.pdf (Restricted Access) | 4.21 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
