ARM 處理器上準梅森質數體的快速運算程式碼產生器

Shang-Yi Yang; 楊上逸

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6163

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	鄭振牟
dc.contributor.author	Shang-Yi Yang	en
dc.contributor.author	楊上逸	zh_TW
dc.date.accessioned	2021-05-16T16:22:13Z	-
dc.date.available	2013-07-30
dc.date.available	2021-05-16T16:22:13Z	-
dc.date.copyright	2013-07-30
dc.date.issued	2013
dc.date.submitted	2013-07-23
dc.identifier.citation	[ARM09] ARM Limited. ARM1136JF-S and ARM1136J-S Technical Reference Manual, Revision: r1p5, 2009. http://infocenter.arm.com/help/ topic/com.arm.doc.ddi0211k/DDI0211K_arm1136_r1p5_trm.pdf. [BDL+ 12] Daniel J. Bernstein, Niels Duif, Tanja Lange, Peter Schwabe, and Bo-Yin Yang. High-speed high-security signatures. Jour- nal of Cryptographic Engineering, 2(2):77–89, 2012. Document- ID: a1a62a2f76d23f65d622484ddd09caf8, http://cryptojedi.org/ papers/#ed25519. [Ber00] Daniel J. Bernstein. Floating-point arithmetic and message authentica- tion, 2000. [Ber05] Daniel J. Bernstein. The poly1305-aes message-authentication code. In In Proc. FSE, pages 32–49, 2005. [CFR+ 91] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst., 13(4):451–490, October 1991. [Com90] P. G. Comba. Exponentiation cryptosystems on the ibm pc. IBM Syst. J., 29(4):526–538, October 1990. [GPW+04] Nils Gura, Arun Patel, Arvinderpal Wander, Hans Eberle, and Sheuel- ing Chang Shantz. Comparing elliptic curve cryptography and rsa on 8-bit cpus. In CHES’04, pages 119–132, 2004. [HS13] Michael Hutter and Peter Schwabe. Nacl on 8-bit avr microcontrollers. In Amr Youssef and Abderrahmane Nitaj, editors, Progress in Cryptol- ogy – AFRICACRYPT 2013, volume 7918 of Lecture Notes in Computer Science, pages 156–172. Springer-Verlag Berlin Heidelberg, 2013. Doc- ument ID: cd4aad485407c33ece17e509622eb554, http://cryptojedi. org/papers/#avrnacl. [HW11] Michael Hutter and Erich Wenger. Fast multi-precision multiplication for public-key cryptography on embedded microprocessors. In Proceed- ings of the 13th international conference on Cryptographic hardware and embedded systems, CHES’11, pages 459–474, Berlin, Heidelberg, 2011. Springer-Verlag. [JG02] Ghassem Jaberipur and Mohammad Ghodsi. High radix signed digit number systems: Representation paradigms. Scientia Iranica, 10:383– 391, 2002. [Nat00] National Institute of Standards and Technology. FIPS PUB 186-2: Dig- ital Signature Standard (DSS). January 2000. [PS99] Massimiliano Poletto and Vivek Sarkar. Linear scan register allocation. ACM Trans. Program. Lang. Syst., 21(5):895–913, September 1999. [RDPJ10] Norman Ramsey, Joao Dias, and Simon Peyton Jones. Hoopl: a mod- ular, reusable library for dataflow analysis and transformation. In Pro- ceedings of the third ACM Haskell symposium on Haskell, Haskell ’10, pages 121–134, New York, NY, USA, 2010. ACM. [Sch11] Peter Schwabe. High-Speed Cryptography and Cryptanalysis. PhD thesis, Eindhoven University of Technology, 2011. http://cryptojedi.org/ users/peter/thesis/. [WM05] Christian Wimmer and Hanspeter Mossenbock. Optimized interval splitting in a linear scan register allocator. In Proceedings of the 1st ACM/USENIX international conference on Virtual execution environ- ments, VEE ’05, pages 132–141, New York, NY, USA, 2005. ACM.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6163	-
dc.description.abstract	近期高速密碼學研究中，往往透過電腦指令的排列組合來提升運算效率，但如果少了自動化工具，則需要耗費相當大的人力。使用我們提出的工具，只需要準梅森質數作為輸入，就能透過窮舉找出在ARM11上最高效率的模乘法程式。窮舉的參數包含大數的表示方示及程式碼產生器參數，而提出的模乘法演算法則混合了乘法與模餘兩部份，特別適合提升準梅森質數體上的計算效率。使用提出的演算法，自動產生出的高質量程式碼運行時間較GCC編譯器的結果快16.4%，且為GMP模乘法的4至8倍。	zh_TW
dc.description.abstract	Recent research on high-speed cryptography has been striving for performance by twiddling with instructions, but without an automated tool, writing fast software takes much precious labor effort. We present a tool with a simple interface for crypto developers to generate fast modular multiplication routines in a few keystrokes: you provide the prime as the modulus and it produces several candidate results or enumerates them all for benchmark. Specifically, we automatized the choice of number representation and the code generation for multiplication modulo a pseudo-Mesenne prime on ARM11, using the proposed convolved multiplication method, which interleaves multiplication and modular reduction. The high-quality code generated runs up to 16.4% faster than the convolved multiplication compiled by defacto-standard compilers such as gcc, and is 4 to 8 times faster than the GMP modular multiplication.	en
dc.description.provenance	Made available in DSpace on 2021-05-16T16:22:13Z (GMT). No. of bitstreams: 1 ntu-102-R00921035-1.pdf: 1099830 bytes, checksum: 07ab4270a3e626bc684005595d089af5 (MD5) Previous issue date: 2013	en
dc.description.tableofcontents	1 Introduction 1 1.1 Motivation................................. 1 1.2 ProblemStatement............................ 2 1.3 Contributions ............................... 2 2 Preliminaries 4 2.1 Pseudo-MersennePrimes......................... 4 2.2 RadixRepresentations .......................... 5 2.3 Multi-precisionMultiplicationTechniques. . . . . . . . . . . . . . . . 6 2.3.1 Row-WiseMethod ........................ 7 2.3.2 Column-WiseMethod ...................... 7 2.3.3 HybridMethod .......................... 8 3 Convolved Multiplication 10 3.1 TwoExamples............................... 10 3.1.1 FirstExample........................... 11 3.1.2 SecondExample ......................... 12 3.1.3 ReductionandCarryChains................... 14 3.2 Mixed-RadixRepresentation....................... 14 3.3 Formulating Representation Choice Criteria . . . . . . . . . . . . . . 14 4 Multiplication on Convolved Structures 18 4.1 ConvolvedMultiplication......................... 19 5 Implementation 21 5.1 SystemOverview ............................. 21 5.2 TheARM11ProcessorFamily...................... 22 5.2.1 Freeshiftsandrotates ...................... 22 5.2.2 Accessingthecyclecounter. ................... 23 5.3 LinearRegisterAllocation ........................ 23 5.4 Auto-Tuning................................ 24 6 Results and Discussion 26 6.1 ConvolvedMethodonARM11...................... 26 6.2 DrawbacksofTraditionalMethodsonARM11 . . . . . . . . . . . . . 27 7 Conclusion 29
dc.language.iso	en
dc.title	ARM 處理器上準梅森質數體的快速運算程式碼產生器	zh_TW
dc.title	Code Generation for Fast Pseudo-Mersenne Prime Field Arithmetic on ARM Processors	en
dc.type	Thesis
dc.date.schoolyear	101-2
dc.description.degree	碩士
dc.contributor.oralexamcommittee	楊柏因,王柏堯,陳郁方
dc.subject.keyword	多精度乘法,模乘法,準梅森質數,快速密碼學,ARM11,	zh_TW
dc.subject.keyword	multi-precision multiplication,modular multiplication,pseudo-Mersenne primes,high-speed cryptography,ARM11,convolved multiplication,	en
dc.relation.page	33
dc.rights.note	同意授權(全球公開)
dc.date.accepted	2013-07-23
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	電機工程學研究所	zh_TW
顯示於系所單位：	電機工程學系

文件中的檔案：

檔案	大小	格式
ntu-102-1.pdf	1.07 MB	Adobe PDF	檢視/開啟

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。