RSAy - A High Speed RSA Chip
Disciplines
Electrical Engineering, Electronics, Information Engineering (60%); Computer Sciences (40%)
Keywords
RSA, CRYPTOGRAPHY, ENCRYPTION, VLSI, CRYPTO-HARDWARE
The RSA gamma project covers the design, fabrication, and test of an ultra-high-speed long-integer modular multiplier chip for cryptographic applications. Its primary application is the Rivest-Shamir-Adleman (RSA) public-key algorithm, which therefore serves as the benchmark for the chip. The chip can also process other public-key algorithms based on long-integer modular multiplication, such as the ElGamal algorithm.
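To illustrate why a fast modular multiplier is the key building block here, the following minimal Python sketch expresses an RSA operation purely as a sequence of modular multiplications (binary square-and-multiply). The function names `modmul` and `modexp` and the toy key (n = 3233, e = 17, d = 2753) are assumptions chosen for illustration; they are not taken from the project, and real keys are far longer.

```python
def modmul(a: int, b: int, n: int) -> int:
    """One long-integer modular multiplication: the operation a multiplier chip accelerates."""
    return (a * b) % n


def modexp(base: int, exponent: int, modulus: int) -> int:
    """Right-to-left square-and-multiply: exponentiation built only from modmul calls."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current exponent bit is 1: multiply
            result = modmul(result, base, modulus)
        base = modmul(base, base, modulus)    # square for the next bit
        exponent >>= 1
    return result


# Toy RSA key for illustration only (p = 61, q = 53).
n, e, d = 3233, 17, 2753
message = 65
cipher = modexp(message, e, n)     # public-key operation
recovered = modexp(cipher, d, n)   # private-key operation
```

An exponent of t bits costs roughly t squarings plus up to t multiplications, so the speed of a single modular multiplication largely determines overall throughput.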
The aim of this project was the design and implementation of high-speed hardware for public-key cryptography. One of the most popular public-key cryptosystems is that of Rivest, Shamir, and Adleman (RSA), which can be used for asymmetric encryption and decryption as well as for the generation and verification of digital signatures. RSA public-key cryptography is based on modular exponentiation of very long integers (typically 1024 bits). An important objective of the project was to investigate different algorithms and multiplier architectures for long-integer modular multiplication and to design the RSA crypto chip for high-speed RSA encryption and decryption. Compared to other RSA chips, the RSA crypto chip combines efficient algorithms with a high-performance multiplier architecture, implemented with an advanced circuit technique and design methodology:

Implemented algorithms: The chip uses an optimized variant of Barrett's modular reduction method, termed the FastMM algorithm. The FastMM algorithm is very well suited for hardware implementation because it avoids the division in the modular reduction operation and computes a modular multiplication with three long-integer multiplications and one addition. Furthermore, the RSA crypto chip can exploit the Chinese Remainder Theorem (CRT) to speed up RSA private-key operations.

Multiplier architecture: From an architectural point of view, the multiplier on the RSA crypto chip is a partial parallel multiplier (PPM). The developed prototype contains a 1056 × 16-bit PPM, which processes the multiplicand fully in parallel and the multiplier sequentially in 16-bit words. Owing to its high degree of parallelism, the multiplier core can compute a 1024-bit modular multiplication in 227 clock cycles.

Circuit technique and design methodology: Although the architecture may, in theory, accept an arbitrary degree of parallelism, area and power resources are limited on a single chip. Achieving optimum performance therefore requires low-power as well as low-area design. The RSA datapath is implemented in True Single Phase Clocked (TSPC) logic to simplify clock generation and clock distribution. Most parts of the multiplier core were realized in a full-custom design methodology.

The results of this project have been published in six international peer-reviewed conference proceedings and journals. Furthermore, five presentations have been given at major international research conferences.
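The algorithmic ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the FastMM algorithm or the chip's datapath: it uses textbook Barrett reduction (which may need up to two final correction subtractions), a software multiplication that consumes one operand in 16-bit words, and the usual CRT recombination. All function names and the toy key are assumptions introduced for this example.

```python
WORD_BITS = 16  # width of the sequentially processed multiplier words (from the text)


def word_serial_mul(a: int, b: int) -> int:
    """Multiply a by b, using a at full width and b one WORD_BITS-wide word at a time."""
    mask = (1 << WORD_BITS) - 1
    acc, shift = 0, 0
    while b:
        acc += (a * (b & mask)) << shift   # one partial product per word of b
        b >>= WORD_BITS
        shift += WORD_BITS
    return acc


def barrett_setup(n: int):
    """Precompute k = bit length of n and mu = floor(2^(2k) / n), once per modulus."""
    k = n.bit_length()
    return k, (1 << (2 * k)) // n


def barrett_modmul(a: int, b: int, n: int, k: int, mu: int) -> int:
    """Return a*b mod n for 0 <= a, b < n without any division by n."""
    x = word_serial_mul(a, b)                          # multiplication 1: product
    q = word_serial_mul(x >> (k - 1), mu) >> (k + 1)   # multiplication 2: quotient estimate
    r = x - word_serial_mul(q, n)                      # multiplication 3: remove q*n
    while r >= n:                                      # textbook Barrett: at most two corrections
        r -= n
    return r


def modexp(base: int, exponent: int, n: int) -> int:
    """Square-and-multiply exponentiation built on barrett_modmul."""
    k, mu = barrett_setup(n)
    result, base = 1 % n, base % n
    while exponent:
        if exponent & 1:
            result = barrett_modmul(result, base, n, k, mu)
        base = barrett_modmul(base, base, n, k, mu)
        exponent >>= 1
    return result


def rsa_private_crt(c: int, p: int, q: int, d: int) -> int:
    """RSA private-key operation via CRT: two half-size exponentiations, then recombine."""
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)                 # modular inverse (Python 3.8+)
    m1 = modexp(c % p, dp, p)
    m2 = modexp(c % q, dq, q)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q
```

With CRT, each exponentiation works on a modulus and an exponent of about half the original length, which is why private-key operations benefit from it.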
- Technische Universität Graz - 100%