Four complete designs for the TEA algorithm and an implementation of RC5 were evaluated, in addition to a baseline design with key-search hardware but no cryptographic algorithm. The results are shown in Tables 3 and 4.
Tea1 is fully unrolled code generated automatically from the TIGER-language program of Appendix A.1. The Synopsys tools could not properly handle the large number of temporary variables generated, so there are no detailed speed or complexity figures for this design.
Tea2 is hand-written code that uses a VHDL for-loop to allow the Synopsys tools to unroll the 32 TEA rounds. The Synopsys tools were able to evaluate the design and give a clock speed estimate, but the design was too large to fit in a single Xilinx XC4010.
Tea3 is hand-written code for an iterative state-machine implementation of TEA, used as a feasibility test for automatic generation of sequential circuits. The resulting design fits into a Xilinx XC4010 with only 10 CLBs to spare.
Tea4 is an automatically generated state machine compiled from the same TIGER program as Tea1, with loop unrolling turned off. It also fits into a Xilinx XC4010, but only barely (399 of 400 CLBs). Performance is of the same order as Tea3's (297k versus 397k keys/s).
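To make the unrolled-versus-iterative distinction concrete, the following is a minimal software sketch of the standard TEA cipher. It is not the TIGER program from the appendix or the VHDL of any of the four designs; the function names and the Python rendering are assumptions made for illustration. The body of the `for` loop corresponds to one of the 32 rounds: an unrolled design such as Tea1 or Tea2 instantiates that body 32 times as combinational logic, while an iterative design such as Tea3 or Tea4 reuses one copy of it under state-machine control.

```python
MASK32 = 0xFFFFFFFF      # keep every intermediate value within 32 bits
DELTA = 0x9E3779B9       # standard TEA round constant
ROUNDS = 32              # number of loop iterations in standard TEA


def tea_encrypt(v0, v1, key):
    """Encrypt one 64-bit block (two 32-bit words) under a 4-word key."""
    s = 0
    for _ in range(ROUNDS):
        # One TEA round: this is the logic that is either replicated
        # (unrolled design) or reused every clock (iterative design).
        s = (s + DELTA) & MASK32
        v0 = (v0 + (((v1 << 4) + key[0]) ^ (v1 + s) ^ ((v1 >> 5) + key[1]))) & MASK32
        v1 = (v1 + (((v0 << 4) + key[2]) ^ (v0 + s) ^ ((v0 >> 5) + key[3]))) & MASK32
    return v0, v1


def tea_decrypt(v0, v1, key):
    """Invert tea_encrypt by running the rounds in reverse order."""
    s = (DELTA * ROUNDS) & MASK32
    for _ in range(ROUNDS):
        v1 = (v1 - (((v0 << 4) + key[2]) ^ (v0 + s) ^ ((v0 >> 5) + key[3]))) & MASK32
        v0 = (v0 - (((v1 << 4) + key[0]) ^ (v1 + s) ^ ((v1 >> 5) + key[1]))) & MASK32
        s = (s - DELTA) & MASK32
    return v0, v1
```

The 34 cycles per key reported for the iterative designs would be consistent with one round per clock plus a small fixed overhead, although the text does not break the figure down.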
RC5 is an automatically generated state machine compiled from the TIGER program of Appendix A.2. The TIGER program needed to be unrolled to eliminate array references, but still uses an iterative state machine in subkey generation. The design does not fit in a Xilinx XC4010.
Algorithm | Code Gen. | Cycles/key | Clock Rate | Keys/s |
Baseline | None | 1 | 18 MHz | -- |
Tea1 | Automatic | 1 | -- | -- |
Tea2 | Manual | 1 | 514 kHz | 514k |
Tea3 | Manual | 34 | 13.5 MHz | 397k |
Tea4 | Automatic | 34 | 10.1 MHz | 297k |
RC5 | Automatic | 5 | -- | -- |
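The Keys/s column follows from the other two columns: a design that needs a fixed number of clock cycles per key tests clock rate divided by cycles-per-key keys each second. The short sketch below reproduces the three reported figures from the table; the helper name `keys_per_second` is an assumption introduced for illustration.

```python
# (cycles per key, clock rate in Hz), copied from the table above.
designs = {
    "Tea2": (1, 514e3),
    "Tea3": (34, 13.5e6),
    "Tea4": (34, 10.1e6),
}


def keys_per_second(cycles_per_key, clock_hz):
    """Keys tested per second by a single search unit."""
    return clock_hz / cycles_per_key


for name, (cycles, clock_hz) in designs.items():
    rate = keys_per_second(cycles, clock_hz)
    print(f"{name}: {rate / 1e3:.0f}k keys/s")
```

The comparison shows why the unrolled Tea2 is only modestly faster than the iterative Tea3: its one-cycle-per-key advantage is largely offset by a clock rate roughly 26 times lower.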
Algorithm | Code Gen. | Total CLBs | XC4010 CLBs Used | XC4010 CLB Use (%) |
Baseline | None | 57 | 95/400 | 23% |
Tea1 | Automatic | -- | -- | -- |
Tea2 | Manual | 5814 | -- | -- |
Tea3 | Manual | 2187 | 390/400 | 97% |
Tea4 | Automatic | 364 | 399/400 | 99% |
RC5 | Automatic | -- | -- | -- |
Total CLBs: reflects Synopsys figures before target-specific optimization and routing.
Xilinx XC4010 CLB use: reported by the Xilinx tools after successful place-and-route.
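The XC4010 percentages in the table can be recomputed from the used/available CLB counts. In the sketch below, the 400-CLB capacity is taken from the "/400" denominators in the table, and the helper names are assumptions introduced for illustration. The reported percentages appear to be truncated rather than rounded (95/400 is 23.75%, listed as 23%), so the sketch uses integer division.

```python
XC4010_CLBS = 400  # capacity implied by the "/400" entries in the table


def utilization_percent(clbs_used, clbs_available=XC4010_CLBS):
    """Percentage of CLBs in use, truncated toward zero like the table."""
    return (100 * clbs_used) // clbs_available


def spare_clbs(clbs_used, clbs_available=XC4010_CLBS):
    """CLBs left unused after place-and-route."""
    return clbs_available - clbs_used


# CLB counts reported by the Xilinx tools for the designs that fit.
for name, used in [("Baseline", 95), ("Tea3", 390), ("Tea4", 399)]:
    print(f"{name}: {utilization_percent(used)}% used, {spare_clbs(used)} spare")
```

For Tea3 this gives the 10 spare CLBs mentioned in the text, and for Tea4 a single spare CLB.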