Low-Power Division: Comparison Between Implementations of Radix 4, 8 and 16

A. Nannarelli and T. Lang

14th IEEE Symposium on Computer Arithmetic, Adelaide, Australia, April 1999

Abstract - Although division is less frequent than addition and multiplication, because of its longer latency it dissipates a substantial part of the energy in floating-point units. In this paper we explore the relation between the radix and the energy dissipated. Previous work has been done on radix-4 and radix-8 division. Here we extend this study to a radix-16 scheme with two overlapped radix-4 stages and compare the latency, area, and energy of the three implementations.
Results show that applying the low-power techniques reduces the energy dissipation by 30% to 40% with respect to the standard implementation. An additional 20% reduction can be obtained by using a dual voltage. Moreover, the energy dissipated to complete a division is roughly the same for the three radices. However, the power dissipation, which is proportional to the average current, increases with the radix. If reducing the energy is the priority, radix-16 with dual voltage produces the smallest energy dissipation for the same latency.
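
The observation that energy per division stays roughly constant while power grows with the radix follows from power being energy divided by time. The sketch below illustrates this under simplifying assumptions that are not taken from the paper: a 54-bit quotient, identical cycle time for all three radices, and exactly equal energy per division. The function names are hypothetical, and real implementations have radix-dependent cycle times, so the ratios are only indicative.

```python
import math


def iterations(quotient_bits: int, radix: int) -> int:
    """Iterations needed when each one retires log2(radix) quotient bits."""
    bits_per_iteration = radix.bit_length() - 1  # exact log2 for powers of two
    return math.ceil(quotient_bits / bits_per_iteration)


def relative_average_power(quotient_bits: int, radix: int, baseline_radix: int = 4) -> float:
    """Average power relative to the baseline radix.

    Assumes equal energy per division and equal cycle time (both are
    illustrative assumptions), so power scales inversely with iterations.
    """
    return iterations(quotient_bits, baseline_radix) / iterations(quotient_bits, radix)


QUOTIENT_BITS = 54  # assumed width, chosen only for illustration

for radix in (4, 8, 16):
    n = iterations(QUOTIENT_BITS, radix)
    p = relative_average_power(QUOTIENT_BITS, radix)
    print(f"radix-{radix}: {n} iterations, relative average power {p:.2f}")
```

Under these assumptions, fewer iterations at a higher radix mean the same energy is spent in less time, which is why the average power rises even though the energy does not.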
