Originally Published in 1991 (Computer Jagat Magazine)
M. Lutfar Rahman and M. Alamgir Hossain
INTRODUCTION
The 80860 microprocessor, manufactured by Intel Corporation of the USA, is the world’s first 64-bit single-chip microprocessor. It is also Intel’s first reduced instruction set computer (RISC) microprocessor. This chip is likely to bring supercomputing power to desktop personal computers.
The 80860 microprocessor (popularly known as the i860) is designed for numerically intensive and vector-intensive applications. Many of its design principles have been adopted from supercomputer technology, enabling the i860 to deliver a peak arithmetic performance of 80 MFLOPS (million floating point operations per second) for single precision data and 60 MFLOPS for double precision data, in conjunction with a peak integer performance of 40 MIPS (million instructions per second). In particular, its high throughput is achieved by a combination of RISC design techniques, pipelined processing units, wide data paths and large on-chip caches.
Implemented on a single chip with over 1,000,000 transistors, the i860 supports a 64-bit architecture and is capable of executing up to three operations each clock cycle (25 ns @ 40 MHz).
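The peak figures quoted above follow from the clock rate by simple arithmetic. The short Python sketch below reproduces them. The function names are illustrative only, and the per-cycle counts (one integer instruction, two single precision floating point operations, three operations in total) are taken from the text rather than measured.

```python
def cycle_time_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def peak_rate_millions(clock_mhz, ops_per_cycle):
    """Peak rate in millions of operations per second."""
    return clock_mhz * ops_per_cycle


CLOCK_MHZ = 40  # clock speed used for the figures in the text

print(cycle_time_ns(CLOCK_MHZ))          # 25.0 (ns per cycle)
print(peak_rate_millions(CLOCK_MHZ, 1))  # 40  (MIPS, one integer instruction per cycle)
print(peak_rate_millions(CLOCK_MHZ, 2))  # 80  (MFLOPS, one add and one multiply per cycle)
print(peak_rate_millions(CLOCK_MHZ, 3))  # 120 (million operations per second in total)
```

These are peak rates; sustained performance depends on how often data and instructions are found in the on-chip caches.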
MAIN FEATURES
On a single chip the processor supports the following facilities:
* integer operations
* floating point operations
* graphics operations
* memory management support
* data cache and instruction cache
* high speed multiprocessing
* three-dimensional workstation support
* clock speed : 33 MHz, 40 MHz and 50 MHz
* as a co-processor to 80x86 processors
* 168-pin ceramic package
* CHMOS-IV semiconductor technology
Several i860 processors may be made to work in parallel to realise a mini-supercomputer. The i860 can be employed to realise a high-power graphics workstation and to bring mainframe power to the personal computer.
The ability to provide all these facilities on the same chip enables hardware developers to create products which are less dependent on the external components normally associated with sophisticated computer systems. From this point of view the microprocessor has some similarities with the transputers. The i860 is an ideal candidate for integration into highly parallel computer environments, where high computational performance, modularity, and board real-estate requirements are the main concerns.
ARCHITECTURE
The chip is a multi-execution system integrating several units on a single chip (Fig. 1). The main functional units are: the RISC integer processor unit, a 64-bit floating point unit and a three-dimensional graphics processor unit. The other major units are: a paging unit, a data cache, an instruction cache, a bus and cache control unit, an 80x86-compatible memory management unit and three register files.
CORE EXECUTION UNIT
The i860 is centrally controlled by the (integer) core unit, which is known as the administrator of the processor. It is responsible for fetching both integer and floating point instructions, and for decoding and executing integer, logical, control-transfer, load/store, exception handling and cache flushing instructions. Instructions are fetched into the core execution unit from the instruction cache. If the addressed instruction is not in the cache (a cache miss), it is fed to the core execution unit from external memory, while the corresponding instruction cache block is filled at the same time.
Fig. 2 : Four-stage pipeline operation of the core unit (columns are successive clock cycles)
|         |         |         | FETCH-4 | DEC-4   | EXE-4   | STORE-4 | FETCH-8 |
|         |         | FETCH-3 | DEC-3   | EXE-3   | STORE-3 | FETCH-7 | DEC-7   |
|         | FETCH-2 | DEC-2   | EXE-2   | STORE-2 | FETCH-6 | DEC-6   | EXE-6   |
| FETCH-1 | DEC-1   | EXE-1   | STORE-1 | FETCH-5 | DEC-5   | EXE-5   | STORE-5 |
The core unit uses a pipeline organization. The four-stage pipeline operations are shown in Fig. 2. While one instruction is being fetched, the previous instruction is decoded, the one before that is executed and the result of its predecessor is stored. This processor has been designed according to RISC principles to maximize performance; instructions are purposefully simple and appear to operate in one clock cycle. Furthermore, the use of register bypassing and score-boarding techniques allows the load and store instructions to be executed at a sustained rate of one instruction every clock cycle, assuming that data and instructions are found in their respective caches. At this rate the integer core unit delivers 40 MIPS of integer performance with a 40 MHz clock.
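The overlap described above can be sketched as a small Python simulation. It is an idealised model only: it assumes no stalls and no cache misses, and the stage names and function names are chosen here for illustration rather than taken from Intel documentation.

```python
STAGES = ["FETCH", "DEC", "EXE", "STORE"]


def pipeline_schedule(num_instructions):
    """Return {cycle: {stage: instruction}} for an ideal 4-stage pipeline.

    Assumes no stalls: instruction i (1-based) enters FETCH in cycle i
    and advances one stage per clock cycle.
    """
    schedule = {}
    for instr in range(1, num_instructions + 1):
        for offset, stage in enumerate(STAGES):
            cycle = instr + offset
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule


def total_cycles(num_instructions):
    """Cycles to finish n instructions: fill the pipe, then one per cycle."""
    return len(STAGES) + num_instructions - 1


sched = pipeline_schedule(8)
for cycle in sorted(sched):
    row = ", ".join(f"{stage}-{sched[cycle][stage]}"
                    for stage in STAGES if stage in sched[cycle])
    print(f"cycle {cycle:2d}: {row}")
```

In cycle 4 the model has instruction 4 in FETCH, 3 in DEC, 2 in EXE and 1 in STORE. Once the pipeline is full, one instruction completes every cycle, which is where the figure of 40 MIPS at 40 MHz comes from.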
FLOATING POINT UNIT
The floating point unit contains a control unit, an adder, and a multiplier. Operations can be executed in scalar or pipelined mode in the adder, and in pipelined mode in the multiplier. In scalar mode, a new operation is not started until the previous one is completed. In pipelined mode, up to three instructions can be overlapped and executed concurrently at any time in the adder, and two in the multiplier. With the support of the instruction and data caches, the floating point unit is capable of executing two single precision floating point operations, one add and one multiply, every clock cycle; this is equivalent to 80 MFLOPS with a 40 MHz clock. An efficient implementation of multiply-accumulate operations makes the i860 well suited for a wide range of numerically intensive application areas including :
* matrix manipulations (e.g. solving linear equations)
* series calculations (e.g. expansion series)
* signal processing calculations (e.g. fast Fourier transformation)
* graphics (e.g. coordinate transformations)
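The multiply-accumulate pattern shared by these applications can be sketched in Python as a dot product. The timing helper is an idealised estimate only: it assumes the peak rate of one add and one multiply per cycle at 40 MHz and ignores pipeline start-up and cache misses. Both function names are illustrative.

```python
def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply and one add per element
    return acc


def ideal_time_us(n_elements, clock_mhz=40, flops_per_cycle=2):
    """Idealised time in microseconds for n multiply-accumulates at peak rate.

    Ignores pipeline start-up and cache misses, so real code is slower.
    """
    flops = 2 * n_elements             # one multiply + one add each
    cycles = flops / flops_per_cycle
    return cycles / clock_mhz          # clock_mhz cycles elapse per microsecond


print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
print(ideal_time_us(1000))                            # 25.0
```

Matrix products, series sums, FFT butterflies and coordinate transformations all reduce to long runs of this multiply-then-add step, which is why pairing the adder with the multiplier pays off.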
Floating point data types, floating point instructions, and exception handling support the IEEE standard for binary floating point arithmetic for both single and double precision data types. A complete set of traps includes tests for invalid source operands such as NaN (not a number), denormalised numbers and infinities, as well as tests for errors in the result such as overflow and underflow.
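The operand categories named above can be illustrated with ordinary Python floats, which are IEEE double precision values. This sketch only classifies values; it does not model the i860 trap mechanism, and the function name `classify` is hypothetical.

```python
import math
import sys


def classify(x):
    """Classify a Python float (an IEEE double) into operand categories."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "infinity"
    if x != 0.0 and abs(x) < sys.float_info.min:
        return "denormalised"
    return "normal or zero"


print(classify(float("nan")))  # NaN
print(classify(float("inf")))  # infinity
print(classify(5e-324))        # denormalised
print(classify(1.0))           # normal or zero
print(1e308 * 10)              # inf (overflow in the result)
print(5e-324 / 2)              # 0.0 (underflow in the result)
```

Python returns infinity or zero silently in the last two cases, whereas a processor with traps enabled would raise an exception for the program to handle.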
OTHER UNITS
The graphics unit includes a special 64-bit integer logic module, which supports three-dimensional graphics algorithms, and a special purpose MERGE register. Like the 80386, the i860 can support 64 Terabytes of virtual memory. The memory management unit translates logical addresses to physical addresses as and when required to access data and instructions in memory.
The i860 supports a 64-bit (8-byte) external data bus, a 128-bit internal data bus (two 64-bit paths between the data cache and the floating point controller) and a 64-bit internal instruction bus. Memory accesses for instructions and data take place through the caches. The data cache and the instruction cache are each an associative memory of 4 KB with 32-byte blocks. A cache controller uses a pipelined structure to provide the interface to the external world.
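The cache and bus sizes given above imply a few derived numbers, worked out in the Python sketch below. The text does not state the degree of associativity, so the sketch stops at the block count and the byte offset within a block; the helper name `split_address` is illustrative.

```python
CACHE_BYTES = 4 * 1024   # 4 KB, as stated in the text
BLOCK_BYTES = 32         # 32-byte blocks
BUS_BYTES = 64 // 8      # 64-bit external data bus

num_blocks = CACHE_BYTES // BLOCK_BYTES        # blocks per cache
offset_bits = BLOCK_BYTES.bit_length() - 1     # bits selecting a byte in a block
transfers_per_fill = BLOCK_BYTES // BUS_BYTES  # full-width bus transfers per block


def split_address(addr):
    """Split an address into (block address, byte offset within the block)."""
    return addr >> offset_bits, addr & (BLOCK_BYTES - 1)


print(num_blocks, offset_bits, transfers_per_fill)  # 128 5 4
print(split_address(0x1234))                        # (145, 20)
```

Each cache therefore holds 128 blocks, and a miss needs four 64-bit transfers over the external bus to fill one block.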
REGISTER SET
Programmes are developed using the user-accessible registers (Fig. 3) of the processor. The i860 has the following user-accessible registers:
* An integer register file
* A floating point register file
* Six control registers : psr (processor status register), epsr (extended psr), db (data breakpoint register), dirbase (directory base register), fir (fault instruction register) and fsr (floating point status register).
* Four special purpose registers : KR, KI, T and MERGE.
The integer register file contains thirty-two 32-bit integer registers: r0-r31. The floating point register file consists of thirty-two 32-bit floating point registers: f0-f31. The registers r0, f0 and f1 always return zero on read. The floating point registers can also be used for integer operations. The floating point register file can be accessed as sixteen 64-bit registers or eight 128-bit registers. These registers support either 32-bit single precision or 64-bit double precision floating point operations. Like the 80386, the i860 supports the standard data types, namely signed and unsigned 32-bit integers and 32-bit floating point numbers. Besides, the i860 supports a new data type known as the pixel, which can be 8, 16 or 32 bits long.
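The different views of the floating point register file can be pictured with a toy Python model. It is a sketch under stated assumptions, not a description of the hardware: the class and method names are hypothetical, little-endian ordering of register pairs is assumed, and the hard-wired zero registers are not modelled.

```python
import struct


class FloatRegisterFile:
    """Toy model: 32 registers of 32 bits, stored as 128 raw bytes."""

    def __init__(self):
        self.raw = bytearray(32 * 4)

    def write32(self, index, value):
        """Write a 32-bit unsigned value to register f<index>."""
        struct.pack_into("<I", self.raw, index * 4, value)

    def read64(self, pair):
        """Read registers f<2*pair> and f<2*pair+1> as one 64-bit value."""
        return struct.unpack_from("<Q", self.raw, pair * 8)[0]

    def views(self):
        """Number of registers visible at each access width (in bits)."""
        total_bits = len(self.raw) * 8
        return {width: total_bits // width for width in (32, 64, 128)}


regs = FloatRegisterFile()
regs.write32(2, 0x11111111)   # low half of the 64-bit pair f2:f3
regs.write32(3, 0x22222222)   # high half (little-endian order is assumed)
print(regs.views())           # {32: 32, 64: 16, 128: 8}
print(hex(regs.read64(1)))    # 0x2222222211111111
```

The same 1024 bits of storage are simply addressed at three widths, which is how one file serves single precision, double precision and 128-bit accesses.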
Table 1 : Comparison of 80860 with 80x86 processors
| Topics | 80860 | 80x86 |
| Introduced in | 1989-90 | 1986-87 with 80386; 80486 and 80586 later |
| Pin count | 168 | 132 for 80386 |
| External data bus | 64 bits | 32 bits |
| Architecture | RISC | CISC |
| Transistors | 1,000,000 | 275,000 in 80386 |
| Technology | CHMOS-IV | CHMOS-III |
| Graphics support | Yes | No |
| Clock (MHz) | 33, 40, 50 | 12, 16 for 80386; 25 for 80486 (higher for later versions) |
| Operations per cycle | 1 to 3 operations | On average 4.5 clock cycles per instruction for 80386; 80486 has twice the speed of i386, i586 has twice the speed of i486 |
The i860 uses different instruction sets for different types of operations: the core unit instruction set, the floating point instruction set, the graphics instruction set, and the assembler pseudo-operations and floating point pipelined operations. It is to be noted that the 80x86 processors do not use different instruction sets.
COMPARISON WITH OTHER PROCESSORS
Table 1 compares the i860 processor with the other advanced Intel processors. The external address bus of the i860 is 32 bits wide, so like the 80386 it can address 4 Gigabytes of real memory and 64 Terabytes of virtual memory. In memory organization and in integer and floating point data structures, the i860 is compatible with the 80x86 family. But the i860 is far more advanced than the 80x86 family if technology, speed and some other features are taken into account. At a 40 MHz clock speed the i860 can perform up to three operations per cycle, i.e. 120 MOPS (million operations per second); this is comparable to the performance of some supercomputers, such as the Cray. Table 2 compares the 80860 processor with some mainframe computers. Unlike the 80x86 processors, the i860 is based on RISC design and parallel mode operation. As a result the i860 is not software compatible with the 80x86 processors.
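The address-space figures quoted above can be checked with a few lines of Python. The constant names are illustrative, and binary units are assumed (1 Gigabyte = 2^30 bytes, 1 Terabyte = 2^40 bytes).

```python
GIGABYTE = 2 ** 30
TERABYTE = 2 ** 40

physical_bytes = 2 ** 32        # reachable with a 32-bit external address bus
virtual_bytes = 64 * TERABYTE   # virtual space quoted in the text

print(physical_bytes // GIGABYTE)      # 4  (Gigabytes of real memory)
print(virtual_bytes.bit_length() - 1)  # 46 (address bits needed for 64 Terabytes)
print(physical_bytes // 8 == 2 ** 29)  # True: 4 Gigabytes is 2^29 words of 64 bits
```

A 32-bit address thus covers 4 Gigabytes, while the 64 Terabyte virtual space corresponds to a 46-bit virtual address.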
The performance of the i860 is better than that of some other RISC processors, such as the SPARC and the R3000. Systems based on the i860 are likely to bring supercomputing power to desktop computers in the years to come.
Table 2 : Comparison of i860 and other mainframe computers
| | Intel i860 | Texas Inst. ASC | Cray Research Cray-1 | Control Data Star-100 |
| Word (bits) | 64 ext., 128 int. | 64 | 64 | 64 |
| Clock (MHz) | 33, 40, 50 | About 6.6 (GHz) | About 0.8 (GHz) | 20 (GHz) |
| Max. primary real memory support | 2^29 words each of 64 bits | 130 million 64-bit words | 523 million 64-bit words | 264 million 64-bit words |
