Hewlett Packard PA-8000 (Onyx) processors
Introduction: January 1996
The PA-8000 is the first chip to implement the 64-bit PA-RISC 2.0 architecture, which includes many extensions to support 64-bit computing. All integer registers and functional units (ALU, shift/merge) have been widened to 64 bits, giving native 64-bit integer arithmetic. The flat virtual address space is 64 bits wide, although most PA-RISC 2.0 CPUs support only a 40-bit physical address space. Other extensions include fast TLB insert instructions, memory prefetch instructions, support for variable-sized pages, branch prediction hinting and FPMAC (Floating Point Multiply Accumulate) units. The instruction decode logic is not integrated with the functional units’ pipeline logic. This decoupling allows the chip to partially decode instructions well in advance of their actual execution by the functional unit(s).
A key feature of the PA-8000 is the IRB (Instruction Reorder Buffer). Due to restrictions on compiler scheduling, the design team decided that the CPU should perform its own instruction scheduling. The IRB can store up to 28 computation and 28 load/store instructions; it tracks interdependencies between these instructions and allows each one to execute as soon as it is ready. Branch prediction outcomes are also tracked, and because of this rescheduling the CPU can execute instructions past cache misses. The IRB is the key part of the chip's out-of-order execution capability.
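The dependency tracking described above can be illustrated with a toy model. This is a minimal sketch and not a description of the actual PA-8000 hardware: only the split into computation and load/store entries and the 28-entry capacity come from the text. The `Instr` record, the register names, the latencies, and the unlimited issue width are assumptions made for illustration.

```python
from dataclasses import dataclass

IRB_ENTRIES = 28  # capacity of each half of the IRB, per the text


@dataclass(frozen=True)
class Instr:
    name: str
    kind: str          # "alu" or "mem" (hypothetical labels)
    dest: str          # destination register
    srcs: tuple = ()   # source registers
    latency: int = 1   # assumed cycles until the result is available


def schedule(program):
    """Return {instruction name: issue cycle} for a toy out-of-order core."""
    # Assumption: the whole program must fit into the two buffers at once.
    for kind in ("alu", "mem"):
        if sum(i.kind == kind for i in program) > IRB_ENTRIES:
            raise ValueError(f"more than {IRB_ENTRIES} {kind} instructions")

    # Link each instruction to the most recent earlier writer of each source.
    last_writer, deps = {}, []
    for idx, ins in enumerate(program):
        deps.append([last_writer[s] for s in ins.srcs if s in last_writer])
        last_writer[ins.dest] = idx

    done_at, issued, cycle = {}, {}, 0
    while len(issued) < len(program):
        for idx, ins in enumerate(program):
            if idx in issued:
                continue
            # Ready once every producer has issued and its result has arrived.
            if all(done_at.get(d, float("inf")) <= cycle for d in deps[idx]):
                issued[idx] = cycle
                done_at[idx] = cycle + ins.latency
        cycle += 1
    return {program[i].name: c for i, c in issued.items()}


program = [
    Instr("load", "mem", "r1", latency=10),    # stands in for a cache miss
    Instr("use_load", "alu", "r2", ("r1",)),   # must wait for the load
    Instr("indep_a", "alu", "r3", ("r4",)),    # independent of the load
    Instr("indep_b", "alu", "r5", ("r3",)),    # depends only on indep_a
]
result = schedule(program)
print(result)
```

In this model the two independent instructions issue in cycles 0 and 1 while the slow load is still outstanding, and only the instruction that consumes the load waits until cycle 10. That is the sense in which execution continues past a cache miss.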
In short, the PA-8000 is a decoupled architecture with four-instruction dispatch and aggressive out-of-order (OoO) execution. It also has dual floating-point and dual load/store units, a large OoO dispatch window and, following a long HP tradition, no on-chip caches. The (large) primary caches have been kept off-chip to increase the amount of data that can be accessed in a single cycle. Although the latency of the caches is roughly two cycles, it can be hidden by full pipelining, which in practice yields one access per cycle. Nothing in the design of this chip was leveraged from previous chip designs.
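The latency-hiding claim can be checked with simple arithmetic. The sketch below assumes an idealized cache with a fixed two-cycle latency (the text says "roughly two") that accepts one new independent access every cycle when pipelined. The function names are hypothetical and the model ignores misses and bank conflicts.

```python
CACHE_LATENCY = 2  # cycles; "roughly two" per the text


def cycles_unpipelined(n, latency=CACHE_LATENCY):
    """Each access must finish before the next one starts."""
    return n * latency


def cycles_pipelined(n, latency=CACHE_LATENCY):
    """A new access starts every cycle; only the first pays full latency."""
    return 0 if n == 0 else latency + (n - 1)


n = 1000
print(cycles_unpipelined(n), cycles_pipelined(n))
print(n / cycles_pipelined(n))  # sustained accesses per cycle
```

For 1000 back-to-back accesses the unpipelined model needs 2000 cycles and the pipelined one 1001, so sustained throughput approaches one access per cycle even though each individual access still takes two.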
The PA-8000 was used in the C160, C180, D270, D280, D370, D380, J280, J282, K250, K260, K450, K460, R380, T600, Convex SPP2000 (S-Class) and Stratus Continuum 628 and 1228.