The authors developed a scalable heterogeneous multicore processor. 3D heterogeneous chip stacking of a general-purpose CPU and reconfigurable multicore accelerators enables various trade-offs between performance and energy consumption. The stacked chips interconnect through a scalable 3D network on a chip (NoC). By simply changing the number of stacked accelerator chips, processor parallelism can be widely scaled. No design change is needed, and hence, no additional nonrecurring engineering (NRE) cost is required. An inductive-coupling ThruChip Interface (TCI) is applied to stacked-chip communications, forming a low-cost and robust high-speed 3D NoC. The authors developed a prototype system called Cube-1 with 65-nm CMOS test chips, and confirmed successful system operations, including 10 hours of continuous Linux OS operation. Simple filters and a streaming application were implemented on Cube-1 and performance acceleration up to about three times was achieved.
ASJC Scopus subject areas