A technical paper titled “SpikeHard: Efficiency-Driven Neuromorphic Hardware for Heterogeneous Systems-on-Chip” was published by researchers at Columbia University.
“Neuromorphic computing is an emerging field with the potential to offer performance and energy-efficiency gains over traditional machine learning approaches. Most neuromorphic hardware, however, has been designed with limited concerns to the problem of integrating it with other components in a heterogeneous System-on-Chip (SoC). Building on a state-of-the-art reconfigurable neuromorphic architecture, we present the design of a neuromorphic hardware accelerator equipped with a programmable interface that simplifies both the integration into an SoC and communication with the processor present on the SoC. To optimize the allocation of on-chip resources, we develop an optimizer to restructure existing neuromorphic models for a given hardware architecture, and perform design-space exploration to find highly efficient implementations. We conduct experiments with various FPGA-based prototypes of many-accelerator SoCs, where Linux-based applications running on a RISC-V processor invoke Pareto-optimal implementations of our accelerator alongside third-party accelerators. These experiments demonstrate that our neuromorphic hardware, which is up to 89× faster and 170× more energy efficient after applying our optimizer, can be used in synergy with other accelerators for different application purposes.”
Find the technical paper here. Published September 2023.
Clair, Judicael, Guy Eichler, and Luca P. Carloni. “SpikeHard: Efficiency-Driven Neuromorphic Hardware for Heterogeneous Systems-on-Chip.” ACM Transactions on Embedded Computing Systems 22, no. 5s (2023): 1-22.
Programming Processors In Heterogeneous Architectures
Optimizing PPA for different processor types requires very different approaches, and all of them are now included in the same design.
Dealing With Performance Bottlenecks In SoCs
SoCs keep adding processing cores, but they are less likely to be fully utilized because the real bottlenecks are not being addressed.