Accelerating AI Algorithms Through a RISC-V Processor Grid


Team Members: Ida Yosef, Abergel Nachman

Supervisors / Mentors: Prof. Freddy Gabbay

 

The rise of AI applications has led to increased computational demands, especially on the memory system. These workloads are naturally data-parallel, which can be exploited to improve performance. However, traditional massively parallel microarchitectures such as GPUs typically require each core to access memory independently and do not support direct inter-processor communication. This limitation reduces memory access efficiency, increases memory bandwidth pressure, lowers overall performance, and incurs power overhead. The main idea behind this project is to design a multi-core RISC-V processor grid that exploits data parallelism while improving data reuse and arithmetic intensity. By allowing cores to share data with their neighbors, the system eliminates redundant memory accesses. This approach aims to lower bandwidth usage and improve energy efficiency and arithmetic intensity.
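
To make the data-reuse idea concrete, the following is a minimal C sketch, not the project's implementation: a toy 1D tiling split across two neighboring cores, where the boundary (halo) elements are forwarded from one core to the other instead of being re-fetched from memory. The neighbor_send()/neighbor_recv() primitives and the mailbox standing in for the core-to-core link are assumptions invented for illustration.

```c
#include <stdio.h>

#define TILE 64      /* elements processed per core           */
#define HALO 2       /* boundary elements shared by neighbors */

/* Hypothetical mailbox standing in for a core-to-core link. */
static float mailbox[HALO];

static void neighbor_send(const float *data, int n) {
    for (int i = 0; i < n; i++) mailbox[i] = data[i];   /* core A pushes its edge  */
}
static void neighbor_recv(float *data, int n) {
    for (int i = 0; i < n; i++) data[i] = mailbox[i];   /* core B pops it locally  */
}

/* Core B loads only its own TILE elements from memory and receives the
 * HALO elements it needs from core A, instead of issuing HALO extra
 * (redundant) memory loads as an independent GPU-style core would. */
static float boundary_op(const float *tile, const float *halo) {
    float acc = 0.0f;
    for (int i = 0; i < HALO; i++)
        acc += tile[TILE - HALO + i] * halo[i];          /* toy 2-tap boundary op */
    return acc;
}

int main(void) {
    float core_a_tile[TILE], core_b_tile[TILE], halo[HALO];
    for (int i = 0; i < TILE; i++) { core_a_tile[i] = (float)i; core_b_tile[i] = (float)(i + TILE); }

    /* Grid approach: core A forwards its last HALO elements to core B. */
    neighbor_send(&core_a_tile[TILE - HALO], HALO);
    neighbor_recv(halo, HALO);

    printf("boundary result = %f\n", boundary_op(core_b_tile, halo));
    printf("memory loads saved at this tile boundary: %d\n", HALO);
    return 0;
}
```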

In this project, we designed an Inter-Processor Router (IPR) to enable efficient communication between cores. The IPR was integrated into the PULP platform cluster, which features 32-bit, 4-stage CV32E40P (RI5CY) cores. The IPR is memory-mapped into each core's address space, allowing software to push and pop communication packets. With this extension, cores in the cluster exchange data directly through their IPRs rather than through shared memory, enhancing parallel processing efficiency and eliminating redundant memory accesses. We measured that fetching data from memory requires 16 clock cycles, whereas accessing data through the IPRs takes only 2 clock cycles. By running programs that leverage the IPRs for data transfer, we aim to achieve a 20% reduction in energy consumption.
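
As a rough illustration of how software might use the memory-mapped IPR, here is a hedged C sketch of push and pop routines. The base address, register offsets, and status-bit layout are placeholders invented for the example and are not the actual register map of the IPR integrated into the PULP cluster.

```c
#include <stdint.h>

/* Placeholder register map -- NOT the real IPR addresses or bit layout. */
#define IPR_BASE        0x10204000u                               /* assumed base */
#define IPR_TX_FIFO     (*(volatile uint32_t *)(IPR_BASE + 0x00)) /* push packet  */
#define IPR_RX_FIFO     (*(volatile uint32_t *)(IPR_BASE + 0x04)) /* pop packet   */
#define IPR_STATUS      (*(volatile uint32_t *)(IPR_BASE + 0x08)) /* FIFO flags   */
#define IPR_TX_FULL     (1u << 0)                                 /* assumed bits */
#define IPR_RX_EMPTY    (1u << 1)

/* Push one 32-bit packet toward a neighboring core. A single store to the
 * memory-mapped FIFO replaces a round trip through memory (about 2 cycles
 * versus 16 cycles in our measurements). */
static inline void ipr_push(uint32_t packet)
{
    while (IPR_STATUS & IPR_TX_FULL)
        ;                      /* spin until the outgoing FIFO has room */
    IPR_TX_FIFO = packet;      /* ordinary store completes the push     */
}

/* Pop one 32-bit packet sent by a neighboring core. */
static inline uint32_t ipr_pop(void)
{
    while (IPR_STATUS & IPR_RX_EMPTY)
        ;                      /* spin until a packet has arrived       */
    return IPR_RX_FIFO;        /* ordinary load completes the pop       */
}
```

In a kernel partitioned across the cluster, each core would call ipr_push() to forward reusable operands (for example, halo or partial-sum values) to its neighbor and ipr_pop() to consume incoming ones, so shared data is fetched from memory only once.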