Constructive Synthesis of Memory-Intensive Accelerators for FPGA From Nested Loop Kernels
Date(s) |
15/08/2016
|
---|---|
Abstract |
Field-programmable gate arrays are ideal hosts to custom accelerators for signal, image, and data processing but demand manual register transfer level design if high performance and low cost are desired. High-level synthesis reduces this design burden but requires manual design of complex on-chip and off-chip memory architectures, a major limitation in applications such as video processing. This paper presents an approach to resolve this shortcoming. A constructive process is described that can derive such accelerators, including on- and off-chip memory storage from a C description such that a user-defined throughput constraint is met. By employing a novel statement-oriented approach, dataflow intermediate models are derived and used to support simple approaches for on-/off-chip buffer partitioning, derivation of custom on-chip memory hierarchies and architecture transformation to ensure user-defined throughput constraints are met with minimum cost. When applied to accelerators for full search motion estimation, matrix multiplication, Sobel edge detection, and fast Fourier transform, it is shown how real-time performance up to an order of magnitude in advance of existing commercial HLS tools is enabled whilst including all requisite memory infrastructure. Further, optimizations are presented that reduce the on-chip buffer capacity and physical resource cost by up to 96% and 75%, respectively, whilst maintaining real-time performance. |
Format |
application/pdf |
Identifier | |
Language(s) |
eng |
Rights |
info:eu-repo/semantics/openAccess |
Source |
Milford, M & McAllister, J 2016, 'Constructive Synthesis of Memory-Intensive Accelerators for FPGA From Nested Loop Kernels', IEEE Transactions on Signal Processing, vol. 64, no. 14, pp. 4152-4165. DOI: 10.1109/TSP.2016.2566608 |
Type |
article |