Field programmable gate array based convolution neural network hardware accelerator with optimized memory controller
Convolution Neural Network (CNN) is a special kind of neural network that is inspired by the behaviour of optic nerves in living creatures. CNN is gaining more and more attention nowadays because of the increased demand for high speed and lowcost synthetic vision systems. However, CNN can be both co...
Saved in:
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: |
2020
|
Subjects: | |
Online Access: | http://eprints.utm.my/id/eprint/92997/1/MohammedIsamEldinMSKE2020.pdf http://eprints.utm.my/id/eprint/92997/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:135883 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Convolution Neural Network (CNN) is a special kind of neural network that is inspired by the behaviour of optic nerves in living creatures. CNN is gaining more and more attention nowadays because of the increased demand for high speed and lowcost synthetic vision systems. However, CNN can be both compute- and memoryintensive. For that reason, implementation in a general-purpose processor will be slow and inefficient. Therefore, this project proposes a flexible CNN hardware accelerator that targets the Field Programmable Gate Array (FPGA) platform and features an optimized memory controller to reduce redundancy memory access. The main advantage of this project is that the accelerator is flexible - meaning that the user of the accelerator has the capability of modifying the architecture using parameterization to optimize for execution speed, resource utilization, and power consumption. The accelerator employs various hardware design techniques like loop unrolling, pipelining, optimized memory controller, and others to achieve the targeted performance. The accelerator is written in System Verilog language using Xilinx’s Vivado software and is tested using a single convolution layer from several selected CNN architectures. Then, it is compared against the same convolution layer implemented in Matlab. The proposed accelerator shows a huge speedup compared to the software counterpart of up to 4251X speed up with reasonable resource utilization and consumes only 0.27 W per layer. |
---|