![]() The results show that LACS achieves an extremely high computational efficiency, 98.51% when executing AlexNet and 99.66% when executing VGG-16, significantly exceeding state-of-the-art accelerators. Finally, we deploy LACS on a field-programmable gate array (FPGA) chip and analyze the factors that affect the computational efficiency of LACS. We reduce the amount of data movements between registers and the on-chip buffer from O(k×k) to O(k) by a bypass buffer mechanism. To adapt to different CNNs and to eliminate the influences of padding-zero operations, filling-zero operations and stride length on the computational efficiency of the accelerator, we draw upon the computation pattern of CPUs to design an efficient and versatile CNN accelerator, LACS (Loading-Addressing-Computing-Storing). With the increasing depth of CNNs, the invalid calculations caused by padding-zero operations, filling-zero operations and stride length (stride length>1) represent an increasing proportion of all calculations. The techniques investigated in this review are expected to direct future research challenges of hardware optimization for high-performance computations.Ĭonvolutional neural networks (CNNs) have become continually deeper. In this review, we illustrate the recent developments in hardware for accelerating the efficient implementation of deep learning networks with enhanced performance. Similarly, different algorithms have been adapted to design a DSP processor compatible for fast performance in neural network, activation function, convolutional neural network and generative adversarial network. ![]() Hence, novel dark silicon constraint is developed to reduce the heat dissipation without sacrificing the accuracy. Moreover, if a huge silicon area is powered to accelerate the operation using parallel computation, the ICs will be having significant chance of burning out due to the considerable generation of heat. Presently available DSP processors are incapable of performing these operations and they mostly face the problems, for example, memory overhead, performance drop and compromised accuracy. These operations involve huge calculation and memory overhead. Recently, neural network and deep learning has been started to impact the present research paradigm significantly which consists of parameters in the order of millions, nonlinear function for activation, convolutional operation for feature extraction, regression for classification, generative adversarial networks. To meet these needs, the hardware architecture should be reliable and robust to these problems. In recent times, the trend in very large scale integration (VLSI) industry is multi-dimensional, for example, reduction of energy consumption, occupancy of less space, precise result, less power dissipation, faster response. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |