摘要
:
With the rapid development of Artificial Intelligence, Internet of Things, 5G, and other technologies, a number of emerging intelligent applications represented by image recognition, voice recognition, autonomous driving, and inte...
展开
With the rapid development of Artificial Intelligence, Internet of Things, 5G, and other technologies, a number of emerging intelligent applications represented by image recognition, voice recognition, autonomous driving, and intelligent manufacturing have appeared. These applications require efficient and intelligent processing systems for massive data calculations, so it is urgent to apply better DNN in a faster way. Although, compared with GPU, FPGA has a higher energy efficiency ratio, and shorter development cycle and better flexibility than ASIC. However, FPGA is not a perfect hardware platform either for computational intelligence. This paper provides a survey of the latest acceleration work related to the familiar DNNs and proposes three new directions to break the bottleneck of the DNN implementation. So as to improve calculating speed and energy efficiency of edge devices, intelligent embedded approaches including model compression and optimized data movement of the entire system are most commonly used. With the gradual slowdown of Moore's Law, the traditional Von Neumann Architecture generates a "Memory Wall" problem, resulting in more power-consuming. In-memory computation will be the right medicine in the post-Moore law era. More complete software/hardware co-design environment will direct researchers' attention to explore deep learning algorithms and run the algorithm on the hardware level in a faster way. These new directions start a relatively new paradigm in computational intelligence, which have attracted substantial attention from the research community and demonstrated greater potential over traditional techniques.
收起