Breaking Shackles: Hardware Acceleration Beyond the CPU

  • Suhaib Fahmy, Reader, School of Engineering, University of Warwick, UK

KAUST

Overview

Abstract

The accelerating evolution of algorithms and techniques in popular application domains, coupled with the exponentially rising cost and complexity of fabricating custom silicon, is driving interest in alternative hardware acceleration platforms that offer more flexibility and faster time to market. While large corporations can spin up a design cycle at short notice to address a pressing high-demand application, as Google has done with the TPU architecture, this model is not generalizable. Field Programmable Gate Arrays (FPGAs) have a long history of use in accelerating signal processing applications, as they allow the design of custom architectures that reflect the inherent parallelism in those algorithms, leading to highly performant and efficient computation. More recently, FPGAs have expanded in capability with embedded computational elements, tighter high-throughput interfacing, and hybrid System-on-Chip (SoC) architectures. Hence, they have begun finding favor as general hardware accelerators in a variety of contexts. This talk discusses a body of work on exploiting the DSP blocks in modern FPGAs to construct high-performance datapaths, including the concept of FPGA overlays. It outlines work that established FPGAs as a viable virtualized cloud acceleration platform, and how industry has adopted this model. Finally, it discusses recent work on incorporating accelerated processing in network controllers and the emerging concept of in-network computing with FPGAs. These strands of work come together to demonstrate the value of thinking about computing beyond the CPU-centric view that still dominates.

Brief Biography

Suhaib Fahmy is Reader (Associate Professor) in Computer Engineering at the University of Warwick, where he leads the Connected Systems Research Group and the Adaptive Reconfigurable Computing Lab. He is also a Turing Fellow at The Alan Turing Institute in London. He received MEng and PhD degrees from Imperial College London in 2003 and 2008, respectively. After a postdoc at Trinity College Dublin in collaboration with Xilinx Research Labs, Ireland, he moved to Nanyang Technological University, Singapore, in 2009, before returning to the UK in 2015. His research explores the use of reconfigurable architectures for accelerating complex computations and the tighter coupling of computation and communication in embedded and larger networks. He received the Best Paper Award at the IEEE Conference on Field-Programmable Technology (FPT) in 2012, IBM Faculty Awards in 2013 and 2017, the Community Award at the International Conference on Field-Programmable Logic and Applications (FPL) in 2016, and the ACM Transactions on Design Automation of Electronic Systems Best Paper Award in 2019. He serves on the ACM Technical Committee on FPGAs and Reconfigurable Computing and has chaired multiple conferences. Dr Fahmy is a Senior Member of the IEEE, a Senior Member of the ACM, a Chartered Engineer and Member of the IET, and a Fellow of the Higher Education Academy.
