Recent advances in deep learning and the exponential growth in the use of machine learning across application domains have made AI acceleration critically important. IBM Research has been building a pipeline of AI hardware accelerators to meet this need. At the 2018 VLSI Circuits Symposium, we presented a multi-TeraOPS accelerator core building block that can be scaled across a broad range of AI hardware systems. This digital AI core combines a parallel architecture that ensures very high utilization with efficient compute engines that carefully leverage reduced precision.