Distributing control of deep learning training delivers 10x performance improvement

My IBM Research AI team and I recently completed the first formal theoretical study of the convergence rate and communication complexity of a decentralized distributed approach to deep learning training. Our empirical results show that, in specific configurations, a decentralized approach can deliver a 10x performance boost over a centralized approach without additional complexity. A paper describing our work has been accepted for oral presentation at the NIPS 2017 Conference, one of 40 papers, out of 3240 submissions, selected for an oral slot.