Improving Sparse Training with RigL

Google AI Research · September 16, 2020
Posted by Utku Evci and Pablo Samuel Castro, Research Engineers, Google Research, Montreal

Modern deep neural network architectures are often highly redundant [1, 2, 3], making it possible to remove a significant fraction of connections without harming performance. The resulting sparse neural networks have been shown to be more parameter- and compute-efficient than their dense counterparts and, in many cases, can significantly reduce wall-clock inference time. By far the most popular method f...
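To make the idea concrete, here is a minimal NumPy sketch (not from the post; the function names and the magnitude-based selection rule are illustrative assumptions) of what such a sparse layer looks like: a binary mask removes most connections, and only the surviving weights participate in the forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sparse_layer(n_in, n_out, density=0.1):
    """Return (masked_weights, mask) for a layer keeping only `density` of its connections.

    Hypothetical helper: the mask here keeps the largest-magnitude weights,
    which is just one simple way to choose which connections to remove.
    """
    weights = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
    k = int(density * weights.size)
    threshold = np.sort(np.abs(weights), axis=None)[-k]
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

def sparse_forward(x, masked_weights):
    """Forward pass: zeroed connections contribute nothing, so in principle
    most multiply-adds could be skipped by a sparse kernel."""
    return np.maximum(x @ masked_weights, 0.0)  # ReLU activation

w, mask = make_sparse_layer(784, 256, density=0.1)
print(f"fraction of nonzero connections: {mask.mean():.2f}")
```

In this toy example roughly 90% of the connections are removed, yet the layer's interface is unchanged; realizing the compute savings in practice depends on kernels that exploit the sparsity.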