Learning to Route by Task for Efficient Inference

3 · Google AI Research · Jan. 14, 2022, 7:24 p.m.
Posted by Sneha Kudugunta, Research Software Engineer, and Orhan Firat, Research Scientist, Google Research

Scaling large language models has resulted in significant quality improvements in natural language understanding (T5), generation (GPT-3), and multilingual neural machine translation (M4). One common approach to building a larger model is to increase the depth (number of layers) and width (layer dimensionality), simply enlarging existing dimensions of the network. Such dense models take an inp...