I’ve recently had to learn a lot about natural language processing (NLP), specifically Transformer-based NLP models. Similar to my previous blog post on deep autoregressive models, this blog post is a write-up of my reading and research: I assume basic familiarity with deep learning, and aim to highlight general trends in deep NLP, instead of commenting on individual architectures or systems. As a disclaimer, this post is by no means exhaustive and is biased towards Transformer-based models, whi...