Understanding TRANSFORMERS: Leaving RNNs Behind

Shivani Jadhav
1 min read · Jul 10, 2020

This post collects useful resources for understanding transformers.

I have listed the links in the order in which I read them:

  1. https://rubikscode.net/2019/07/29/introduction-to-transformers-architecture/ This one starts with an overview of RNNs, attention, and self-attention, and then discusses the basic concepts of transformers.
  2. http://jalammar.github.io/illustrated-transformer/ This is the most important post on the list: an in-depth, illustrated discussion of the Transformer architecture.
  3. https://glassboxmedicine.com/2019/09/07/universal-transformers/ This one first gives an overview of transformers in a simple, easy-to-follow structure, and then explains Universal Transformers with some animations.
  4. https://kazemnejad.com/blog/transformer_architecture_positional_encoding/ This one explains positional encoding, which is applied at the input of both the encoder and the decoder of the transformer (I just skimmed through it!). A small sketch of the idea follows this list.
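
Since positional encoding comes up in the last link, here is a minimal NumPy sketch of the sinusoidal scheme from "Attention Is All You Need" [1]; the function name and arguments are my own, for illustration only:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) matrix of sinusoidal position encodings.

    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))

    Assumes d_model is even.
    """
    positions = np.arange(seq_len)[:, np.newaxis]            # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]           # (1, d_model/2)
    angle_rates = 1.0 / np.power(10000.0, dims / d_model)    # one frequency per dimension pair
    angles = positions * angle_rates                         # (seq_len, d_model/2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even indices get sine
    pe[:, 1::2] = np.cos(angles)   # odd indices get cosine
    return pe

# The encoding is simply added to the token embeddings before
# the first encoder/decoder layer:
# embeddings = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```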

I would recommend studying transformers and Universal Transformers together, i.e., one after the other.

After going through these posts, it is much easier to understand the original Transformer paper, "Attention is all you need" [1], and the Universal Transformers paper [2].
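
As a quick reference while reading [1]: the core operation of the paper is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. Here is a minimal NumPy sketch (the names are mine, not from the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> output (n_q, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V               # weighted sum of the values
```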

References

[1] Vaswani, Ashish, et al. “Attention is all you need.” Advances in neural information processing systems. 2017.

[2] Dehghani, Mostafa, et al. “Universal transformers.” arXiv preprint arXiv:1807.03819 (2018).
