Introduction to Transformer

In this seminar, we introduced the fundamental Transformer model, the training of GPT and BERT, ViT (Vision Transformer), and linear attention. We also discussed the Mamba model. Here are the lecture notes and video.

Transformer