From Transformer to LLM: Architecture, Training and Usage

Transformer Tutorial Series

Attention

In this session, we walked through the architecture, training and applications of transformers (slides). The lecture slides covered:

  • The basic principles of NLP
  • Basics of the attention mechanism and the transformer
  • Training language models (language modelling objective)
  • Usage of pretrained models (finetuning vs prompting)
  • Applications of transformers beyond language (vision, audio, music, image generation, games and control)
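
As a quick reminder of the attention mechanism listed above, here is a minimal, dependency-free sketch of scaled dot-product attention for a single head. It is only an illustration, not the code from the slides or notebooks: the function names `softmax` and `attention`, the `causal` flag, and the toy inputs are all assumptions made for this example, and real implementations use batched tensor operations and learned projection matrices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention on lists of vectors (hypothetical helper).

    queries, keys: lists of vectors of length d_k
    values: list of vectors, one per key
    causal: if True, position i may only attend to positions j <= i
    """
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Similarity of the query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        if causal:
            # Mask out future positions so they get zero attention weight.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy self-attention: the same three 2-d vectors act as queries, keys and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextualised = attention(tokens, tokens, tokens, causal=True)
```

With `causal=True` the first position can only attend to itself, so its output equals its own value vector; this masking is what lets a GPT-style model be trained by next-token prediction.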

Jupyter Notebook Tutorial Series

We prepared this series of Jupyter notebooks to give you hands-on experience with transformers, from their architecture to their training and usage.

  • Fundamentals of Transformer and Language modelling
  • Beyond Language:
    In the following notebooks, we demonstrate the flexibility of the transformer model by applying it to tasks beyond language:
    •  Learning to do arithmetic by sequence modelling.
      In this notebook, you will train a GPT-2 model on an arithmetic dataset and let it learn (partially) to do arithmetic by next-token prediction.
    •  Image generation by sequence modelling.
      In this notebook, you will train a GPT-2-like transformer for generative modelling of MNIST images by predicting the sequence of patches in an image.
    •  Audio signal classification (~ 20 min)
      In this notebook, you will train a transformer on the Spoken MNIST dataset to classify audio sequences.
    •  Image classification  (~ 30 min)
      In this notebook, you will train a transformer on images, formatted as sequences of patches, to predict the identity of each image.
    • Music generation by sequence modelling. (Difficult, training takes hours)
      In this notebook, you will train a transformer to predict the next note in a music dataset consisting of piano rolls. The trained model can then be used to generate classical piano music.
  • Using Large Language Model
    Finally, we will get a glimpse of LLMs by using the OpenAI APIs to accomplish some useful tasks.
  • Official GitHub repo
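
The image notebooks above treat an image as a sequence of patches. The sketch below shows one way this conversion might look, using plain Python lists so that it runs without any dependencies. It is not taken from the notebooks: the helper names `image_to_patches` and `next_patch_pairs`, the patch size of 7, and the synthetic image are assumptions chosen for illustration.

```python
def image_to_patches(image, patch_size):
    """Split a 2-D image (list of rows) into a row-major sequence of
    flattened, non-overlapping square patches (hypothetical helper)."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def next_patch_pairs(patches):
    """Build (input, target) sequences for next-patch prediction:
    the target sequence is the input sequence shifted by one patch."""
    return patches[:-1], patches[1:]


# A synthetic 28x28 "image" (the size of an MNIST digit), with each pixel
# set to its own flat index so the patch layout is easy to inspect.
image = [[row * 28 + col for col in range(28)] for row in range(28)]

patches = image_to_patches(image, patch_size=7)  # 16 patches of 49 pixels each
inputs, targets = next_patch_pairs(patches)      # 15 input/target patch pairs
```

For classification, the whole patch sequence would be fed to the transformer and a label predicted from it; for generation, the shifted pairs play the same role that next-token pairs play in language modelling.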

ChatPDF

Related material 

Related ML from Scratch tutorials

Class: 

Machine Learning from Scratch
