Transformer Tutorial Series
In this session, we walked through the architecture, training, and applications of transformers (slides). The lecture slides covered:
- The basic principles of NLP
- Basics of the attention mechanism and the transformer architecture
- Training language models (language modelling objective)
- Usage of pretrained models (finetuning vs prompting)
- Applications of transformers beyond language (vision, audio, music, image generation, games & control)
Jupyter Notebook Tutorial Series
We prepared this series of Jupyter notebooks for you to gain hands-on experience with transformers, from their architecture to their training and usage.
Fundamentals of Transformer and Language modelling
- Understanding Attention & Transformer from Scratch
  In this tutorial, you will implement the attention mechanism and a GPT model from scratch to gain a deeper understanding of their structure.
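  As a taste of what you will build there, here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch; names and dimensions are illustrative, not taken from the notebook.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x: (batch, seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_head) projection matrices
    """
    q = x @ w_q                                             # queries (batch, seq, d_head)
    k = x @ w_k                                             # keys    (batch, seq, d_head)
    v = x @ w_v                                             # values  (batch, seq, d_head)
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5   # similarity of every pair of positions
    weights = F.softmax(scores, dim=-1)                     # attention weights over positions
    return weights @ v                                      # (batch, seq, d_head)

# toy example: batch of 2 sequences, 5 tokens, 16-dim embeddings, 8-dim head
x = torch.randn(2, 5, 16)
w_q, w_k, w_v = (torch.randn(16, 8) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([2, 5, 8])
```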
- Language modelling and pretrained transformers
  In this notebook, you will look into the architectures of pretrained transformers (GPT / BERT), then train a GPT-2 model to "speak" a simplified English constructed with a context-free generative grammar, and observe how it learns syntactic rules and word meanings.
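  For a concrete picture of the training data, here is a hedged sketch of sampling sentences from a toy context-free grammar; the grammar actually used in the notebook will be different.

```python
import random

# a toy context-free grammar (illustrative; not the notebook's grammar)
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N":   [["dog"], ["cat"], ["bird"]],
    "V":   [["sees"], ["chases"]],
}

def sample(symbol="S"):
    """Expand a symbol by recursively picking random productions."""
    if symbol not in GRAMMAR:          # terminal symbol: return the word itself
        return [symbol]
    production = random.choice(GRAMMAR[symbol])
    return [word for sym in production for word in sample(sym)]

print(" ".join(sample()))  # e.g. "the dog chases a cat"
```

  Sentences sampled this way form the corpus on which the GPT-2 model is trained with the usual language-modelling (next-token prediction) objective.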
Beyond Language
In the following notebooks, we will demonstrate the flexibility of the transformer model beyond language:
- Learn to do arithmetic by sequence modelling
  In this notebook, you will train GPT-2 on an arithmetic dataset and let it learn to do arithmetic (partially) by next-token prediction.
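  To illustrate the setup, the sketch below generates addition problems as plain strings and shows how the next-token (here, next-character) prediction objective applies to them; the notebook's exact data format may differ.

```python
import random

def make_example(max_value=99):
    """Render an addition problem as a plain text string, e.g. '17+25=42'."""
    a, b = random.randint(0, max_value), random.randint(0, max_value)
    return f"{a}+{b}={a + b}"

# a small corpus of arithmetic strings for next-token prediction
corpus = [make_example() for _ in range(5)]
print(corpus)

# the language-modelling objective: predict each character from its prefix
example = corpus[0]
for i in range(1, len(example)):
    print(f"context: {example[:i]!r:12} target: {example[i]!r}")
```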
- Image generation by sequence modelling
  In this notebook, you will train a GPT-2-like transformer for generative modelling of MNIST images by predicting the sequence of patches in an image.
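  The key step is turning an image into a sequence. Below is a minimal sketch, assuming 28x28 MNIST images and 7x7 patches; the notebook's patch size and ordering may differ.

```python
import numpy as np

def image_to_patch_sequence(img, patch=7):
    """Split a square image into non-overlapping patches, read in raster order.

    Returns an array of shape (num_patches, patch*patch) that a decoder-only
    transformer can model autoregressively, one patch after another.
    """
    h, w = img.shape
    rows = img.reshape(h // patch, patch, w // patch, patch)
    patches = rows.transpose(0, 2, 1, 3).reshape(-1, patch * patch)
    return patches

img = np.random.rand(28, 28)      # stand-in for an MNIST digit
seq = image_to_patch_sequence(img)
print(seq.shape)                  # (16, 49): a 4x4 grid of 7x7 patches
```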
- Audio signal classification (~20 min)
  In this notebook, you will train a transformer on the Spoken MNIST dataset to classify audio sequences.
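  As an illustration of how audio can be fed to a transformer, here is a toy PyTorch classifier that treats fixed-length waveform frames as tokens. It is a simplified stand-in for the notebook's model; positional encodings and real feature extraction (e.g. spectrograms) are omitted for brevity.

```python
import torch
import torch.nn as nn

class AudioClassifier(nn.Module):
    """Toy transformer encoder that classifies a sequence of audio frames."""
    def __init__(self, frame_size=200, d_model=64, n_classes=10):
        super().__init__()
        self.embed = nn.Linear(frame_size, d_model)           # frame -> token embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)             # digit logits

    def forward(self, frames):                                # frames: (batch, seq, frame_size)
        h = self.encoder(self.embed(frames))
        return self.head(h.mean(dim=1))                       # mean-pool over time

# one second of 8 kHz audio cut into 40 non-overlapping frames of 200 samples
wave = torch.randn(4, 8000)
frames = wave.reshape(4, 40, 200)
print(AudioClassifier()(frames).shape)                        # torch.Size([4, 10])
```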
- Image classification (~30 min)
  In this notebook, you will train a transformer on images, formatted as sequences of patches, to predict the class of each image.
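  A minimal ViT-style sketch of the idea, assuming the same 7x7 patch layout as above: patch embeddings plus a learned [CLS] token whose final state is used for classification. This is a simplification, not necessarily the architecture the notebook builds.

```python
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Minimal ViT-style classifier: patch embeddings + a learned [CLS] token."""
    def __init__(self, patch_dim=49, n_patches=16, d_model=64, n_classes=10):
        super().__init__()
        self.patch_embed = nn.Linear(patch_dim, d_model)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, patches):                    # patches: (batch, n_patches, patch_dim)
        tokens = self.patch_embed(patches)
        cls = self.cls_token.expand(tokens.shape[0], -1, -1)
        h = self.encoder(torch.cat([cls, tokens], dim=1) + self.pos_embed)
        return self.head(h[:, 0])                  # classify from the [CLS] position

patches = torch.randn(8, 16, 49)   # e.g. 28x28 images as 16 patches of 7x7 pixels
print(TinyViT()(patches).shape)    # torch.Size([8, 10])
```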
- Music generation by sequence modelling (difficult; training takes hours)
  In this notebook, you will train a transformer to predict the next note in a music dataset consisting of piano rolls. Once trained, the model can be used to generate classical piano music.
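  One simple way to turn a piano roll into a token sequence is to emit a token per active pitch at each time step, followed by a "time step" token. This is only an illustrative encoding, not necessarily the one used in the notebook.

```python
import numpy as np

STEP = 128  # token id reserved for "advance one time step"; ids 0-127 are MIDI pitches

def piano_roll_to_tokens(roll):
    """Encode a binary piano roll of shape (time_steps, 128) as a flat token sequence.

    At each time step we emit one token per active pitch, then a STEP token, so a
    decoder-only transformer can model the music by next-token prediction.
    """
    tokens = []
    for frame in roll:
        tokens.extend(np.flatnonzero(frame).tolist())  # active pitches at this step
        tokens.append(STEP)
    return tokens

roll = np.zeros((4, 128), dtype=np.int8)
roll[0, [60, 64, 67]] = 1                  # a C-major chord on the first step
roll[2, 72] = 1
print(piano_roll_to_tokens(roll))          # [60, 64, 67, 128, 128, 72, 128, 128]
```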
Using Large Language Models
Finally, we will get a glimpse of LLMs by using the OpenAI API to do some useful things:
- OpenAI API and Chat with PDF
  In this notebook, you will use the OpenAI API and langchain to build a bot that can chat with a given document, e.g. a scientific paper (replicating the functionality of Chat with PDF).
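  At its core, this amounts to wrapping a chat-completion call around text retrieved from the PDF. Here is a minimal sketch using only the OpenAI Python client (assuming the v1+ interface and that OPENAI_API_KEY is set); the langchain pipeline in the notebook adds document loading, chunking, and retrieval on top of this.

```python
from openai import OpenAI  # assumes the openai Python package (v1+) and OPENAI_API_KEY set

client = OpenAI()

# in the notebook, langchain retrieves the relevant chunks of the PDF;
# here we simply paste an excerpt into the prompt to show the core idea
excerpt = "…text extracted from the paper you want to chat about…"
question = "What problem does this paper try to solve?"

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative model name
    messages=[
        {"role": "system", "content": "Answer questions using only the provided excerpt."},
        {"role": "user", "content": f"Excerpt:\n{excerpt}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```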
- Official Github repo
Related material
- Attention & Transformers
- Usage of LLM
Related ML from Scratch tutorials
Class: Machine Learning from Scratch