A hands-on workshop series building up to GPT from scratch, following Andrej Karpathy's Neural Networks: Zero to Hero course. Organized by Headstarter and led by Saad Jamal.
This series walks through the core concepts of neural networks step by step, from the definition of a derivative to training character-level language models. Each lecture is a self-contained Jupyter notebook with code, explanations, and visualizations.
## lecture1.ipynb
- Derivatives from first principles: limit definition, slopes, and signs
- `Value` object: building an autograd scalar with operator overloading (`+`, `*`, `tanh`, `exp`, `pow`)
- Computation graphs: visualizing the DAG with Graphviz
- Chain rule & backpropagation: manually computing gradients, then automating it with topological sort
- Single neuron: forward pass through `w * x + b` with tanh activation
- Multi-layer perceptron: `Neuron`, `Layer`, and `MLP` classes from scratch
- Training loop: forward pass, MSE loss, backward pass, gradient descent
- PyTorch comparison: verifying gradients match PyTorch's autograd
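The core of lecture 1 can be condensed into a minimal micrograd-style sketch — a simplified illustration of the `Value` object, backpropagation via topological sort, and a single neuron, not the notebook's exact code:

```python
import math

class Value:
    """Scalar that records the ops applied to it so gradients can flow back."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad           # d(a+b)/da = 1
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t * t) * out.grad  # d(tanh x)/dx = 1 - tanh^2 x
        out._backward = _backward
        return out

    def backward(self):
        # Topological sort so each node's gradient is fully accumulated
        # before it is propagated to its children.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Single neuron: o = tanh(w*x + b), with made-up values for w, x, b
x, w, b = Value(2.0), Value(-3.0), Value(6.5)
o = (w * x + b).tanh()
o.backward()
print(o.data, x.grad, w.grad)
```

The gradients can be checked by hand with the chain rule: `x.grad` should equal `(1 - tanh(w*x + b)**2) * w`, which is exactly what the notebook verifies against PyTorch's autograd.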
## lecture2.ipynb
- Character-level language modeling: predicting the next character from the previous one
- Bigram statistics: counting character pair frequencies from a 32K name dataset
- Probability distributions: normalizing counts, sampling with `torch.multinomial`
- Broadcasting semantics: practical exercises with PyTorch tensor operations
- Maximum likelihood estimation: log likelihood, negative log likelihood as a loss function
- Smoothing: handling zero-count bigrams with additive smoothing
- Neural network approach: one-hot encoding, logits, softmax, and gradient descent to learn the same bigram model
- Regularization: penalizing large weights to produce smoother distributions
- Teaser for next lecture: building the dataset for a trigram / MLP model (Bengio et al., 2003)
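The counting side of lecture 2 can be sketched in plain Python — the notebook itself works with PyTorch tensors and samples with `torch.multinomial`, and the five names below are a stand-in for the real 32K-name dataset:

```python
import math
from collections import Counter

# Toy corpus standing in for the 32K-name dataset.
words = ["emma", "olivia", "ava", "isabella", "mia"]

# Count bigrams, with '.' marking both the start and end of a name.
counts = Counter()
for w in words:
    chs = ['.'] + list(w) + ['.']
    for a, b in zip(chs, chs[1:]):
        counts[(a, b)] += 1

alphabet = sorted({c for pair in counts for c in pair})

def prob(a, b, k=1):
    """P(b | a) with add-k smoothing so unseen bigrams never get zero mass."""
    total = sum(counts[(a, c)] for c in alphabet)
    return (counts[(a, b)] + k) / (total + k * len(alphabet))

# Average negative log likelihood of the training set: lower is better.
nll, n = 0.0, 0
for w in words:
    chs = ['.'] + list(w) + ['.']
    for a, b in zip(chs, chs[1:]):
        nll += -math.log(prob(a, b))
        n += 1
print(f"avg NLL: {nll / n:.4f}")
```

Because of the `+ k` in both numerator and denominator, each conditional distribution still sums to one, and the NLL stays finite even for bigrams that never occur in the data.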
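The neural-network version of the same model can likewise be sketched in plain Python; the bigram pairs and hyperparameters below are illustrative, not the notebook's actual setup. The key observation is that a one-hot input vector times a weight matrix just selects a row of `W`, so that row directly serves as the logits:

```python
import math, random

random.seed(0)
V = 27  # vocabulary size in the notebook: 26 letters plus the '.' token

# One weight row per input character: its logits for the next character.
W = [[random.gauss(0, 0.1) for _ in range(V)] for _ in range(V)]

def softmax(logits):
    m = max(logits)                     # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

# A few (input index, target index) bigram pairs; hypothetical toy data.
pairs = [(0, 5), (5, 13), (13, 0)]

lr, lam = 10.0, 0.01  # learning rate and L2 regularization strength
for step in range(100):
    loss = 0.0
    grad = [[0.0] * V for _ in range(V)]
    for ix, iy in pairs:
        p = softmax(W[ix])              # one-hot input selects row ix of W
        loss += -math.log(p[iy])        # negative log likelihood of target
        for j in range(V):              # dL/dlogit_j = p_j - 1[j == target]
            grad[ix][j] += p[j] - (1.0 if j == iy else 0.0)
    loss /= len(pairs)
    for i in range(V):
        for j in range(V):
            W[i][j] -= lr * (grad[i][j] / len(pairs) + lam * W[i][j])
print(f"final avg NLL: {loss:.4f}")
```

The `lam * W[i][j]` term is the regularizer from the lecture: it pulls all weights toward zero, which after the softmax produces smoother, less peaky distributions, at the cost of a slightly higher training NLL.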
## Resources

- Neural Networks: Zero to Hero (YouTube) by Andrej Karpathy
- micrograd: Karpathy's autograd engine
- A Neural Probabilistic Language Model (Bengio et al., 2003)
