🧠
Frontier LLM V3
12 Parts · 41 Chapters
Version 3.0 · 2026

The Frontier LLM Curriculum

12 Parts · 41 Chapters · From Fundamentals to Frontier Systems

36 Chapters · 12 Parts · 10 Quizzes · 27 Terms
Choose a Learning Path
Novice Path
Ch 0 → Read straight through the stack
Engineer Track
Focus on Transformer + Production infrastructure
Researcher Route
Deep dive into Training, MoE, Reasoning
Executive View
Strategy, Competitive Moats, Frontier Thesis
🗺️ Part 0: Territory Map
⚑🎯
Ch 0
The Map of the Territory
The complete vertical stack, from raw data to AI-operated organizations
∑ Part 1: Mathematical Foundations
⚑🎯
Ch 1
Vectors
Words become points. Meaning becomes geometry.
Ch 2
Matrices
All of AI is matrix multiplication
Ch 3
Tensors
Multi-dimensional arrays that flow through every neural network
⚑🎯
Ch 4
Dot Product
The most important operation in AI: it measures similarity between vectors
Ch 5
Information Theory
The mathematics of surprise, uncertainty, and prediction
🎯
Ch 6
Cross Entropy Loss
The heartbeat of training: measures how wrong the model is
Ch 7
Optimization
Finding the lowest valley in a billion-dimensional landscape
🧠 Part 2: Neural Networks
Ch 8
The Artificial Neuron
A mathematical simplification of biological neurons
Ch 9
Activation Functions
Without them, the entire network collapses to a single linear operation
Ch 10
Residual Connections
Skip connections that enabled 100+ layer networks
⚑ Part 3: Transformer Architecture
Ch 11
Tokenization
Converting text into numbers: the first step of every LLM
Ch 12
Embeddings
Tokens become vectors; vectors become meaning
⚑🎯
Ch 13
Attention
The invention that changed AI: every token can see every other token simultaneously
Ch 14
The Transformer Block
The repeating unit, stacked 32 to 120+ times in frontier models
Ch 15
Decoder-Only Architecture
The dominant architecture of all modern LLMs
📈 Part 4: How Models Learn
⚑🎯
Ch 16
Backpropagation
The engine of learning: how errors flow backwards through the network
🎯 Part 5: Alignment Systems
🎯
Ch 17
RLHF
Reinforcement Learning from Human Feedback: what made ChatGPT useful
Ch 18
Constitutional AI
Anthropic's approach: principles over pure human labeling
Ch 19
DPO
Direct Preference Optimization: simpler and cheaper than RLHF
Ch 20
RLAIF
AI judging AI: scales where human feedback cannot
⚙️ Part 6: Inference Systems
Ch 21
What Is Inference?
Training creates the model. Inference makes the money.
⚑🎯
Ch 22
KV Cache
The single most important inference optimization: 10–100× speedup
Ch 23
FlashAttention
2–4× faster attention by solving the memory bandwidth bottleneck
Ch 24
vLLM & Continuous Batching
Modern serving: keep GPUs at 100% utilization
📊 Part 7: Evaluation Systems
Ch 27
Evaluation Systems
Without rigorous evaluation, you cannot know if you are improving
🖥️ Part 8: Distributed Training
Ch 31
Distributed Training
No single GPU can train GPT; you need thousands working in concert
🔀 Part 9: Mixture of Experts
⚑🎯
Ch 33
Mixture of Experts (MoE)
Why DeepSeek can compete with 10× less compute
💭 Part 10: Reasoning Models
🎯
Ch 34
Reasoning Models
Trading latency for accuracy: think before you answer
🤖 Part 11: Agents, RAG & MCP
Ch 35
RAG: Retrieval-Augmented Generation
Giving models access to knowledge beyond their training cutoff
Ch 36
MCP: Model Context Protocol
The USB-C for AI: standardizing how models connect to the world
Ch 37
Agent Systems
From answering questions to taking autonomous action
🚀 Part 12: Building Frontier AI
Ch 38
The Competitive Landscape
Who the players are and what advantages they hold in 2026
Ch 39
The Minimum Frontier Stack
What you actually need to compete at the frontier
Ch 40
Real Competitive Moats
Not transformers anymore; everyone has transformers
Ch 41
The Frontier Thesis
Four stages of AI evolution; we are in the Stage 2→3 transition