Deep Dive into LLMs like ChatGPT | Andrej Karpathy · 260429
Andrej walks through the entire pipeline of building an LLM: downloading the internet, tokenization, pre-training, and the key stages of creating something like ChatGPT.
Detailed walkthrough of the pre-training stage: data collection (FineWeb dataset), tokenization, the massive compute requirements, and what the model actually learns from internet-scale data.
How supervised fine-tuning and RLHF transform a base model into a helpful assistant. The difference between a model that predicts text and one that follows instructions.
Andrej discusses the cognitive and psychological implications of LLMs — what they tell us about human intelligence, memory, and reasoning. Are LLMs thinking or just pattern matching?
Practical advice on prompt engineering, understanding LLM limitations (hallucinations, sharp edges), and getting the most out of tools like ChatGPT.