Course

Course lectures and talks based on the RLHF Book, built with Colloquium. Click into a deck to navigate through its slides, or open it in full screen.

Lectures

Lecture 1: Overview

Chapters 1-3 · Foundations of RLHF and post-training

Watch · PDF · Slides · Source

Lecture 2: IFT, Reward Models, & Rejection Sampling

Chapters 4, 5, 9 · Beginning the core optimization methods section

Watch · PDF · Slides · Source

Lecture 3: RL Motivation & Math

Chapter 6, Part 1 · Policy gradients math, intuitions, and theory

PDF · Slides · Source

Lecture 4: RL Implementation & Practice

Chapter 6, Part 2 · Code, loss aggregation, async training, and practical engineering

PDF · Slides · Source

Other Lectures

2026

An Introduction to Reinforcement Learning from Human Feedback and Post-training

SALA 2026 · Quito, Ecuador · March 2026

Invited Talk · PDF · Full Screen · Source

Citation

If you found this useful for your research, please cite it!

@book{rlhf2026lambert,
  author = {Nathan Lambert},
  title = {Reinforcement Learning from Human Feedback},
  year = {2026},
  publisher = {Online},
  url = {https://rlhfbook.com}
}

Welcome to the Course