ChatGPT, OpenAI’s newest model, is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback, and it is taking the world by storm!
Sponsor: Weights & Biases.
https://wandb.me/yannic
OUTLINE:
0:00 — Intro.
0:40 — Sponsor: Weights & Biases.
3:20 — ChatGPT: How does it work?
5:20 — Reinforcement Learning from Human Feedback.
7:10 — ChatGPT Origins: The GPT-3.5 Series.
8:20 — OpenAI’s strategy: Iterative Refinement.
9:10 — ChatGPT’s amazing capabilities.
14:10 — Internals: What we know so far.
16:10 — Building a virtual machine in ChatGPT’s imagination (insane)
20:15 — Jailbreaks: Circumventing the safety mechanisms.
29:25 — How OpenAI sees the future.
References: