Discussion about this post

Wei Ren

> To make the raw model safe and helpful, it undergoes supervised fine-tuning and reinforcement learning.

This step is not part of the inference path. Supervised fine-tuning and reinforcement learning are both post-training methods: they update the model's weights before deployment, not while it is serving a request.
