Implementing our own RLHF?

Hi, has anyone implemented their own RLFH process? I'm wondering whether to

  • build a new model from the annotated, human-verified answers AND
  • ask ChatGPT to generate multiple responses (with explanations) AND
  • pick the one that gets the best score from the RLFH model

or whether I should just build an embedding DB with the annotated data and enrich the prompt.
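To make the two options concrete, here is a rough Python sketch of what I have in mind. Everything in it is a placeholder: `ANNOTATED` is made-up sample data, `embed` is a toy bag-of-words stand-in for a real embedding model, and `reward` is a stand-in for a trained scoring model, not an actual one. The candidate responses would come from ChatGPT; here they are just passed in as a list.

```python
import math
from collections import Counter

# Hypothetical annotated data: (question, human-verified answer) pairs.
ANNOTATED = [
    ("how do I reset my password", "Use the 'Forgot password' link on the login page."),
    ("how do I change my email", "Open Settings, then Account, then edit the email field."),
]

def embed(text):
    """Toy bag-of-words 'embedding' (assumption: a real setup would use a learned model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def reward(question, answer):
    """Stand-in for a trained scoring model: similarity to the closest verified answer.

    The question is unused in this toy version; a real scoring model would condition on it.
    """
    return max(cosine(embed(answer), embed(ref)) for _, ref in ANNOTATED)

def best_of_n(question, candidates):
    """Option 1: score every generated candidate and keep the highest-scoring one."""
    return max(candidates, key=lambda c: reward(question, c))

def enrich_prompt(question, k=1):
    """Option 2: retrieve the k most similar annotated examples and prepend them to the prompt."""
    ranked = sorted(ANNOTATED, key=lambda qa: cosine(embed(question), embed(qa[0])), reverse=True)
    examples = "\n".join(f"Q: {q}\nA: {a}" for q, a in ranked[:k])
    return f"{examples}\n\nQ: {question}\nA:"
```

For option 1 I would call `best_of_n(question, candidates)` on the list of generated responses; for option 2 I would send the output of `enrich_prompt(question)` to ChatGPT instead of the bare question.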

Has anyone tried this?


Hi @dominique.lahaix

I think you will find the recent MS Build session by Andrej Karpathy, "State of GPT", helpful.


Just FYI, it’s RLHF.

Using the correct acronym will help ensure you get better results.


Oops - should use ChatGPT for typos :wink: