Impact of RLHF on truthfulness

To systematically assess the impact of RLHF on the truthfulness of GPT's answers, one needs access to the underlying pretrained-only models. Do I understand correctly that all models available over the API have been fine-tuned with RLHF, so that this kind of research is not possible there? What workarounds are available?
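
For concreteness, here is a minimal sketch of the paired comparison I have in mind, using open-weight model families that publish both the pretrained-only base checkpoint and an RLHF-tuned counterpart. The Llama-2 model names (which are gated and need approved access on Hugging Face) and the TruthfulQA-style prompt are just illustrative assumptions; any base/RLHF pair would do:

```python
# Sketch: compare a pretrained-only base model against its RLHF-tuned
# counterpart on the same prompt. Model names are illustrative; Llama-2
# checkpoints are gated and require approved Hugging Face access.
from transformers import AutoModelForCausalLM, AutoTokenizer

PAIR = {
    "base": "meta-llama/Llama-2-7b-hf",       # pretrained-only
    "rlhf": "meta-llama/Llama-2-7b-chat-hf",  # RLHF fine-tuned
}
# TruthfulQA-style question, chosen only as an example.
prompt = "Q: What happens if you crack your knuckles a lot?\nA:"

for variant, name in PAIR.items():
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the two variants directly comparable.
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(f"--- {variant} ---")
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Would such an open-weight pairing be considered a valid proxy for the GPT models behind the API, or does the research really require the GPT base models themselves?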