Find answers from the community

Updated 4 months ago

Tech question: How can we do RLHF on top of GPT-4?

At a glance
Tech question: How can we do RLHF on top of GPT-4?

We are using GPT-4 to build a chatbot that answers product-related questions on e-commerce websites. No matter how much prompt engineering we do, some GPT-4 responses are superb while others have room for improvement. We would love to use RLHF or a similar technique to teach GPT-4 from human feedback on the quality of its responses.

Additional info: We are already using fine-tuned models. Ideally, we would like to turn this human feedback into a continuous learning loop.
1 comment
I think only gpt-3.5 can be fine-tuned right now. And it's not RLHF, just regular supervised completion training on the examples you provide.
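Following up on that: one practical way to approximate learning from human feedback today is to filter your logged conversations by user rating and fine-tune on only the well-rated ones. Below is a minimal sketch, assuming you log each conversation alongside a thumbs-up/thumbs-down rating (the `logs` structure and `build_sft_file` helper are illustrative, not any library's API); it writes the `{"messages": [...]}` JSONL chat format that OpenAI's fine-tuning endpoint accepts.

```python
import json

# Hypothetical logged conversations with human ratings collected in the
# product UI. This structure is an assumption for the sketch, not an API.
logs = [
    {
        "messages": [
            {"role": "system", "content": "You answer product questions for an e-commerce site."},
            {"role": "user", "content": "Does this jacket run small?"},
            {"role": "assistant", "content": "Reviewers report it runs about half a size small; consider sizing up."},
        ],
        "rating": "thumbs_up",
    },
    {
        "messages": [
            {"role": "system", "content": "You answer product questions for an e-commerce site."},
            {"role": "user", "content": "Is this phone waterproof?"},
            {"role": "assistant", "content": "I'm not sure."},
        ],
        "rating": "thumbs_down",
    },
]

def build_sft_file(logs, path):
    """Keep only positively rated conversations and write them as JSONL,
    one {"messages": [...]} object per line (the chat fine-tuning format)."""
    kept = 0
    with open(path, "w") as f:
        for entry in logs:
            if entry["rating"] == "thumbs_up":
                f.write(json.dumps({"messages": entry["messages"]}) + "\n")
                kept += 1
    return kept

kept = build_sft_file(logs, "train.jsonl")
```

You would then upload `train.jsonl` and start a fine-tuning job on a gpt-3.5 model; re-running this export periodically as new ratings come in gives you a crude but workable continuous-improvement loop.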