
nalin
Joined September 25, 2024
Tech question: How can we do RLHF on top of GPT4?

We are using GPT4 to build a chatbot that answers product-related questions on e-commerce websites. No matter how much prompt engineering we do, some GPT4 responses are superb while others have room for improvement. We would love to use RLHF or a similar technique to teach GPT4 from human feedback on the quality of its responses.

Additional info: We are already using fine-tuned models. Ideally, we would like to turn this human feedback into a continuous learning loop.
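
To make the goal concrete, here is a rough sketch in Python of how human ratings on chatbot responses might be turned into preference pairs, the kind of data that reward-model training and similar preference-based methods consume. The `Feedback` record, the 1-5 rating scale, the `min_gap` threshold, and `build_preference_pairs` are all hypothetical names chosen for illustration, not part of any existing API, and the sketch makes no claim about whether GPT4 itself can be trained this way.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Feedback:
    """One human judgment on one chatbot response (hypothetical schema)."""
    prompt: str
    response: str
    rating: int  # assumed 1-5 score given by a human reviewer


def build_preference_pairs(records, min_gap=1):
    """Turn rated responses into (prompt, chosen, rejected) pairs.

    Two responses to the same prompt form a pair only if their ratings
    differ by at least `min_gap`, so ties carry no preference signal.
    """
    by_prompt = defaultdict(list)
    for record in records:
        by_prompt[record.prompt].append(record)

    pairs = []
    for prompt, group in by_prompt.items():
        for a, b in combinations(group, 2):
            if abs(a.rating - b.rating) < min_gap:
                continue  # skip ties and near-ties
            chosen, rejected = (a, b) if a.rating > b.rating else (b, a)
            pairs.append({
                "prompt": prompt,
                "chosen": chosen.response,
                "rejected": rejected.response,
            })
    return pairs


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    log = [
        Feedback("Is this jacket waterproof?", "Yes, it has a sealed shell.", 5),
        Feedback("Is this jacket waterproof?", "I am not sure.", 2),
    ]
    for pair in build_preference_pairs(log):
        print(pair)
```

Running something like this on a schedule over newly collected ratings is one way the "continuous" part could work: each batch of fresh pairs would feed the next round of training.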
1 comment