Suggestion for the reinforcement-learning-from-feedback flow (there is a significant flaw in the API)

Hey! Love your product :slight_smile:

There's one thing about the API that could be improved, speaking as a senior developer.

If I understand correctly, you use the user's feedback on a regenerated response to adjust the model.

However, there is one significant problem.

You ask for that feedback even when the response simply failed and was regenerated because of a network error. That doesn't serve your purpose!

There are two ways a person might interpret the question "Was this response better or worse?":

  1. Better in terms of response time
  2. Better in terms of content

Because you ask that question even when the response failed due to a network issue, users are nudged toward the first interpretation. I don't think you should write that kind of feedback to your DB at all; it introduces significant bias…

There are two things you could do to alleviate this:

  1. Don't ask about response quality after a regeneration caused by a network failure
  2. Explicitly state that you are asking about the QUALITY OF THE CONTENT
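To make option 1 concrete, here's a minimal sketch of how the filter could look, assuming your backend can tag each regeneration with its cause. All names here (`RegenReason`, `FeedbackEvent`, `should_record_feedback`) are hypothetical; I obviously don't know your actual schema:

```python
from dataclasses import dataclass
from enum import Enum, auto

class RegenReason(Enum):
    USER_REQUESTED = auto()   # user explicitly asked for a new response
    NETWORK_ERROR = auto()    # automatic retry after a failed request

@dataclass
class FeedbackEvent:
    response_id: str
    is_better: bool
    regen_reason: RegenReason

def should_record_feedback(event: FeedbackEvent) -> bool:
    # Only keep feedback on responses the user deliberately regenerated;
    # feedback after an automatic network retry mostly reflects latency,
    # not content quality, so writing it to the DB would bias training.
    return event.regen_reason is RegenReason.USER_REQUESTED

events = [
    FeedbackEvent("r1", True, RegenReason.USER_REQUESTED),
    FeedbackEvent("r2", True, RegenReason.NETWORK_ERROR),
]
kept = [e for e in events if should_record_feedback(e)]
print([e.response_id for e in kept])  # only "r1" survives the filter
```

Even better would be to never show the feedback prompt in the network-error case, so the signal is clean at the source rather than filtered after the fact.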

Hope that helps :slight_smile:
I have other suggestions, hmu if you're interested. Hope this section is actually monitored…