I am using gpt-3.5-turbo model and trying to get feedback score for given conversation. This is my request body:
{“model”:“gpt-3.5-turbo”,“messages”:[{“role”:“system”,“content”:“You will be provided with meeting notes, and your task is to summarize the meeting and give feedback score between 1 and 10.”},{“role”:“user”,“content”:“…”}]}
It works fine and retrieves particular feedback score with explanation text but i want to catch exact feedback score each time and save it to db. After a certain period of time i want to find average for it. How can i be sure that each time it will retrieve feedback score properly? Because when i send request with same request body it returns different feedback scores(f.e 8,6,7). Thank you for your answers
Your idea of “feedback score” doesn’t mean much to me, and the AI likely won’t know how to provide such a score.
Because of that, you basically get a roll of the dice because of the impossibility to determine a number.
In using the API, you should set a temperature to 0.1 if you want a deterministic result, even if the number is rather meaningless.
To improve the quality, you should describe the evaluation criteria.
“Rank the meeting notes I have provided on a scale of 1-10 in the areas of [clarity, language quality, spelling, grammar] in a table.”
When I substitute your own post as the data that prompt will evaluate:
To rank the provided text on a scale of 1-10 in the areas of clarity, language quality, spelling, and grammar, we’ll assess each aspect individually:
Aspect Rank (1-10) Comments Clarity 7 The text conveys its main points effectively and provides clear instructions and context. However, it could be slightly improved with more explicit details and formatting. Language Quality 8 The language used is generally proficient and coherent. It effectively communicates the user’s intentions and concerns. Some minor improvements could be made to enhance the overall fluency. Spelling 9 No spelling errors were identified in the text. The spelling is accurate and error-free. Grammar 8 The grammar is mostly correct, with well-formed sentences and proper sentence structure. There are a few areas where the phrasing could be refined to improve the grammar slightly. Overall, the text demonstrates good clarity, language quality, spelling, and grammar. It effectively communicates the user’s purpose and concerns related to using the GPT-3.5-turbo model to obtain feedback scores for conversations. With minor improvements, the text could achieve an even higher ranking in all areas.