In the past, OAI has called for examples of ‘degraded performance’ or reduced usefulness, and I previously shared one of the benchmark questions for the ExamenChat plugin: GPT-4 has been severely downgraded (topic curation) - #214 by yhavinga
Today I’d like to share two new data points:
- On Nov 8, the conversation using the plugin: https://chat.openai.com/share/a599402b-ca0d-4de5-b317-d555752a9ee3
- On Nov 14, the conversation through a GPT with an action: https://chat.openai.com/share/77d4cd3f-367c-464d-a39a-0640949b0fe6
TL;DR: highly disparate experience! The plugin conversation produced one of the most helpful and complete answers I’ve ever seen, and the GPT action conversation one of the worst.
Quality of the last message:
The plugin version shows how to perform the correct calculation, identifies what I missed in my calculation, and then explains what I should have done to arrive at the correct answer. Since my calculation used a different formula, I am very impressed that the model was able to pinpoint the omission in my otherwise correct method. I regard this as the same quality a real tutor would provide.
The GPT action version… where to start. It shows only part of the official answer: the resulting numbers, but not the formula(s) used. It then compares its numbers with my result, finds that they differ, and states that there appears to be a mistake in my calculation because the final answer differs. For a student this is a waste of time, as it is stating the obvious.
Installing plugins may be difficult for the average user; I cannot judge this. But once they are installed, you enable one or two plugins and you’re basically good to go, and using the plugin works like a charm. It is also possible to enable multiple plugins in a conversation. Normally I use ExamenChat and Wolfram enabled together, so I can just ask ‘calculate … with wolfram’.
Getting started with the GPT action version is perhaps easier in some parts, but annoying in others. Easy: just open the shared GPT and you can start. No installing a plugin etc.
Also the conversation-starters are a big plus!!
But now the annoying part starts. Only after I ask my first question am I required to log in. After a successful login, the conversation could go on to answer my question, but this doesn’t happen. I suspect that at this point users will be a bit puzzled or annoyed. I hope this is ironed out in future updates. Hitting the refresh message button solves this, and the conversation proceeds as expected. But then… since I’ve hit ‘refresh’, once the message is finished I am asked whether this response was better.
Next question. Then I am asked if I am okay with this:
A question like this keeps popping up, and it hinders a seamless user experience. It is good to protect users, but isn’t there a way to have plugins/actions verified as safe? Or to present users with an option like ‘do not ask this again for this plugin’?
As for speed, it is a bit slow today.
For the curious, the GPT action variant is available at this link: https://chat.openai.com/g/g-x6TDvxscV-examenchat