Well, seeing no “official” response to the functionality regression above, I really don’t get it.
- You guys had a well-working interface that naturally caught the failures of your voice recognition models by letting users edit the text before sending it to the conversation.
- You wanted to push it further to “save a second or two” in the UI and removed the editing capability.
- After the complaints about the removed functionality, you added a setting to enable/disable instant voice messaging in the app UI. That was basically the ideal compromise between native speakers/general-subject users and non-native speakers/tech-jargon users: the former could keep using instant send, while the latter could disable the feature and edit their messages before sending.
- But for whatever reason, someone later decided to remove that setting, spending extra time only to degrade the app and its UI…
- Now people like me, who are not native English speakers (and who use heavy tech jargon in chat interactions), get too frustrated to use the phone app, because it simply does not make sense to spend extra time correcting speech recognition errors after sending when they could have been corrected before sending the message.
Was it to “limit” usage of the ChatGPT app because we spend way too many tokens on the plan?
I don’t think it saves tokens; if anything, the process is more wasteful, because we have to go through more iterations to get what we need. Or, like me, users will simply use the web UI instead of the phone app.
So the results:
- The OpenAI team spent extra time developing/refactoring an app feature that was already working nicely and that compensated for the weaknesses of voice recognition by letting users do the correction work when needed.
- A fraction of users gets more frustrated because the app no longer works as before, nor as expected (i.e., like the web UI), and they also start paying more attention to the weak side of your voice-to-text models.
My question is simple: Why? Why all those iterations to end up with a result worse than before?
Additional question: why not keep the editing capability and use the user corrections to improve the voice recognition models? Investing some extra work there would improve the overall quality of both the models and the app at the same time. Wouldn’t that be more efficient?
P.S. I just had to say that out loud.