We are constantly testing alternative models; for example, our apps use local LLMs (on a PC or on our cloud servers) for less demanding tasks. We are currently experimenting with the sampling parameters (temperature, top_p), making sure that max_tokens is higher than the number of tokens in the AI response, and trying to optimize the prompt. So far, however, we have not been able to generate responses with the gpt-4.5 API that are qualitatively similar to those of the ChatGPT app in o1 pro mode. Today we had access to the GPT-4.5 model via the ChatGPT app for the first time, and the results were comparable!
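For context, here is a minimal sketch of the kind of call we are tuning (the model name and parameter values are placeholders, not our actual settings):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.5-preview",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "..."},  # our optimized prompt goes here
    ],
    temperature=0.7,  # placeholder; we sweep this
    top_p=0.9,        # placeholder; we sweep this
    max_tokens=4096,  # set higher than the expected response length
)
print(response.choices[0].message.content)
```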
It seems that the app in o1 pro mode uses the backend very differently than we do (chain of thought, tree of thought, …?). The results are many times better than those of plain API calls. Does anyone know more about how reasoning works in the o1 pro model in the app?
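To illustrate what we mean: on our side we could only approximate this with something like the following two-pass chain-of-thought wrapper. This is pure speculation about what the app might be doing internally; the prompts, helper names, and model name are made up:

```python
from openai import OpenAI

client = OpenAI()

def call(messages, **kwargs):
    # Hypothetical helper: one plain chat-completions API call.
    resp = client.chat.completions.create(
        model="gpt-4.5-preview", messages=messages, **kwargs  # placeholder model
    )
    return resp.choices[0].message.content

def two_pass_answer(question: str) -> str:
    # Pass 1: have the model reason step by step (chain of thought).
    reasoning = call([
        {"role": "user",
         "content": f"Think step by step about this problem, but do not "
                    f"give the final answer yet:\n{question}"}
    ], temperature=0.7)
    # Pass 2: have the model critique its own reasoning, then answer.
    return call([
        {"role": "user",
         "content": f"Problem:\n{question}\n\nDraft reasoning:\n{reasoning}\n\n"
                    f"Check the reasoning for mistakes, then give a final answer."}
    ], temperature=0.2)
```

Even with loops like this, our results do not come close to what the app produces in o1 pro mode, which is why we suspect something more is happening server-side.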