Huge difference between ChatGPT assistant and API assistant

I have used ChatGPT for prototyping an assistant that can take a user natural language prompt and translate it into URL filters.
It works very nicely as a GPT in the Chat interface, but setting it up in Playground, with identical instructions and same document to retrieve information about URL parameters, produces responses that are wrong and vastly slower than the mockup in ChatGPT.

In playground, it does not seem to check the document for instructions and just makes guesses about the URL structure.

Are there any key differences between assistants in chat and assistants in playground, that should be taken into account, when setting them up?
Or are GPU resources just throttled for API assistants atm?

2 Likes

I am also facing the same issue wherein I am getting different accuracy from Assistant API than ChatGPT interface for the same task with same file & instructions. Anyone has any idea why this difference is there?

It is not an issue. It is expected.

ChatGPT is a realized product, with its own model, its own internal prompting, and its own non-replicable tools.

The API is a development platform.

You won’t have your own product invoking a special tool injection to then tell people where to vote, for example, but ChatGPT does. You can’t receive custom “tool” output (only functions) and you haven’t written tons of anti-infringement text for the AI, but that’s what DALL-E in ChatGPT does.

Files work differently, vector store search parameters are proprietary, some text and filenames is extracted and placed into the ChatGPT context automatically.

Different input means different output.

You have a chatgpt-4o-latest for experimentation on the API, without tools or other capabilities. You can place all the input context text you can extract from ChatGPT and just place it as fake system message descriptions that cannot be invoked. Then you still don’t know the sampling parameters or what OpenAI does dynamically to the real ChatGPT efficiencies.

This is no concern for those who aren’t trying to copy someone else’s $1B revenue product and offering no value addition.