Retrieval with GPT-3.5 Turbo

Hello everyone, for the last two days I've been running into a lot of issues creating assistants with gpt-3.5-turbo.
Whenever I try retrieval (on small documents) with this model, the run always fails.
I have a very basic, simple prompt that retrieves info in case the user asks about a topic.
If I switch to gpt-4, there are no errors and the retrieval operation succeeds.
In your opinion, is gpt-3.5-turbo simply not capable of this? I don't think so, because I've always used it, even with larger contexts, and gotten good output. Maybe there are some issues because the Assistants API is in beta? Thanks in advance if someone can help!


I’m eager to see replies to this. I always use GPT-4 Turbo for the Assistants API, which with longer messages can get extremely expensive (around $2 USD per conversation in my use case).

I suspect that you will need to improve the file format to reduce the size of the context. This will help GPT-3.5 Turbo be more precise, as it has less wiggle room to make a mistake.

Thanks, Diego, for the reply. Yes, as you said, gpt-4-turbo is not sustainable and is extremely expensive for retrieval assistants; that's why I must use gpt-3.5-turbo.
With this model I create much simpler prompts and don't work with large contexts or files, just one or two pages of basic info in .txt, which is fine for my use case.
But the problem is simply that the run errors out and returns a failed run. So I don't know, maybe it's a current scalability problem with this API and we have to wait until the beta phase ends. I'm also wondering if there is a timeline for that.

I have been testing retrieval outputs between GPT-3.5 and GPT-4 and have found significantly more accurate responses using 4. When 4 came out, it was noted to be much more capable of following multi-step instructions than 3.5.

retrieval error
The issue is that this is what happens whenever the assistant is expected to do retrieval with gpt-3.5-turbo, and the error code in the logs is “server error”.

Just a heads up to anyone using API retrieval: there’s a new alternative that can be more affordable.

Pinecone is a much easier alternative with lots of great, user-friendly documentation, and it just came out with its Serverless option that costs only $0.33/GB/month (plus limits counted in the millions, which are effectively negligible).

(I am not affiliated)