gpt-4-turbo responses are taking too long

I have created a chatbot using the Assistants API with retrieval and function calling as tools (model: gpt-4-turbo), but each response takes more than 30 seconds, which is not feasible for production use.

To avoid this, I downgraded to gpt-3.5-turbo-1106, but the responses are horrible.

How can I tackle this issue? Any suggestions, please?

The solution is: do NOT use the Assistants API in production. It is in beta and nowhere close to being production-ready. Yet.

It’s actually kind of hilarious how quickly we go from “This is the greatest innovation ever made, it will fundamentally change how we work and interact with the world.” straight into “This thing is broken and too dang slow. OpenAI is ripping off their customers.”

I’m not sure who you mean by “we”. If you want my opinion, which hasn’t changed much over the last couple of years: we are on the verge of tremendous disruption of everything and everyone. That doesn’t mean the specific tech, or the tech overall, is there yet. What we have is ALREADY doing something we would have considered magic just a few years back. But now “we” (whoever this is) are taking it for granted.

I completely agree. When I said “we” I was referring to the general public/users. People who get GPT Plus and then complain because it can’t read their minds and instantly know what to do.

If everything stopped and no new models were developed the current tools would still completely change the world. I will admit, I wasn’t paying much attention until ChatGPT came out. That was the thing that opened my eyes to the possibilities.

I was more trying to point out how quickly people go from awestruck to dissatisfied. Human nature is a funny thing.

I wouldn’t be too harsh here :) Aren’t you also stressed out after working on some prompt or LLM function for a few hours, only to realize it doesn’t provide stable or fast results? I am :)

Well… yes. I have one that is meant to take the audio transcription from Whisper and convert it into JSON that can be reviewed by a human and then uploaded directly to an SQL database.

I’m struggling a bit with getting it to consistently choose the supplier and destination from a provided list. This is something I’m making to augment a PO system we just put in place, but it would be the first large-scale AI implementation in our company. No pressure.
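For anyone hitting the same list-matching problem, one possible approach (a minimal sketch, not what I have actually built) is to stop relying on the model to reproduce list entries exactly and instead snap its output to the closest allowed value after the fact. The supplier and destination names, the field names, and the 0.6 similarity cutoff below are all made-up assumptions for illustration; only `json` and `difflib` from the Python standard library are used.

```python
import difflib
import json

# Hypothetical allowed values; in practice these would come from the PO system.
ALLOWED_SUPPLIERS = ["Acme Steel", "Northwind Traders", "Globex Logistics"]
ALLOWED_DESTINATIONS = ["Warehouse A", "Warehouse B", "Main Office"]


def snap_to_allowed(value, allowed, cutoff=0.6):
    """Return the closest allowed entry for value, or None if nothing is close enough."""
    if not isinstance(value, str):
        return None
    # Compare case-insensitively, but return the canonical spelling.
    lowered = {item.lower(): item for item in allowed}
    matches = difflib.get_close_matches(
        value.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


def validate_order(raw_json):
    """Parse the model's JSON output and flag fields that need human review."""
    record = json.loads(raw_json)
    needs_review = []
    for field, allowed in (
        ("supplier", ALLOWED_SUPPLIERS),
        ("destination", ALLOWED_DESTINATIONS),
    ):
        snapped = snap_to_allowed(record.get(field), allowed)
        if snapped is None:
            # No allowed value is similar enough; leave it for the reviewer.
            needs_review.append(field)
        record[field] = snapped
    record["needs_review"] = needs_review
    return record
```

Calling `validate_order` on the model's raw JSON string gives back a record whose `supplier` and `destination` are either canonical list entries or `None`, plus a `needs_review` list the human reviewer can check before anything goes to the database. The cutoff would need tuning against real transcripts.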

How about yourself? What are you working on?

We can discuss this in DM, as it would be off topic here. My point was: do you still get disappointed in the technology in such cases? I doubt it :) We are very early, and such things happen more often than we would like, but this is a normal stage in the infancy of any technology (the only difference is the speed).

Those aren’t mutually exclusive.

Yep, you get kicked in the teeth a lot, but what is possible and the tools I have built and now use regularly blow my mind. “Not in my wildest dreams” stuff.

Stability and consistency will come.
