Sometimes 1106 preview is slow

Hey guys! I’m trying to use Azure OpenAI’s latest model (1106-Preview) with openai Python library version 1.3.5, and I’ve found that the API is sometimes very slow (30-40s). Do I need to set some parameter in the chat method, or is this an issue with the model itself? Thanks. The old version (0613) worked fine and didn’t have this issue.
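Not a fix, but one way to narrow it down is to time each request and see whether the latency is consistent or only occasional. A minimal sketch, with a stub (`slow_model_call`, a hypothetical name) standing in for the real `client.chat.completions.create(...)` call so it runs without an API key:

```python
import time

def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = fn(*args, **kwargs)
    return result, time.monotonic() - start

def slow_model_call(prompt):
    # Stub standing in for client.chat.completions.create(...);
    # the sleep simulates model latency.
    time.sleep(0.05)
    return f"echo: {prompt}"

result, elapsed = timed_call(slow_model_call, "hello")
print(result, round(elapsed, 2))
```

Logging `elapsed` per request over a day or so should show whether the 30-40s responses are random spikes or tied to specific prompts.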

The model is not production ready. You might consider whether OpenAI is deliberately making some interactions randomly very slow, or failing them, just to discourage its use as a production replacement for now.

The model also often takes time before it begins returning tokens. With the new finish_reason of “content_filter” now available, I suspect the choice of whether to send input to the moderations endpoint has been taken away from you: you now get “I’m sorry, I can’t do that” refusals that are unlikely to come from the AI model itself, unless it was specifically trained to be an a-hole. That could be another source of dependency and delay.
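For reference, a small sketch of how you might detect such filtered responses in client code. The `Choice` dataclass below is a hand-rolled stand-in for the choice objects the openai Python 1.x client returns (which expose `finish_reason` the same way), not the library’s actual class:

```python
from dataclasses import dataclass

# Stand-in for a choice object from chat.completions.create;
# the real client exposes choice.finish_reason similarly.
@dataclass
class Choice:
    finish_reason: str  # e.g. "stop", "length", or "content_filter"
    content: str

def was_content_filtered(choice: Choice) -> bool:
    """True when the response was cut off by the content filter."""
    return choice.finish_reason == "content_filter"

print(was_content_filtered(Choice("content_filter", "")))  # True
print(was_content_filtered(Choice("stop", "Hello!")))      # False
```

Checking `finish_reason` this way at least lets you separate filter-induced refusals from ordinary slow completions.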


gpt-3.5-turbo-1106 is also sometimes very slow, but gpt-3.5-turbo and gpt-3.5-turbo-16k-0613 are normal. Has anyone else encountered this?

Got it. Thanks for your reply!