You can use this client, implemented with asyncio and httpx. It supports fine-grained connect/read timeout settings and connection reuse.
```python
import os

from httpx import Timeout
from openai_async_client import (
    AsyncCreate,
    ChatCompletionRequest,
    Message,
    TextCompletionRequest,
)

# Chat completion with granular connect/read timeouts and retries.
create = AsyncCreate(api_key=os.environ["OPENAI_API_KEY"])
messages = [
    Message(
        role="user",
        content="ChatGPT, give a brief overview of Pride and Prejudice by Jane Austen.",
    )
]
response = create.completion(
    ChatCompletionRequest(prompt=messages),
    client_timeout=Timeout(1.0, read=10.0),
    retries=3,
)

# Text completion (the API key is read from the OPENAI_API_KEY environment variable).
create = AsyncCreate()
response = create.completion(
    TextCompletionRequest(prompt="DaVinci, give a brief overview of Moby Dick by Herman Melville.")
)
```
Granular timeouts have their place, but for a general use case they make no sense, especially in the context of an OpenAI request.
I’d say it’s like recommending sports car parts to someone who just wants to fix their Honda Civic (fantastic car btw).
The easiest way is to add the request_timeout parameter; it gets passed through to requests.post(timeout=...).
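For reference, a minimal sketch of that, assuming the pre-1.0 `openai` Python library (the import is kept inside the function so the snippet loads even without the package installed; the model name and default timeout are placeholders):

```python
def ask_with_timeout(prompt: str, timeout_s: float = 10.0):
    """Hard-cap a chat completion call at timeout_s seconds.

    The pre-1.0 openai library forwards request_timeout to the
    underlying HTTP request (timeout=...), so a hung connection
    fails fast instead of blocking forever.
    """
    import openai  # assumes the pre-1.0 `openai` package

    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        request_timeout=timeout_s,
    )
```

When the timeout fires, the call raises an exception, which is exactly what a retry wrapper can catch.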
This is a great way.
I’d just like to add that a retry / backoff library is also a great option. In the event that a timeout, or some sort of intermittent error occurs, it will automatically retry using dynamic intervals.
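Libraries like tenacity or backoff give you this out of the box; as a sketch of the idea without any dependency, here is a small decorator with exponential backoff and jitter (all numbers are illustrative):

```python
import random
import time
from functools import wraps


def retry(max_attempts: int = 3, base_delay: float = 1.0):
    """Retry a function on any exception, with exponential backoff and jitter."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: let the error propagate
                    # Sleep base, 2*base, 4*base, ... plus proportional jitter.
                    time.sleep(base_delay * 2 ** (attempt - 1)
                               + random.random() * base_delay)
        return wrapper
    return decorator
```

You would decorate the function that performs the API call, e.g. `@retry(max_attempts=5, base_delay=1.0)`, so timeouts and intermittent errors are retried automatically.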
This worked for me, combined with Ronald’s suggestion to add retries on top of it. Thanks!
Yes, request_timeout is very important. Most people explain how to write a retry decorator, but that does not solve the problem of slow responses; setting this parameter helps a lot.
Meanwhile, a parallel approach such as pool.apply_async() helps as long as you have not yet reached your account’s RPM limit.
In conclusion, a retry decorator + the request_timeout parameter + a parallel method will accelerate your ChatGPT application.
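Putting the three together, a hedged sketch: `call_chatgpt` below is a stand-in for your actual API call (in practice it would pass request_timeout and sit behind a retry decorator), and the fan-out uses pool.apply_async() as suggested above:

```python
from multiprocessing.pool import ThreadPool


def call_chatgpt(prompt: str, request_timeout: float = 10.0) -> str:
    # Stand-in for the real API call; in practice this would invoke
    # openai.ChatCompletion.create(..., request_timeout=request_timeout)
    # wrapped in a retry decorator.
    return f"answer to: {prompt}"


prompts = ["question 1", "question 2", "question 3"]

# Fan the prompts out in parallel; keep the pool small enough to
# stay under your account's RPM limit.
with ThreadPool(processes=3) as pool:
    results = [pool.apply_async(call_chatgpt, (p,)) for p in prompts]
    answers = [r.get(timeout=30) for r in results]  # hard cap per request
```

The `timeout=30` on `.get()` is a second safety net at the pool level, on top of the per-request timeout inside the call itself.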
This worked perfectly for me. Easy and straightforward.