You can use this client, which is implemented with asyncio and httpx.
It supports fine-grained connect/read timeout settings and connection reuse.
import os

from httpx import Timeout
from openai_async_client import AsyncCreate, ChatCompletionRequest, Message, TextCompletionRequest

create = AsyncCreate(api_key=os.environ["OPENAI_API_KEY"])

# Chat completion: 1s connect timeout, 10s read timeout, up to 3 retries.
messages = [
    Message(
        role="user",
        content="ChatGPT, give a brief overview of Pride and Prejudice by Jane Austen.",
    )
]
response = create.completion(
    ChatCompletionRequest(prompt=messages),
    client_timeout=Timeout(1.0, read=10.0),
    retries=3,
)

# Text completion; AsyncCreate() with no argument assumes OPENAI_API_KEY is set.
create = AsyncCreate()
response = create.completion(
    TextCompletionRequest(prompt="DaVinci, give a brief overview of Moby Dick by Herman Melville.")
)
I'd just like to add that a retry/backoff library is also a great option. If a timeout or some intermittent error occurs, it will automatically retry with dynamically increasing intervals.
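For example, here is a minimal sketch using the tenacity library together with the pre-1.0 openai SDK (the function name ask, the model, and the backoff bounds are just illustrative choices; the backoff library works similarly):

import openai
from tenacity import retry, stop_after_attempt, wait_exponential

# Retry on failure with exponential backoff (1s, 2s, 4s, ... capped at 30s),
# giving up and re-raising the last error after 5 attempts.
@retry(
    wait=wait_exponential(multiplier=1, min=1, max=30),
    stop=stop_after_attempt(5),
    reraise=True,
)
def ask(prompt: str) -> str:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]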
Yes, request_timeout is very important. Most answers explain how to write a retry decorator, but retries alone cannot solve the problem of slow responses; setting this parameter helps a lot.
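For instance, with the pre-1.0 openai SDK, which accepts request_timeout as a per-call keyword argument (a rough sketch; the model and prompt are placeholders):

import openai

# Fail fast instead of hanging on a slow response; combine this with a
# retry decorator so timed-out calls are reattempted.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    request_timeout=10,  # seconds
)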
Meanwhile, using a parallel method helps when you have not yet reached the RPM limit of your account, e.g. pool.apply_async(), as sketched below.
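A minimal sketch with multiprocessing.pool.ThreadPool, assuming ask is a retry-wrapped request function like the hypothetical one above:

from multiprocessing.pool import ThreadPool

prompts = ["Summarize Hamlet.", "Summarize Macbeth.", "Summarize Othello."]

# Issue requests in parallel; size the pool so you stay under your RPM limit.
with ThreadPool(processes=4) as pool:
    async_results = [pool.apply_async(ask, (p,)) for p in prompts]
    answers = [r.get() for r in async_results]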
In conclusion, a retry decorator + the request_timeout parameter + a parallel method will accelerate your ChatGPT application.