When we connect to OpenAI using chat.completions.create, we have to make this call for every single response:
const response: any = await openAi.chat.completions.create({
  model: OPENAI_MODEL,
  store: false,
  messages: conversationHistory,
  stream: true,
  stream_options: { include_usage: true },
});
This call alone takes anywhere between 500 and 800 ms, which is a lot in real-time conversation use cases.
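For context, here is roughly how I am measuring that number. This is a minimal, self-contained sketch: the model name and the one-message history are placeholders, and the timing covers only the await on create() (i.e. until the stream handle is returned), not the streamed tokens themselves:

import OpenAI from "openai";

const openAi = new OpenAI(); // reads OPENAI_API_KEY from the environment
const OPENAI_MODEL = "gpt-4o"; // placeholder model name
const conversationHistory = [{ role: "user" as const, content: "Hello" }];

const start = performance.now();
const response = await openAi.chat.completions.create({
  model: OPENAI_MODEL,
  store: false,
  messages: conversationHistory,
  stream: true,
  stream_options: { include_usage: true },
});
// Time until the stream object is handed back, before any tokens arrive.
console.log(`create() resolved in ${Math.round(performance.now() - start)} ms`);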
Is there a way to initialize the connection once and reuse it across requests, thereby reducing this OpenAI chat initialization latency?
Something like a persistent socket:
// Imagined API: open the connection once up front...
const response: any = await openAi.chat.completions.create({
  model: OPENAI_MODEL,
  store: false,
  messages: conversationHistory,
  stream: true,
  stream_options: { include_usage: true },
});

// ...then push each new turn over the same connection.
// (send() is made-up pseudocode, not a real method on the SDK's return value.)
response.send(conversationHistory);
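For what it's worth, the v4 Node SDK does let you pass a keep-alive agent via the httpAgent constructor option, which should at least avoid repeating the TCP/TLS handshake on every call. A sketch:

import OpenAI from "openai";
import https from "node:https";

// Reuse one TLS connection across requests instead of reconnecting each time.
const keepAliveAgent = new https.Agent({ keepAlive: true });

const openAi = new OpenAI({ httpAgent: keepAliveAgent });

But that only addresses connection setup, not whatever else happens before create() resolves, so I am still looking for something socket-like.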
Has anyone else found a solution to this?