How to accelerate PHP Legacy Assistant (Vector)

Hello everyone,

I’ve been trying for 4 months to build a chatbot assistant based on ChatGPT, using legacy PHP, as I’m not comfortable with Node.js, REST APIs, and related tools.

I’ve added credits to my account, created an assistant, set up a vector store, and uploaded 4 or 5 JSON files containing products and news content.

I previously posted a message here but never received any replies, so I kept digging and working on it for weeks.

At this point, the assistant correctly uses the tools and functions I defined in the dashboard, and it’s able to extract relevant information from the JSON files.

However, I’m still facing two issues:

  1. Main problem: response time is painfully slow — around 10 to 20 seconds per reply. In the OpenAI sandbox, responses are much faster (about 2 to 4 seconds). What could be causing this slowdown? Is there something I’ve done wrong?
  2. Secondary issue: I still haven’t managed to get streaming responses to work properly.

If anyone has a few minutes to help, I’d be extremely grateful. After four months of effort and research, I’m starting to feel quite discouraged.

Thanks in advance!

My PHP:

<?php
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');
header('Access-Control-Allow-Origin: *');
header('Access-Control-Allow-Methods: POST');
header('Access-Control-Allow-Headers: Content-Type');

	if($_SERVER['REQUEST_METHOD'] === 'OPTIONS') {
		exit(0);
	}

	define('OPENAI_API_KEY', 'xxx');
	define('ASSISTANT_ID',	 'xxx');

	class testChatBot {
		private $apiKey;
		private $assistantId;
		private $vectorStoreId = 'xxx';

		public function __construct() {
			$this->apiKey		 = OPENAI_API_KEY;
			$this->assistantId	 = ASSISTANT_ID;
		}

		public function processMessage($userMessage, $sessionId, $existingThreadId = null) {
			try {
				$threadId = $existingThreadId;

				if(!$threadId) {
					echo "data: " . json_encode(['choices' => [['delta' => ['content' => "🔄 Creating thread..."]]]]). "\n\n";
					flush();

					$threadId = $this->createThread();

					echo "data: " . json_encode(['threadId' => $threadId]) . "\n\n";
					flush();

					echo "data: " . json_encode(['choices' => [['delta' => ['content' => " ✅\n"]]]]). "\n\n";
					flush();
				} else {
					echo "data: " . json_encode(['choices' => [['delta' => ['content' => "🔄 Using existing thread... ✅\n"]]]]). "\n\n";
					flush();
				}

				echo "data: " . json_encode(['choices' => [['delta' => ['content' => "🔄 Adding message..."]]]]). "\n\n";
				flush();

				$this->addMessage($threadId, $userMessage);

				echo "data: " . json_encode(['choices' => [['delta' => ['content' => " ✅\n🔄 Launching assistant...\n\n"]]]]). "\n\n";
				flush();

				$this->runAssistantWithStreaming($threadId);
			} catch (Exception $e) {
				echo "data: " . json_encode(['choices' => [['delta' => ['content' => "❌ Error: " . $e->getMessage()]]]]). "\n\n";
				flush();
			}
		}

		private function createThread() {
			$response = $this->makeRequest('POST', 'https://api.openai.com/v1/threads', [
				'tool_resources' => [
					'file_search' => [
						'vector_store_ids' => [$this->vectorStoreId]
					]
				]
			]);

			return $response['id'];
		}

		private function addMessage($threadId, $message) {
			$this->makeRequest('POST', "https://api.openai.com/v1/threads/{$threadId}/messages", [
				'role' => 'user',
				'content' => $message
			]);
		}

		private function runAssistantWithStreaming($threadId) {
			$url = "https://api.openai.com/v1/threads/{$threadId}/runs";

			$postData = json_encode([
				'assistant_id'	 => $this->assistantId,
				'stream'		 => true
			]);

			$ch = curl_init();

			curl_setopt_array($ch, [
				CURLOPT_URL				 => $url,
				CURLOPT_RETURNTRANSFER	 => false,
				CURLOPT_POST			 => true,
				CURLOPT_POSTFIELDS		 => $postData,
				CURLOPT_HTTPHEADER		 => [
					'Authorization: Bearer ' . $this->apiKey,
					'Content-Type: application/json',
					'OpenAI-Beta: assistants=v2'
				],
				CURLOPT_WRITEFUNCTION	 => [$this, 'handleStreamData'],
				CURLOPT_SSL_VERIFYPEER	 => false,
				CURLOPT_TIMEOUT			 => 60,
				CURLOPT_CONNECTTIMEOUT	 => 10
			]);

			$result = curl_exec($ch);

			if(curl_error($ch)) {
				echo "data: " . json_encode(['choices' => [['delta' => ['content' => "❌ Error curl: " . curl_error($ch)]]]]). "\n\n";
				flush();
			}

			curl_close($ch);
		}

		public function handleStreamData($ch, $data) {
			$lines = explode("\n", $data);

			foreach($lines as $line) {
				$line = trim($line);

				if(strpos($line, 'data: ') === 0) {
					$jsonData = substr($line, 6);

					if($jsonData === '[DONE]') {
						echo "data: [DONE]\n\n";
						flush();
						continue;
					}

					try {
						$parsed = json_decode($jsonData, true);

						if($parsed && isset($parsed['object'])) {
							if($parsed['object'] === 'thread.message.delta') {
								if(isset($parsed['delta']['content'][0]['text']['value'])) {
									$content = $parsed['delta']['content'][0]['text']['value'];
									echo "data: " . json_encode(['choices' => [['delta' => ['content' => $content]]]]) . "\n\n";
									flush();
								}
							}
						}
					} catch (Exception $e) {
						// IGNORE JSON ERROR
					}
				}
			}

			return strlen($data);
		}

		private function makeRequest($method, $url, $data = null) {
			$ch = curl_init();

			curl_setopt_array($ch, [
				CURLOPT_URL				 => $url,
				CURLOPT_RETURNTRANSFER	 => true,
				CURLOPT_HTTPHEADER		 => [
					'Authorization: Bearer ' . $this->apiKey,
					'Content-Type: application/json',
					'OpenAI-Beta: assistants=v2'
				],
				CURLOPT_SSL_VERIFYPEER	 => false
			]);

			if($method === 'POST' && $data) {
				curl_setopt($ch, CURLOPT_POST, true);
				curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
			}

			$response = curl_exec($ch);
			$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

			if(curl_error($ch)) {
				throw new Exception('cURL error: ' . curl_error($ch));
			}

			if($httpCode >= 400) {
				throw new Exception('API error: ' . $response);
			}

			curl_close($ch);

			return json_decode($response, true);
		}
	}

	if($_SERVER['REQUEST_METHOD'] === 'POST') {
		$input = json_decode(file_get_contents('php://input'), true);

		if(!isset($input['message'])) {
			echo "data: " . json_encode(['choices' => [['delta' => ['content' => "Missing message"]]]]) . "\n\n";
			exit;
		}

		$chatBot = new testChatBot();

		$chatBot->processMessage(
			$input['message'],
			$input['sessionId']	 ?? null,
			$input['threadId']	 ?? null
		);
	} else {
		http_response_code(405);
		echo json_encode(['error' => 'Method not allowed']);
	}
?>

And my JS/HTML:

    <div class="chat-container">
        <div class="chat-header">
            <h1>Assistant</h1>
            <button class="reset-button" onclick="resetSession()">New session</button>
        </div>
        
        <div class="session-info" id="sessionInfo">Active session</div>
        
        <div class="chat-messages" id="chatMessages">
            <div class="message assistant">
                <div class="message-content">
                    Hello
                </div>
            </div>
        </div>
        
        <div class="chat-input">
            <textarea id="messageInput" class="input-field" placeholder="Ask your question..."></textarea>
            <button id="sendButton" class="send-button">➤</button>
        </div>
    </div>

<script>
	const chatMessages	 = document.getElementById('chatMessages');
	const messageInput	 = document.getElementById('messageInput');
	const sendButton	 = document.getElementById('sendButton');
	const sessionInfo	 = document.getElementById('sessionInfo');
	
	let isStreaming		 = false;
	let currentThreadId	 = null;

	function generateSessionId() {
		return 'session_' + Math.random().toString(36).substr(2, 9) + '_' + Date.now();
	}

	let sessionId = sessionStorage.getItem('testSessionId');

	if(!sessionId) {
		sessionId = generateSessionId();

		sessionStorage.setItem('testSessionId', sessionId);
	}

	currentThreadId = sessionStorage.getItem('testThreadId');

	sessionInfo.textContent = `Session: ${sessionId.substr(-8)} ${currentThreadId ? '(Active thread)' : '(New thread)'}`;

	messageInput.addEventListener('keypress', function(e) {
		if (e.key === 'Enter' && !e.shiftKey) {
			e.preventDefault();
			sendMessage();
		}
	});

	sendButton.addEventListener('click', sendMessage);

	function resetSession() {
		sessionStorage.removeItem('testSessionId');
		sessionStorage.removeItem('testThreadId');

		location.reload();
	}

	function addMessage(content, isUser = false) {
		const messageDiv		 = document.createElement('div');
		messageDiv.className	 = `message ${isUser ? 'user' : 'assistant'}`;
		const contentDiv		 = document.createElement('div');
		contentDiv.className	 = 'message-content';
		contentDiv.innerHTML	 = content;

		messageDiv.appendChild(contentDiv);
		chatMessages.appendChild(messageDiv);

		chatMessages.scrollTop	 = chatMessages.scrollHeight;

		return contentDiv;
	}

	function showTypingIndicator() {
		const typingDiv		 = document.createElement('div');

		typingDiv.className	 = 'message assistant';
		typingDiv.id		 = 'typing-indicator';

		const typingContent	 = document.createElement('div');

		typingContent.className		 = 'typing-indicator';
		typingContent.style.display	 = 'flex';
		typingContent.innerHTML		 = `<div class="typing-dots"><span></span><span></span><span></span></div>`;

		typingDiv.appendChild(typingContent);
		chatMessages.appendChild(typingDiv);

		chatMessages.scrollTop		 = chatMessages.scrollHeight;
	}

	function hideTypingIndicator() {
		const typingIndicator = document.getElementById('typing-indicator');

		if(typingIndicator) {
			typingIndicator.remove();
		}
	}

	async function sendMessage() {
		const message = messageInput.value.trim();

		if(!message || isStreaming) return;

		addMessage(message, true);
		messageInput.value	 = '';
		isStreaming			 = true;
		sendButton.disabled	 = true;

		showTypingIndicator();

		try {
			const response = await fetch('chat.php', {
				method: 'POST',
				headers: {
					'Content-Type': 'application/json',
				},
				body: JSON.stringify({ 
					message:	 message,
					sessionId:	 sessionId,
					threadId:	 currentThreadId
				})
			});

			hideTypingIndicator();

			if(!response.ok) {
				throw new Error('Network error');
			}

			const reader		 = response.body.getReader();
			const decoder		 = new TextDecoder();
			const responseDiv	 = addMessage('');
			let fullResponse	 = '';

			while(true) {
				const { done, value } = await reader.read();

				if(done) break;

				const chunk = decoder.decode(value, { stream: true });
				const lines = chunk.split('\n');

				for(const line of lines) {
					if(line.startsWith('data: ')) {
						const data = line.slice(6);
				
						if(data === '[DONE]') continue;

						try {
							const parsed = JSON.parse(data);

							if(parsed.threadId && !currentThreadId) {
								currentThreadId = parsed.threadId;
								sessionStorage.setItem('testThreadId', currentThreadId);
								sessionInfo.textContent = `Session: ${sessionId.substr(-8)} (Active thread)`;
							}

							if(parsed.choices && parsed.choices[0].delta.content) {
								fullResponse += parsed.choices[0].delta.content;
								responseDiv.innerHTML = formatMessage(fullResponse);
								chatMessages.scrollTop = chatMessages.scrollHeight;
							}
						} catch (e) {
							console.log('Skipped line:', line);
						}
					}
				}
			}
		} catch (error) {
			hideTypingIndicator();
			addMessage('Sorry, an error occurred. Please try again.');
			console.error('Error:', error);
		} finally {
			isStreaming = false;
			sendButton.disabled = false;
			messageInput.focus();
		}
	}

	function formatMessage(text) {
		return text
			.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
			.replace(/\*(.*?)\*/g, '<em>$1</em>')
			.replace(/\n/g, '<br>');
	}
</script>

Well, I think the location of your server could matter significantly relative to your own location (and to OpenAI's). Some providers let you choose the region where your app is hosted; picking a closer one might help reduce latency.

Based on your feedback from the Assistants API beta, we’ve incorporated key improvements into the Responses API. After we achieve full feature parity, we will announce a deprecation plan later this year, with a target sunset date in the first half of 2026.

One other thing to note is that the Assistants API is being deprecated; I wouldn’t start a new project with it right now. The Responses and Chat Completions APIs are not only the recommended path, they probably also get more priority in server resources, network optimization, bug fixes, new features, etc.
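For comparison, a Responses API call is a single POST rather than the thread/message/run sequence: conversation state is carried by `previous_response_id` instead of a thread. A minimal sketch in JavaScript (the model name is illustrative; check the current API reference for the exact payload fields):

```javascript
// Build the request body for POST https://api.openai.com/v1/responses.
// Conversation state is carried by previous_response_id instead of a thread.
function buildResponsesPayload(message, previousResponseId = null) {
  const payload = {
    model: 'gpt-4o-mini', // illustrative model name
    input: message,
    stream: true,
  };
  if (previousResponseId) {
    payload.previous_response_id = previousResponseId;
  }
  return payload;
}

// Usage (requires a real API key and network access):
// const res = await fetch('https://api.openai.com/v1/responses', {
//   method: 'POST',
//   headers: {
//     'Authorization': 'Bearer ' + OPENAI_API_KEY,
//     'Content-Type': 'application/json',
//   },
//   body: JSON.stringify(buildResponsesPayload('Hello', lastResponseId)),
// });
```

The same body-building logic translates directly to PHP with `json_encode` and the cURL helper you already have.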

Perhaps that is worth considering before investing more time in your streaming feature, given that you will soon have to adjust your endpoint anyway.
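Whichever endpoint you end up streaming from, one pitfall worth fixing in the client loop above: each network chunk is split on newlines independently, so an SSE `data:` line that straddles two chunks is silently dropped. Carrying the trailing partial line over between reads avoids that. A minimal sketch (the class name is illustrative):

```javascript
// Accumulates raw stream chunks and yields only complete SSE "data:" payloads,
// carrying any trailing partial line over to the next chunk.
class SSELineBuffer {
  constructor() {
    this.carry = '';
  }

  // Feed one decoded chunk; returns an array of complete payload strings.
  push(chunk) {
    const text = this.carry + chunk;
    const lines = text.split('\n');
    this.carry = lines.pop(); // last element may be an incomplete line
    return lines
      .filter((line) => line.startsWith('data: '))
      .map((line) => line.slice(6).trim());
  }
}

// Usage inside the read loop:
//   const buf = new SSELineBuffer();
//   for (const data of buf.push(decoder.decode(value, { stream: true }))) {
//     if (data === '[DONE]') continue;
//     const parsed = JSON.parse(data);
//     // ...handle parsed.threadId / parsed.choices as before...
//   }
```

The PHP `handleStreamData` callback has the same issue and would benefit from the same carry-over buffer.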

Occasional spikes in API demand do happen; you might need to run more extensive tests to determine whether the slowdown is continuous or intermittent.
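To tell a continuous problem from intermittent spikes, it helps to log per-request timings over a day and summarize them: a steady slowdown shows up as a high median, while intermittent overload shows a normal median with a high max. A small helper along these lines (purely illustrative):

```javascript
// Summarize latency samples (in milliseconds) so a steady slowdown
// (high median) can be told apart from intermittent spikes
// (normal median, high max).
function latencyStats(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  return {
    min: sorted[0],
    median,
    max: sorted[sorted.length - 1],
  };
}

// Collect a sample around each request, e.g.:
//   const t0 = performance.now();
//   await fetch('chat.php', { /* ... */ });
//   samples.push(performance.now() - t0);
```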
