Request too large for gpt-4o : rate_limit_exceeded

I am trying to use ChatGPT as the chatbot for my Magento 2 website, and I want to pass product data to it. To do this, I collected all the products and stored them in a JSON file, which I then read to embed the data in the systemRoleContent of the system role. However, the issue I am facing is that the JSON file is quite large.

{
    "bot_response": "Error: ChatBot Error: Unexpected API response structure: {\n    \"error\": {\n        \"message\": \"Request too large for gpt-4o on tokens per min (TPM): Limit 30000, Requested 501140. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.\",\n        \"type\": \"tokens\",\n        \"param\": null,\n        \"code\": \"rate_limit_exceeded\"\n    }\n}\n"
}

I noticed that there is a function that needs to be added to the API configuration, which allows you to run a query to select products based on keywords found in their names or descriptions that match keywords in the user’s message. The challenge is that users initially may not know the names of the products; they come to the chatbot to discover them. How can I address this issue?

This is the code that I am working with right now:

<?php

namespace MetaCares\Chatbot\Model;

use Magento\Framework\App\ObjectManager;

class ChatBot
{
    private $authorization;
    private $endpoint;
    private $conversationHistory = [];
    private $productsFile;
    private $fetchingDateFile;
    private $didFetchProducts = false;

    public function __construct()
    {
        $this->authorization = 'sk-proj-';
        $this->endpoint = 'https://api.openai.com/v1/chat/completions';

        $this->productsFile = __DIR__ . '/products.json';
        $this->fetchingDateFile = __DIR__ . '/fetching_date.json';

        $currentTime = time();
        $timeDifferenceSeconds = 24 * 3600;

        if (!file_exists($this->fetchingDateFile)) {
            file_put_contents($this->fetchingDateFile, json_encode(['last_fetch_time' => 0]));
        }

        $fetchingData = json_decode(file_get_contents($this->fetchingDateFile), true);
        $lastFetchTime = $fetchingData['last_fetch_time'] ?? 0;

        if ($currentTime - $lastFetchTime > $timeDifferenceSeconds) {
            $products = $this->fetchProductsUsingModel();
            $productsJson = json_encode($products);
            file_put_contents($this->productsFile, $productsJson);
            $fetchingData['last_fetch_time'] = $currentTime;
            file_put_contents($this->fetchingDateFile, json_encode($fetchingData));
            $this->didFetchProducts = true;
        }

        $jsonSampleData = file_get_contents($this->productsFile);

        $systemRoleContent = <<<EOT
Nom:
    Meta Cares Bot
Description
    BOT Meta Cares répond aux questions sur les produits du site et fournit des conseils santé fiables.
    Tu aides les clients de Meta Cares à faire des choix éclairés tout en offrant un accompagnement personnalisé, sécurisé et adapté à leurs besoins.

catalogue Meta Cares
{$jsonSampleData}

Liste des Sites Référencés :
- PubMed : [https://pubmed.ncbi.nlm.nih.gov/](https://pubmed.ncbi.nlm.nih.gov/)
- ScienceDirect : [https://www.sciencedirect.com/](https://www.sciencedirect.com/)
---
- Génération d’images DALL·E : Désactivée

EOT;

        $this->conversationHistory[] = [
            'role' => 'system',
            'content' => $systemRoleContent
        ];

        if (session_status() == PHP_SESSION_NONE) {
            session_start();
        }
        if (isset($_SESSION['chat_history'])) {
            $this->conversationHistory = $_SESSION['chat_history'];
        }
    }

    public function fetchProductsUsingModel(): array
    {
        return $products;
    }

    private function getCategoryNames(array $categoryIds): array
    {
        return $categoryNames;
    }

    public function sendMessage(string $message): array
    {
        try {
            $this->conversationHistory[] = [
                'role' => 'user',
                'content' => $message
            ];

            $data = [
                'model' => 'gpt-4o',
                'messages' => array_map(function ($msg) {
                    return [
                        'role' => $msg['role'] === 'bot' ? 'assistant' : $msg['role'],
                        'content' => $msg['content']
                    ];
                }, $this->conversationHistory)
            ];

            $response = $this->makeApiRequest($data);
            $arrResult = json_decode($response, true);

            if (json_last_error() !== JSON_ERROR_NONE) {
                throw new \Exception('Invalid API response format');
            }

            if (!isset($arrResult['choices']) || !isset($arrResult['choices'][0]['message']['content'])) {
                throw new \Exception('Unexpected API response structure: ' . $response);
            }

            $assistantResponse = $arrResult['choices'][0]['message']['content'];
            $this->conversationHistory[] = [
                'role' => 'bot',
                'content' => $assistantResponse
            ];

            $_SESSION['chat_history'] = $this->conversationHistory;
            return [
                "conversationHistory" => $_SESSION['chat_history'],
                'didFetchProducts' => $this->didFetchProducts,
                'response' => $assistantResponse,
            ];
        } catch (\Exception $e) {
            throw new \Exception('ChatBot Error: ' . $e->getMessage());
        }
    }

    private function makeApiRequest(array $data): string
    {
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL => $this->endpoint,
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($data),
            CURLOPT_HTTPHEADER => [
                'Content-Type: application/json',
                'Authorization: Bearer ' . $this->authorization,
            ],
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_SSL_VERIFYPEER => false,
            CURLOPT_SSL_VERIFYHOST => 0
        ]);

        $response = curl_exec($ch);

        if (curl_errno($ch)) {
            $error = curl_error($ch);
            curl_close($ch);
            throw new \Exception('API request failed: ' . $error);
        }

        curl_close($ch);
        return $response;
    }
}

A hard-coded API key is bad news. For that matter, so is PHP. DDoS site server: receive free API key.

Whoa!

You’ll need to use some semantic search, where the user input “7 blade razor refills” can be matched with {“product”: “magno-7-extreme”, “description”: “Our premium seven-bladed razor, taking magno-7 refills”}

You can look at the assistants endpoint, and its “file_search” for splitting up text files and providing AI an internal search tool (will deny JSON if you don’t put some text preamble in a txt file). However, it is just as complicated to call in multiple required steps as it would be to implement retrieval search tools yourself in real backend code.

Or, if you need ELI5. You’re charged by amount of text send/received, there are also limits put on this. instead sending huge amounts of text, look into RAGs/vectors. For example, create embeddings with ada on relevant topic, upload them to pinecone. then you create embeddings of the text you want to refer, ask pinecone for top_k (# similiar results) and you send these final query to chatgpt. gpt-4o model can help you with this easy

(post deleted by author)

@kamil00
ok, I searched about it, so I’ll need to hange eveything in my code ?
can I send the data from my website db to the vector db by code or I need to do that manually ?
I don’t have any idea about this RAGs/vectors to be honest

You upload your data to a vector database, such as Pinecone. For example, you generate embeddings for your data using OpenAI’s Ada model and then upload the embeddings online. When a user visits your website and searches for a topic—say, RAGs—you take their query and create an embedding for it (using the same OpenAI model). This query embedding represents the user’s question as a numerical vector.

Next, you query Pinecone (or another vector database) to retrieve the top_k most similar results, such as the top 2. Pinecone responds with the most relevant text from your dataset. Finally, you can use ChatGPT to generate a response to the user’s question, referencing the retrieved text as supporting information.

So if you wanted ask about vectors, you’d upload openai faq to pinecone, then create embedding to your question, find most similar embeddings on pinecone and then you send Q to gpt-4o.