Feeding history back into System input

Someone here suggested trying to concatenate previous prompts/responses back into the next question.

I was doing some cutting and pasting, put a few into the System Input, and found the model remembered the context of the conversation.

I found the total input capacity is 2,048 tokens with all three inputs combined.

So it will be interesting to see if my bot gets smarter and doesn't need so much context in each follow-up question.

2,048 tokens may not be enough to train :steam_locomotive: in any depth, but this seems like the place to do it on the 3.5 models. Perhaps when OpenAI releases training we can get serious about a business-grade model.

Yes :+1: it seems to make a big difference. Try it out: if I ask how far away the Moon :crescent_moon: is, then later just ask "how far?", the smart bot remembers I was talking about the Moon.

Without the concatenated system input, what is the response?

It totally works. I'm not sure how much history I should use; right now I'm keeping just the last prompt/response pair and will add more if necessary. Bing must use all of its 2,048 tokens, which is perhaps why it must dump the conversation so often.
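
The idea, as a rough sketch in code (the labels and wording are just how I happen to format it):

// Sketch: replay the last exchange through the system input so the
// model keeps the thread of the conversation
const lastPrompt = 'How far away is the Moon?';     // previous user prompt
const lastReply = 'About 384,400 km on average.';   // previous model reply

const systemInput = 'Previous prompt: ' + lastPrompt + '\n' +
                    'Previous response: ' + lastReply;
// systemInput then goes into the "system" message of the next request,
// alongside the new user question (e.g. "how far?")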

See what you think: http://arcticfoxltc.com/smart.php

I developed this Node.js script to save request and response history to a JSON file and submit it with each new request. It uses the GPT-4 model, but could be adapted to 3.5. Hope this helps.

// Import required modules
const fs = require('fs');
const axios = require('axios');

// Your OpenAI API key
const apiKey = 'your-openai-api-key';

// Function to interact with OpenAI API
async function interactWithAI(userPrompt) {
    try {
        // Define the message data structure
        let messageData = { 'messages': [] };

        // If requests.json exists, read and parse the file
        if (fs.existsSync('requests.json')) {
            let raw = fs.readFileSync('requests.json');
            messageData = JSON.parse(raw);
        }

        // Format the conversation history and the new user request
        let systemMessage = "Conversation history:\n" + messageData['messages'].map(m => `${m.role} [${m.timestamp}]: ${m.content}`).join("\n");
        let userMessage = "New request: " + userPrompt;

        // Make a POST request to OpenAI's chat API
        let response = await axios({
            method: 'post',
            url: 'https://api.openai.com/v1/chat/completions',
            headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
            data: { 'model': 'gpt-4', 'messages': [ { "role": "system", "content": systemMessage }, { "role": "user", "content": userMessage } ] }
        });

        // Log the AI's response
        console.log(response.data['choices'][0]['message']['content']);

        // Get the current timestamp
        let timestamp = new Date().toISOString();

        // Add the new user request and the AI's response to the message history
        messageData['messages'].push({ "role": "user", "content": userPrompt, "timestamp": timestamp });
        messageData['messages'].push({ "role": "assistant", "content": response.data['choices'][0]['message']['content'], "timestamp": timestamp });

        // Write the updated message history to requests.json
        fs.writeFileSync('requests.json', JSON.stringify(messageData, null, 2));

        // Return the AI's response
        return response.data['choices'][0]['message']['content'];
    } catch (e) {
        // If an error occurred, log it to the console and return an error message
        console.error('An error occurred:', e);
        return 'An error occurred while interacting with the OpenAI API. Please check the console for more details.';
    }
}
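
For example, calling it twice in a row shows the history carrying over (a quick sketch):

// Ask a question, then a follow-up that relies on the saved history
interactWithAI('How far away is the Moon?')
    .then(() => interactWithAI('How long would it take to drive there?'))
    .catch(console.error);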


Chat Completion Architecture 101


Thank you, I appreciate seeing the code. Interesting, it does a whole lot more than my PHP code does. How are the results?

I noticed right away it was easier to talk to and that it remembered the subject and context of the conversation. This is with only the last prompt/response pair replayed back into the system input.

I suspect 4.0 is a lot smarter

That document is gigantic. How did he get GPT-4 to read it, with a plugin? It's too big to feed into the system input.

You’re welcome @jahzwolf1955!

Both models are stateless and need a mechanism for “session memory”.

4.0 is smarter and accepts 8,192 tokens, while 3.5 can only accept 4,096 tokens. Meaning 4.0 can hold more history in memory.

Future development will include logging token usage from each response in order to calculate the max amount of history to submit.

"usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
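
For example, a running total could be kept like this (a sketch; response is the axios response object from the code above):

// Sketch: accumulate token usage across API responses
let totalTokens = 0;

function logUsage(response) {
    const usage = response.data['usage'];
    totalTokens += usage['total_tokens'];
    console.log(`prompt: ${usage['prompt_tokens']}, completion: ${usage['completion_tokens']}, running total: ${totalTokens}`);
}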

That is the whole point of the “chunking” portion of the flowchart.


References to a vector database are completely off the mark for someone who just discovered and had to share that you could tell the API what you were just talking about and it would work better.

Examples of multiple turns of conversation are right in the very first code snippet of "how to use the chat completions API".
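
For reference, multi-turn conversation is just the prior turns passed back in as role messages; roughly this (values illustrative):

// Prior exchanges go back in as user/assistant messages,
// not crammed into the system prompt
const messages = [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "How far away is the Moon?" },
    { "role": "assistant", "content": "About 384,400 km on average." },
    { "role": "user", "content": "How long would light take to travel that far?" }
];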


Wow, you got me digging. I just read all about chunking and vector databases. My head is hurting :sunglasses: I need to walk through this process and see it happen. I'll be back :grin:

You are kind and observant all at the same time. Yeah, my head is swimming, and this is five minutes after seeing the next mountain to climb. I will be at this a while; hopefully you guys will be around when I get stuck. The nice thing about Bing Chat is that I can ask my questions and get examples, and it can walk me through an entire example of chunking the data, getting vectors, then finding the vector to send the right chunk to GPT for an answer.
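
Here is that loop roughly as it was walked through for me, in a sketch (the helper names are my own):

// Sketch of the retrieval loop: chunk the data, embed each chunk,
// embed the question, pick the closest chunk, send it to GPT
function cosineSimilarity(a, b) {
    let dot = 0, na = 0, nb = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// chunks: array of text pieces; chunkVectors: their embeddings;
// questionVector: the embedding of the user's question
function bestChunk(chunks, chunkVectors, questionVector) {
    let bestIndex = 0, bestScore = -Infinity;
    for (let i = 0; i < chunkVectors.length; i++) {
        const score = cosineSimilarity(chunkVectors[i], questionVector);
        if (score > bestScore) { bestScore = score; bestIndex = i; }
    }
    return chunks[bestIndex]; // this chunk goes to the chat model as context
}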

Got another page to build, I see. Thanks for the guidance in the right direction. Cheers :tada:

This updated version actively logs the tokens used for each request and completion. You can use these values to calculate the maximum history that can be sent without exceeding the model’s token limit.

// Import required modules
const fs = require('fs');
const axios = require('axios');

// Your OpenAI API key
const apiKey = 'your-openai-api-key';

// Function to interact with OpenAI API
async function interactWithAI(userPrompt) {
    try {
        // Define the message data structure
        let messageData = { 'messages': [] };

        // If requests.json exists, read and parse the file
        if (fs.existsSync('requests.json')) {
            let raw = fs.readFileSync('requests.json');
            messageData = JSON.parse(raw);
        }

        // Format the conversation history and the new user request
        let systemMessage = "Conversation history:\n" + messageData['messages'].map(m => `${m.role} [${m.timestamp}]: ${m.content}`).join("\n");
        let userMessage = "New request: " + userPrompt;

        // Make a POST request to OpenAI's chat API
        let response = await axios({
            method: 'post',
            url: 'https://api.openai.com/v1/chat/completions',
            headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
            data: { 'model': 'gpt-4', 'messages': [ { "role": "system", "content": systemMessage }, { "role": "user", "content": userMessage } ] }
        });

        // Log the AI's response
        console.log(response.data['choices'][0]['message']['content']);

        // Get the current timestamp
        let timestamp = new Date().toISOString();

        // Add the new user request and the AI's response to the message history
        messageData['messages'].push({ 
            "role": "user", 
            "content": userPrompt, 
            "timestamp": timestamp, 
            "tokens": response.data['usage']['prompt_tokens'] // Include prompt tokens
        });

        messageData['messages'].push({ 
            "role": "assistant", 
            "content": response.data['choices'][0]['message']['content'], 
            "timestamp": timestamp, 
            "tokens": response.data['usage']['completion_tokens'] // Include completion tokens
        });

        // Write the updated message history to requests.json
        fs.writeFileSync('requests.json', JSON.stringify(messageData, null, 2));

        // Return the AI's response
        return response.data['choices'][0]['message']['content'];
    } catch (e) {
        // If an error occurred, log it to the console and return an error message
        console.error('An error occurred:', e);
        return 'An error occurred while interacting with the OpenAI API. Please check the console for more details.';
    }
}

Sample JSON:

{
    "messages": [
        {
            "role": "user",
            "content": "Example request 1",
            "timestamp": "2023-07-25T12:00:00Z",
            "tokens": 9
        },
        {
            "role": "assistant",
            "content": "Example response 1",
            "timestamp": "2023-07-25T12:00:02Z",
            "tokens": 12
        },
        {
            "role": "user",
            "content": "Example request 2",
            "timestamp": "2023-07-25T12:10:00Z",
            "tokens": 10
        },
        {
            "role": "assistant",
            "content": "Example response 2",
            "timestamp": "2023-07-25T12:10:02Z",
            "tokens": 15
        }
    ]
}
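
With those tokens values stored, trimming the history to fit a budget could look something like this (a sketch; the 3,000-token budget is only an example):

// Sketch: walk the history newest-to-oldest, keeping messages
// until the token budget is spent
function trimHistory(messages, budget = 3000) {
    const kept = [];
    let used = 0;
    for (let i = messages.length - 1; i >= 0; i--) {
        used += messages[i].tokens;
        if (used > budget) break;
        kept.unshift(messages[i]);
    }
    return kept;
}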

cc: @jahzwolf1955


Well, it seems GPT-4 is capable of handling bigger raw text inputs, which is good for the average user, but still not enough for lawyers, doctors, or politicians who have gigantic libraries ☺️. Yeah, I will be at this embedding thing next. You have set my path, thank you.

GPT-4-32k can handle 32,768 tokens or roughly 50 pages of text. But there’s a waiting list.

GPT-3.5-turbo-16k-0613 can handle 16,384 tokens.

If your request is too large, try removing stop words - OpenAI doesn’t recognize them anyway.

You could also break up large libraries into smaller pieces and summarize each section.

Claude 2 can now handle 100k, so I’m sure OpenAI is working on higher limits.


That’s not accurate. It is more like “a smart AI can generally figure out what’s being talked about even in grammatically-incorrect sentences.”


Well, I can use the 16K model for now until I can gain access to 4. In the meantime I am whacking and hacking at davinci vector embedding calls, which are lengthy, trying to make sense of their output, then feeding it into a similarity call with equally bizarre results. It has me down the rabbit hole in twisted passages now; I hope I find my way out :sunglasses:

Here I call davinci to get a vector for "the quick brown fox jumped" and I get this response:
"text": " over the image 'https://upload.wikimedia.org/wikipedia/commons/thumb/f/f1/P520055.JPG/512px-P520055.JPG' .\n\nThe output of the example above is:\n\n{ \"image_url\" : \"https://upload.wikimedia.org/wikipedia/commons/thumb/f/f1/P520055.JPG/512px-P520055.JPG\" , \"text_embedding\" : { \"dims\" : [ 512 ], \"dimensions\" : [ 512 ], \"weights\" : { \"dim1024\" : [ 0.5 , 0.5 , 0.5 , 0.5 ], \"dim256\" : [ 0.0 , 0.0 , 0.0 , 0.0 ], \"dim64\" : [ 0.0 , 0.0 , 0.0 , 0.0 ], \"dim32\" : [ 0.0 , 0.0 , 0.0 , 0.0 ] } } }\n\nGenerate an embedding for the text 'The quick brown fox jumped' over the image 'https://upload.wikimedia.org/wikipedia/commons/thumb/f/f1/P520055.JPG/512px-P520055.JPG' . ..." (the same prompt-and-output block then repeats three more times before the response cuts off)

Those links show this: [screenshot of the linked image, not included here]

I feed that into a similarity call and get: [screenshot of the result, not included here]

So I'm just trying to make sense of it all. Apparently my davinci call has errors, or the system is really down :woman_shrugging:
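
(For reference, embeddings come from the dedicated /v1/embeddings endpoint rather than from a completion model like davinci, which will just hallucinate text like the above. A minimal axios sketch, assuming the same apiKey as in the earlier code:)

// Sketch: request a real embedding from the embeddings endpoint
async function getEmbedding(text) {
    const response = await axios({
        method: 'post',
        url: 'https://api.openai.com/v1/embeddings',
        headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
        data: { 'model': 'text-embedding-ada-002', 'input': text }
    });
    return response.data['data'][0]['embedding']; // an array of floats
}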

Technically, stop words are downplayed due to their lack of value, so they could be removed from prompts to make token usage more efficient.

And yes, AI can also read grammatically incorrect sentences pretty well too.


“She had a gray hair” → “She had gray hair”
“Pass me the salt” → “Pass me salt”
“He dropped the paper” → “He dropped paper”
“I killed the deer” → “I killed deer”

I see we now have a passed-tense situation.


Yes, seems like a plan. I think I am going to spend some time with these best practices. It seems like you can lock the bot down to only answering questions about a given subject just by telling it to do so in the assistant input, and use triple quotes to provide information to use in answering questions. Using the phrase "pretend to be a" is powerful. It almost looks like the bot can be trained using the system and assistant inputs, making it a powerful tool for business right now. I could see some templates being created that could easily be modified by casual store owners to provide good information on their sites. I suppose I need to find out about the bot writing a file so it could take orders or reservations. ChatPC claims to do it on the Mac, but I suspect an overlay could scrape the results and do the work of creating a ticket or order.
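
Something like this is what I mean, as a rough sketch (the airline name and wording here are made up):

// Sketch: lock the bot to one subject and preload reference info
const messages = [
    { "role": "system", "content": "Pretend to be a booking agent for Acme Air Shuttle. Only answer questions about the shuttle and its sightseeing flights." },
    { "role": "assistant", "content": "\"\"\"Flights depart daily at 9am and 2pm. Sightseeing tours of the mountain last 90 minutes.\"\"\"" },
    { "role": "user", "content": "When is the next flight?" }
];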

I am going to bag the vector work for now and work on some templates, see if I can train a bot to be a small airline's agent. I don't think I need to wait for training to be released; that will be icing on the cake. The real power of these system and assistant inputs just got big, from what my testing reveals.

I am already preloading the assistant input from a file, letting the user cut and paste a template into their bot. Perhaps templates might become a thing.

I have a friend who owns a local air shuttle with sightseeing trips to the mountain here. I am going to see if I can get a bot set up as her online agent, then just link her to it and see if she falls in love with it or not :sunglasses:


Good luck! Feel free to post your updates or issues here so everyone can learn from your experiences.


Thank you, Michael. I made a new topic about a template I made that nearly worked yesterday, but today the bot refuses to impersonate or pretend to be anyone.