Alternative to Assistant or how to reduce response time?

jblake · July 14, 2024, 5:37pm

I currently store all Assistants, Threads, and Messages in our database.
When using an Assistant, the response time goes from about 1 second to 10+ seconds, even when streaming, once the instructions exceed 8k and the thread holds a large number of messages. I believe this is because the whole conversation and the instructions are sent back to the model for every response.
I’d like the AI to have the history of previous messages, specific instructions, and files attached to the AI, while still allowing the user to add files.
What is the best way to get fast responses with these features?
Could I store the previous messages and attach them to the Assistant as a file?
Why does OpenAI return so much data every time, and can we turn off everything but the message?

razvan.i.savin · July 14, 2024, 9:35pm

First, retrieve only the last 2 messages. Are you retrieving messages with the default limit of 20? params = {'order': 'desc', 'limit': '20'}
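
A minimal sketch of that suggestion, assuming the official `openai` Python SDK and its `client.beta.threads.messages.list` call. The helper name `latest_messages` is made up for illustration, and the client is passed in so the function does not depend on how it was configured.

```python
from typing import Any, List


def latest_messages(client: Any, thread_id: str, limit: int = 2) -> List[Any]:
    """Return the newest `limit` messages of a thread, oldest first.

    `client` is assumed to be an `openai.OpenAI()` instance (hypothetical
    helper, not part of the SDK).
    """
    page = client.beta.threads.messages.list(
        thread_id=thread_id,
        order="desc",  # newest first, so `limit` keeps the most recent ones
        limit=limit,
    )
    # The SDK returns a page object whose `.data` attribute holds the messages.
    return list(reversed(page.data))
```

With a real client this would be called as `latest_messages(OpenAI(), "thread_...")`. Note that this only reduces what is downloaded after a run; it does not change what the run itself sends to the model.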

How I retrieve messages:

github.com

SavinRazvan/flexiai/blob/main/chat.py

# chat.py
import os
import logging
import platform
from flexiai.core.flexiai_client import FlexiAI
from flexiai.config.logging_config import setup_logging

def clear_console():
    """Clears the console depending on the operating system."""
    if platform.system() == "Windows":
        os.system('cls')
    else:
        os.system('clear')

def main():
    # Set up logging using your custom configuration
    setup_logging()
    
    # Initialize FlexiAI
    flexiai = FlexiAI()

This file has been truncated.

jblake · July 15, 2024, 5:43pm

Thank you for the suggestion, but the issue is the time from the request to the first delta when streaming, since the Assistant sends all of the history with every request.

For example, a new thread will return a message in 0.5 seconds, but a thread with 10 messages and 8k of text will take 20+ seconds, even if the current question is only a sentence.
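
To compare threads on exactly this number, a small timing helper can be used. This is only a sketch: `time_to_first_event` and its parameters are hypothetical names, and `start_stream` stands for any callable that starts the request and returns an iterable of events.

```python
import time
from typing import Any, Callable, Iterable, Tuple


def time_to_first_event(
    start_stream: Callable[[], Iterable[Any]],
    is_wanted: Callable[[Any], bool] = lambda event: True,
) -> Tuple[float, Any]:
    """Return (seconds elapsed, event) for the first event accepted by `is_wanted`.

    The clock starts before `start_stream()` is called, so the time spent
    sending the request is included in the measurement.
    """
    started = time.perf_counter()
    for event in start_stream():
        if is_wanted(event):
            return time.perf_counter() - started, event
    raise RuntimeError("stream ended without a matching event")
```

With the `openai` SDK, `start_stream` could be a lambda around `client.beta.threads.runs.create(..., stream=True)`, and `is_wanted` could check `event.event == "thread.message.delta"` so that run status events are skipped and only the first text delta is timed.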

https://platform.openai.com/docs/api-reference/runs
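
One option on that page that may help: as I read the reference, creating a run accepts a `truncation_strategy` (with `type` set to `last_messages`) and a `max_prompt_tokens` value, which limit how much of the thread is passed to the model. A sketch of building those options follows; `run_options` is a made-up helper and the default of 4 messages is an arbitrary assumption, not a recommendation from the docs.

```python
from typing import Any, Dict, Optional


def run_options(
    assistant_id: str,
    last_messages: int = 4,
    max_prompt_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a run that only sees the newest messages."""
    if last_messages < 1:
        raise ValueError("last_messages must be at least 1")
    options: Dict[str, Any] = {
        "assistant_id": assistant_id,
        # Ask the API to keep only the N most recent thread messages.
        "truncation_strategy": {
            "type": "last_messages",
            "last_messages": last_messages,
        },
    }
    if max_prompt_tokens is not None:
        # Optional upper bound on prompt tokens used across the run.
        options["max_prompt_tokens"] = max_prompt_tokens
    return options
```

It would be passed along as `client.beta.threads.runs.create(thread_id=..., stream=True, **run_options("asst_..."))`. My understanding is that this trims only the thread messages; the instructions are still sent in full, so an 8k instruction block would still count against every request.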
