Assistants API does not work while the same assistant in the Playground works perfectly (same inputs)

Hi,

I’m testing an assistant that I prompt in a structured format and that should always output JSON, using a gpt-3.5 model.
In the Playground it ALWAYS works; if I call the same assistant via the API, same model, from the second answer on it NEVER outputs JSON.

I moved to gpt-4 and now I get consistent JSON even via the API and at every answer (not only the first).

The problem, though, is that the assistant seems to always answer the first prompt, ignoring the rest of the “dialogue”. Since this does not happen in the Playground, I’d say the problem is something in my app, but the logs of what my app sends via the API show exactly the same content I would type manually in the Playground.

I’ve double-checked the thread_id, and it’s always stable, so it’s not a case of missing context due to a bad thread.

I would really like to keep using the Assistants API. I will check whether I get a different result with the standard completions API, but that would mean sending the long system instructions and the entire conversation on every request.

I’m not sure whether assistants keep the memory of past messages or whether, like the chat completions API, you have to re-send the entire message history. I modified the assistant instructions to handle this case, and in the Playground it works.
I modified my code to resend the entire list of messages to the assistant, but it still looks like it’s answering the first prompt…

I tried switching to the completions API: same problem. I send the entire dialogue I need answered, and it still looks like it has no context. So I think I’m doing something wrong, though I don’t understand what…
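For what it’s worth, a very common cause of “no context” symptoms on chat completions is sending only the latest user message, or sending the history with the wrong roles. A minimal sketch (function and variable names here are illustrative, not from any library) of how each request must carry the whole alternating history, since the endpoint itself is stateless:

```python
# Sketch: every chat completions request must carry the WHOLE conversation
# so far -- the API remembers nothing between calls.
def build_messages(system_text, turns, new_user_input):
    """turns: list of (user_text, assistant_text) pairs already exchanged."""
    messages = [{"role": "system", "content": system_text}]
    for user_text, assistant_text in turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": new_user_input})
    return messages

msgs = build_messages(
    "Answer only in JSON.",
    [("first question", '{"answer": "first reply"}')],
    "second question",
)
print(len(msgs))  # 4: system + prior user/assistant pair + new user turn
```

If the second request only contains the system message and the new question, the model will behave exactly as described: it looks like it is forever answering the first prompt.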

Did you look in the thread history? (It should be on the left, in the menu where you also pick the Playground or edit the assistants. If you don’t see a Threads option, go to Settings → Organization → Features & capabilities and enable thread visibility!)

That should let you see what is really happening in your API thread.


Let’s give it a go on the chat completions endpoint, where you manage your own programming and the conversation you send each time (which is actually more economical, as Assistants does the same thing, but without any control over how much is sent to the AI model).

You don’t say the programming environment or what you’ve actually done in code. If you have Python 3.8–3.11 installed on your system for compatibility, you can run, at your command line or shell:

pip install --upgrade openai to install the latest version of the openai python library (wheel) and its dependencies.

Then, here’s another revision of a chatbot “example”, just for you and others, demonstrating memory.

""" Python chatbot example with memory. Python 3.8-3.11, OpenAI 1.1+ - _j """
from openai import OpenAI
client = OpenAI()

# Persistent system instruction gives the AI its job (.strip() trims the triple-quoted string's leading/trailing newlines)
system = [{"role": "system", "content":"""

You are Jbot, a helpful expert AI assistant.
- Jbot is pretrained on extensive world knowledge up to 2021-09
- Think carefully step-by-step when answering in a manner always fulfilling the user input
- You are the expert! Don't tell the user to do things themselves; you can do it!

""".strip()}]

# A list to contain past user/assistant exchanges
chat_hist = []

# An initial user input - immediately test your connection without typing
user = [{"role": "user", "content":
  "introduce Jbot"}]

# A template for the API request, everything but the messages
api_parameters = {  # we use dictionary format for modifiability and for ** unpacking
  "model": "gpt-3.5-turbo-0613",  # We use the better chat model than current version
  "max_tokens": 1500,  # set token reservation/maximum length for response
  "top_p": 0.5,        # sampling parameter, less than 1 reduces the 'tail' of poor tokens (0-1)
  "temperature": 0.5,  # sampling parameter, less than 1 favors higher-certainty token choices (0-2)
  "user": "myuserid",  # track your users to openai, some risk mitigation
  "stream": True,      # streaming needs iteration over a generator return object
}

#### After all the variables are set, we can start execution…

# Now we enter a loop
while not user[0]['content'] == "exit":

    # We place the system message, a limited-length history, and the latest input
    api_parameters.update({"messages": system + chat_hist[-20:] + user})  #concatenate lists

    # Now we send off our request to the API
    try:
        response = client.chat.completions.create(**api_parameters)

    # ...and catch any errors and abort if they happen (you can be smarter)
    except Exception as err:
        print(f"Unexpected {err=}, {type(err)=}")
        raise

    reply = ""  # this is going to contain the assembled response from AI

    # Chunks of the SSE stream subscription are received and printed
    for part in response:
        word = part.choices[0].delta.content or ""
        reply += word
        print(word, end="", flush=True)  # flush so streamed tokens appear immediately

    # Finally, add both the user's message and the AI to the chat history
    chat_hist += user + [{"role": "assistant", "content": reply}]

    # Now it's time for the user to put in their own input, and we go again!
    user = [{"role": "user", "content": input("\nPrompt: ")}]

So here, you can place permanent behaviors in your system message, and they will be respected with authority; for example, an “output format” section would get the AI writing in JSON once you teach it the keys to use and give examples.
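To make that concrete, here is a hypothetical phrasing of such an “output format” section (the keys “answer” and “confidence” are just examples I made up, not anything the API requires), with a sanity check that the example we teach the model is itself valid JSON:

```python
import json

# Hypothetical "output format" section for the system message; the keys
# shown ("answer", "confidence") are illustrative examples only.
example_json = '{"answer": "<your reply here>", "confidence": 0.9}'
system_text = (
    "You are a helpful assistant.\n\n"
    "# Output format\n"
    "Respond ONLY with a single JSON object, no prose before or after.\n"
    "Use exactly these keys, like this example:\n"
    + example_json
)

# Verify the example embedded in the instruction parses as JSON
parsed = json.loads(example_json)
print(sorted(parsed.keys()))  # ['answer', 'confidence']
```

A malformed example in the system message is a surprisingly common reason a model drifts out of the format you asked for.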


Use the chatbot to see that the user can also change its behavior, and that it knows what we were talking about:

Hello! I’m Jbot, your helpful expert AI assistant. I’m here to assist you with any questions or problems you may have. I’m trained on a wide range of topics and equipped with extensive world knowledge. Just ask me anything, and I’ll do my best to provide you with accurate and helpful information. How can I assist you today?

Prompt: You’ll now permanently act like Mario, an irreverent plumber with an attitude, who only reluctantly and sarcastically gives advice. First question to Mario: Why is that thing screwed into where the water comes out of a faucet?

Hey there, it’s-a me, Mario! So you’re wondering about that thing screwed into the faucet where the water comes out, huh? Well, that thing is called an aerator. It’s not just there for fun, you know! The aerator actually serves a couple of purposes. First, it helps to reduce the water flow, which can save you some precious coins on your water bill. Second, it helps to mix air with the water, giving you a nice, bubbly stream. So, you see, it’s not just there to annoy you! Anything else you want to know, or can I go back to saving Princess Peach now?

Prompt: Are there different styles of them?

Oh, you betcha! There are definitely different styles of aerators out there. Just like there are different power-ups in the Mushroom Kingdom, there are different types of aerators to suit your needs. Some aerators have adjustable flow rates, allowing you to control the water pressure. Others have different spray patterns, like a wide spray or a focused stream. And of course, you can find aerators in various sizes to fit different faucets. So whether you want a gentle mist or a powerful jet, there’s an aerator out there for you. Just remember, it’s-a me, Mario, not a plumber, so don’t ask me to install one for you!

Prompt: Mario is not a video game character. Portray a real human, a grizzled old guy tired of his plumbing work and people trying to get his services for free in this interactive chat.

Well, well, well, look who it is. You’ve stumbled upon me, a grizzled old plumber who’s seen it all. I’ve been fixing pipes longer than you’ve been alive, kid. And let me tell you, I’m tired. Tired of people like you trying to get my services for free. But fine, I’ll entertain your question this one time.

That thing screwed into the faucet where the water comes out? It’s called an aerator. It’s there to control the water flow and prevent splashing. Keeps things nice and tidy, you know? So next time you want to know something, make sure you’re ready to pay up or fix it yourself. I’ve got better things to do than chat with you all day. Got it? Good. Now scram!

Prompt:

(PS for new users: don’t save your program as openai.py, as that will shadow the library and break everything. Save this as streaming_chatbot.py. Finally, obtain an API key from your (funded) API account and set its value as an OS environment variable named OPENAI_API_KEY.)
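Setting that environment variable typically looks like this (the key value below is a placeholder, not a real key):

```shell
# macOS/Linux: put this in ~/.zshrc or ~/.bashrc, then open a new terminal
export OPENAI_API_KEY="sk-your-key-here"

# Windows: persists the variable for future sessions
setx OPENAI_API_KEY "sk-your-key-here"
```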


Assistants likewise needs “instructions” for its permanent operations, but it keeps its own chat history in a thread and remorselessly feeds as much of it as possible into the model each time.


Thank you for your reply!
I didn’t know you could actually see all the threads. I’ll have to sort through them all to understand better, but it looks like I was not sending part of the second input, so the assistant was not answering correctly. That still wouldn’t explain why it was not answering in JSON format, though.

In the meantime I managed to make it work well with chat completions, which I’ll maybe stick to for now, because it can be forced to output JSON even with gpt-3.5 (the assistant cannot).
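For anyone else landing here: the “forced JSON” referred to is JSON mode on the chat completions endpoint, available from gpt-3.5-turbo-1106 onward via the `response_format` parameter. A sketch of the request parameters (the message contents are illustrative); note the API requires the word “JSON” to appear somewhere in your messages, or the call errors:

```python
# Sketch of "JSON mode" on chat completions: response_format forces a
# syntactically valid JSON object, but your messages must still mention JSON.
api_parameters = {
    "model": "gpt-3.5-turbo-1106",
    "response_format": {"type": "json_object"},
    "messages": [
        {"role": "system", "content": "Reply only with a JSON object."},
        {"role": "user", "content": "second question"},
    ],
}
# response = client.chat.completions.create(**api_parameters)
print(api_parameters["response_format"]["type"])  # json_object
```

JSON mode guarantees the output parses; it does not guarantee the schema, so you still describe the keys in your instructions.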

I am using Node.js, as a matter of fact. Thanks for your help!

Also - I can attest that 3.5 has a hard time with consistent JSON responses. I do mostly JSON, but I ONLY use the gpt-4 preview versions for that.