Buggy assistants behaviour: dozens of messages appended per run

patrickk · February 8, 2024, 12:07am

Hi there, I’m trying to assess some Assistants API behaviour, and I can’t tell whether it’s behaving strangely because of my input or because of a bug.

I’m seeing a single run append dozens of messages (in one case around 35) to a thread. In my case, I’m using gpt-3.5-turbo, I begin with a thread consisting of some user and assistant messages, and I create a run with override instructions directing the assistant to classify the conversation so far and respond with a small bit of JSON. All of the resulting messages seem to be well-formed JSON and appropriate responses to the instructions; I just don’t understand why there are dozens of them.

An edit for clarity about what I’m doing: I start this classification run right after completing another run, in which the assistant appended a message responding to the user. I don’t append any additional messages before starting the classifier run. It strikes me that this may be an unusual usage pattern for the Assistants API.

Has anyone else encountered something like this?

patrickk · February 8, 2024, 1:23am

I’ve whipped up a script which reproduces the issue reliably. I think it’s a bug. The forum won’t let me post a link, so here it is inline (Ruby):

require 'json' # standard library; used for JSON.parse and to_json
require 'httparty'
require 'awesome_print'

# This script demonstrates an issue with the OpenAI Assistants API. The second run started in this script results in dozens of messages being appended to the thread, where we might have expected just one.

# httparty and awesome_print are prerequisites; install them with:
# gem install httparty awesome_print

# You must also fill in your OpenAI API key, and the ID for an assistant. Any assistant should do.


API_KEY = "**** YOUR KEY HERE"
ASSISTANT_ID = "**** YOUR ASSISTANT ID HERE"



common_headers = {
      "OpenAI-Beta" => "assistants=v1",
     "Content-Type" => "application/json",
    "Authorization" => "Bearer #{API_KEY}"
}



# Make a thread

response = HTTParty.post("https://api.openai.com/v1/threads", headers: common_headers)
thread = JSON.parse(response.body)


# Populate the thread with one user message, and one assistant message after the first run completes.

response = HTTParty.post("https://api.openai.com/v1/threads/#{thread['id']}/messages", headers: common_headers, body: {
    role: 'user',
    content: "Is it legal to ride your bike on the sidewalk?",
  }.to_json
)
message = JSON.parse(response.body)

response = HTTParty.post("https://api.openai.com/v1/threads/#{thread['id']}/runs", headers: common_headers, body: {assistant_id: ASSISTANT_ID}.to_json)
run = JSON.parse(response.body)

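loop do
  puts "First run, polling ... "

  response = HTTParty.get("https://api.openai.com/v1/threads/#{thread['id']}/runs/#{run['id']}", headers: common_headers)
  polled_run = JSON.parse(response.body)

  # Keep polling only while the run is still in flight, so a failed, expired or cancelled run cannot loop forever.
  break unless %w[queued in_progress cancelling].include?(polled_run['status'])
  sleep 1
end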

response = HTTParty.get("https://api.openai.com/v1/threads/#{thread['id']}/messages", headers: common_headers)
messages = JSON.parse(response.body)
ap messages


# For the second run, we set up override instructions which direct the assistant to output JSON, and we don't append any additional messages.

body = {
  assistant_id: ASSISTANT_ID,
  instructions: "Your task is to choose one of three outcomes based on the based on the recent interactions between the User and the chatbot. You may choose: a) \"happy\", b) \"sad\", or c) \"neutral\" depending on the sentiment of the conversation. Your task is NOT to answer the User's question or respond to the user's input.

    Your output should consist of valid JSON. Your output must be one of:

      {\"sentiment\": \"happy\"}
      {\"sentiment\": \"sad\"}
      {\"sentiment\": \"neutral\"}

    Your output must consist ONLY of one of these three JSON outputs."
}

response = HTTParty.post("https://api.openai.com/v1/threads/#{thread['id']}/runs", headers: common_headers, body: body.to_json)
run = JSON.parse(response.body)

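loop do
  puts "Second run, polling ... "

  response = HTTParty.get("https://api.openai.com/v1/threads/#{thread['id']}/runs/#{run['id']}", headers: common_headers)
  polled_run = JSON.parse(response.body)

  # Keep polling only while the run is still in flight, so a failed, expired or cancelled run cannot loop forever.
  break unless %w[queued in_progress cancelling].include?(polled_run['status'])
  sleep 1
end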


# By default the messages endpoint returns at most 20 messages per request, so this response may be capped at 20 even if the thread now contains more.

response = HTTParty.get("https://api.openai.com/v1/threads/#{thread['id']}/messages", headers: common_headers)
messages = JSON.parse(response.body)
ap messages
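
Because a single messages request is capped, the real size of the problem only shows up once every page has been read. The following is a minimal sketch of how that count might be taken, not part of the original script. It assumes the list response carries `data`, `has_more` and `last_id` fields and that the endpoint accepts `limit` and `after` query parameters. The names `fetch_page`, `all_messages` and `messages_per_run` are hypothetical helpers introduced here for illustration.

```ruby
require 'json'

# Collects every message in a thread by following the list endpoint's cursor.
# fetch_page is a hypothetical callable: given a cursor (or nil for the first
# page) it returns the parsed JSON of one messages list response.
def all_messages(fetch_page)
  messages = []
  cursor = nil
  loop do
    page = fetch_page.call(cursor)
    messages.concat(page['data'])
    break unless page['has_more']
    cursor = page['last_id']
  end
  messages
end

# Counts messages per run. User messages have a nil run_id, so they are
# grouped under the nil key.
def messages_per_run(messages)
  messages.group_by { |m| m['run_id'] }.transform_values(&:size)
end

# Possible wiring with the script above (assumes thread and common_headers
# from that script are in scope):
#
#   fetch_page = lambda do |cursor|
#     query = { limit: 100 }
#     query[:after] = cursor if cursor
#     url = "https://api.openai.com/v1/threads/#{thread['id']}/messages"
#     JSON.parse(HTTParty.get(url, headers: common_headers, query: query).body)
#   end
#   ap messages_per_run(all_messages(fetch_page))
```

If the second run's ID maps to a count well above one, that would confirm the behaviour described in this post without relying on a single capped response.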

supershaneski · February 8, 2024, 1:58am

I tested this in my own code and can confirm that, when the second assistant is called, the run loops for a number of tries and keeps sending messages.

Edit:
I checked the generated messages to see what it was trying to do and which functions it was calling. After further testing, and after handling what it was doing, the number of messages dropped to a suitable level in my case.

patrickk · February 8, 2024, 3:16am

What kinds of things did you do to change how the assistant behaves in this case?

supershaneski · February 8, 2024, 4:20am

My first test was pretty inconclusive. It could have been that I was missing some tool handlers.

So I tested your example. Using gpt-3.5-turbo-0125, I was not able to recreate the problem.

[
    {
        "id": "msg_IXH9WtIWoBcLt1fzXXR92Fda",
        "object": "thread.message",
        "created_at": 1707365501,
        "thread_id": "thread_OhncHRUg3WtmssAdlH9C5X6s",
        "role": "assistant",
        "content": [
            {
                "type": "text",
                "text": {
                    "value": "{\"sentiment\": \"neutral\"}",
                    "annotations": []
                }
            }
        ],
        "file_ids": [],
        "assistant_id": "asst_ASSISTANT_2",
        "run_id": "run_FVuS9kIc1fBL6OFzG7NArajm",
        "metadata": {}
    },
    {
        "id": "msg_Ah9pJ2PrsmD2oBvdARj5H2LZ",
        "object": "thread.message",
        "created_at": 1707365493,
        "thread_id": "thread_OhncHRUg3WtmssAdlH9C5X6s",
        "role": "assistant",
        "content": [
            {
                "type": "text",
                "text": {
                    "value": "Laws regarding riding bikes on sidewalks vary depending on the local jurisdiction. In many places, it is legal to ride a bike on the sidewalk, but some areas have specific restrictions or regulations. It's best to check with your city or town's laws or local authorities to determine if it is legal to ride a bike on the sidewalk in your area.",
                    "annotations": []
                }
            }
        ],
        "file_ids": [],
        "assistant_id": "asst_ASSISTANT_1",
        "run_id": "run_7YVpfm4tOiA4UZTyiDNQQrUW",
        "metadata": {}
    },
    {
        "id": "msg_npAGzQOriEefWRe3jwjm9RyT",
        "object": "thread.message",
        "created_at": 1707365492,
        "thread_id": "thread_OhncHRUg3WtmssAdlH9C5X6s",
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": {
                    "value": "Is it legal to ride your bike on the sidewalk?",
                    "annotations": []
                }
            }
        ],
        "file_ids": [],
        "assistant_id": null,
        "run_id": null,
        "metadata": {}
    }
]

You did not give the first assistant’s instructions, so I just used “You are a helpful assistant.” Neither assistant has retrieval or functions enabled.
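
For anyone comparing results, a check along these lines might make the comparison less manual. It is only a sketch, assuming messages shaped like the JSON above (a `run_id` plus a `content` array with a `text` part). `sentiment_for_run` and `ALLOWED_SENTIMENTS` are hypothetical names, not part of any library. It raises if a run produced anything other than exactly one message, which is the symptom reported in this thread.

```ruby
require 'json'

# The three outputs permitted by the classifier instructions in the script.
ALLOWED_SENTIMENTS = %w[happy sad neutral].freeze

# Returns the sentiment produced by the given run.
# Raises ArgumentError if the run appended zero or several messages, or if
# the message is not one of the three expected JSON outputs.
def sentiment_for_run(messages, run_id)
  produced = messages.select { |m| m['run_id'] == run_id }
  unless produced.size == 1
    raise ArgumentError, "expected 1 message from #{run_id}, got #{produced.size}"
  end

  part = produced.first['content'].find { |c| c['type'] == 'text' }
  raise ArgumentError, "no text content in message from #{run_id}" unless part

  parsed = JSON.parse(part['text']['value'])
  sentiment = parsed.is_a?(Hash) ? parsed['sentiment'] : nil
  unless ALLOWED_SENTIMENTS.include?(sentiment)
    raise ArgumentError, "unexpected classifier output: #{part['text']['value']}"
  end

  sentiment
end
```

Called with the parsed `data` array of a messages response and the second run's ID, it returns the sentiment string when the run behaved as intended.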
