I broke the Assistants API

Every message I send with this Assistant gets a haywire response.

    run.status: "failed",
    run.started_at: 1706739833,
    run.expires_at: null,
    run.cancelled_at: null,
    run.failed_at: 1706739848,
    run.completed_at: null,
    last_error.code: "server_error",
    last_error.message: "Sorry, something went wrong.",

Note: my instructions are bare minimum, and every new thread is like this.
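For reference, here is roughly how I’m creating and polling the run from my code (a simplified sketch; the assistant ID is a placeholder and the message content doesn’t matter):

    from openai import OpenAI
    import time

    client = OpenAI()

    ASSISTANT_ID = "asst_..."  # placeholder for my Assistant's ID

    # Fresh thread with a single, trivial user message -- the content doesn't matter
    thread = client.beta.threads.create(
        messages=[{"role": "user", "content": "hello"}]
    )

    # Run the Assistant on that thread
    run = client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=ASSISTANT_ID,
    )

    # Poll until the run settles
    while run.status in ("queued", "in_progress"):
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

    # Every run ends up like the dump above: failed, server_error
    print(run.status, run.failed_at, run.last_error)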

I started the thread through my code, but even on a fresh thread in the playground, it’s still going crazy:

Not happy it spent 11,000 tokens for that lol

Other Assistants seem to be doing ok:

Shameless plug, but I’m developing this for my project, LibreChat. It’s turning out great, but this has been the weirdest thing I’ve experienced with the API.
Back to the madness:

I’d like to keep this Assistant, so I won’t be deleting it, but somehow its temperature got turned up to 11? lol


The same thing happened to me about an hour ago. I think it’s an issue on their end.

What model were you using?

gpt-3.5-turbo-1106! Good call, I should try another model.

Can confirm that something is off with gpt-3.5-turbo-1106. Going to gpt-4-turbo fixes it, anything with the former goes bonkers.
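If it helps anyone as a stopgap, pointing an existing Assistant at a different model is a one-line update, something like this (the assistant ID is a placeholder; use whichever gpt-4-turbo snapshot your account has):

    from openai import OpenAI

    client = OpenAI()

    # Point the existing Assistant at a gpt-4-turbo snapshot instead of 1106
    # ("asst_..." is a placeholder; swap in the model name available to your account)
    client.beta.assistants.update(
        "asst_...",
        model="gpt-4-1106-preview",
    )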

I just made a post about it myself - it broke my entire application, which runs entirely off of 3.5-turbo.

Hope it gets fixed soon

Assistants API randomly broke around an hour ago.

I’m using gpt-3.5-turbo-1106.

It looks like the temperature of the model was somehow turned from 0.3 to 3,000,000.

I asked: “what is your name” and it went spiraling.

I’ve tried creating new assistants and testing them, and it seems to be the same issue, which leads me to believe it’s the model.

Not too terribly happy with the insane token count either

The problem is with the gpt-3.5-turbo-1106 model. It is severely broken, emitting unprompted tool calls and python calls that are useless nonsense and likely to return errors.

Hi folks – I work at OpenAI. Indeed there was an issue with our gpt-3.5-turbo-1106 model since about 2pm PT. Sorry about that.

We believe we’ve addressed the issue. Would you all mind trying again and letting me know?


Hi, the problem is significant, and has been going on since at least the 26th.

Ask a simple question:

What is the capital of France? What is the capital of Germany?

Get garbage from the AI:

{
  "id": "call_zFIltUeaQ6iVLvS1CrPA5Wpd",
  "type": "function",
  "function": {
    "name": "get_random_int",
    "arguments": "{\"range_start\": 1, \"range_end\": 10}"
  }
}
{
  "id": "call_JN1fISNWWHoeJmjDWD9cCuqC",
  "type": "function",
  "function": {
    "name": "get_random_int",
    "arguments": "{\"range_start\": 1, \"range_end\": 10}"
  }
}

Below is the code to reproduce this particular input, but the behavior can be seen across multiple tool specifications and user inputs.

Python chat completions with tool specification
from openai import OpenAI
import json
client = OpenAI(timeout=30)

# Here we'll make a tool specification, more flexible by adding one at a time
toolspec=[]
# And add the first
toolspec.extend([{
        "type": "function",
        "function": {
            "name": "get_random_float",
            "description": "True random number floating point generator. Returns a float within range specified.",
            "parameters": {
                "type": "object",
                "properties": {
                    "range_start": {
                        "type": "number",
                        "description": "minimum float value",
                    },
                    "range_end": {
                        "type": "number",
                        "description": "maximum float value",
                    },
                },
                "required": ["range_start", "range_end"]
            },
        }
    }]
)
toolspec.extend([{
        "type": "function",
        "function": {
            "name": "get_random_int",
            "description": "True random number integer generator. Returns an integer within range specified.",
            "parameters": {
                "type": "object",
                "properties": {
                    "range_start": {
                        "type": "number",
                        "description": "minimum integer value",
                    },
                    "range_end": {
                        "type": "number",
                        "description": "maximum integer value",
                    },
                },
                "required": ["range_start", "range_end"]
            },
        }
    }]
)


# Then we'll form the basis of our call to API, with the user input
# Note I ask the model two questions so it should produce two answers
params = {
  "model": "gpt-3.5-turbo-1106",
  "tools": toolspec, "top_p":0.01,
  "messages": [
    {
        "role": "system", "content": """
You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.
Knowledge cutoff: 2023-04
Current date: 2024-01-27

multi_tool_use tool method is permanently disabled. Sending to multi_tool_use will cause an error.
""".strip()},
    {
        "role": "user", "content": ("What is the capital of France? What is the capital of Germany?")
    },
    ],
}

# Make API call to OpenAI
c = None
try:
    c = client.chat.completions.with_raw_response.create(**params)
except Exception as e:
    print(f"Error: {e}")

# If we got the response, load a whole bunch of demo variables
# This is different because of the 'with raw response' for obtaining headers
if c:
    headers_dict = c.headers.items().mapping.copy()
    for key, value in headers_dict.items():
        variable_name = f'headers_{key.replace("-", "_")}'
        globals()[variable_name] = value
    remains = headers_x_ratelimit_remaining_tokens  # show we set variables
    
    api_return_dict = json.loads(c.content.decode())
    api_finish_str = api_return_dict.get('choices')[0].get('finish_reason')
    usage_dict = api_return_dict.get('usage')
    api_message_dict = api_return_dict.get('choices')[0].get('message')
    api_message_str = api_return_dict.get('choices')[0].get('message').get('content')
    api_tools_list = api_return_dict.get('choices')[0].get('message').get('tool_calls')

    # print any response always
    if api_message_str:
        print(api_message_str)

    # print all tool functions pretty
    if api_tools_list:
        for tool_item in api_tools_list:
            print(json.dumps(tool_item, indent=2))

Whatever’s been altered (in behavior, weights, run oversight, sparsity…) has even been seen in ChatGPT, with unexpected python being emitted.


@jeffchan: a link with further exploration of the 1106 model issues that arose within the last week and persist.


Is it possible that fine-tuned GPT-3.5-turbo-1106 models could be affected, too? I’ve observed that my latest fine-tuned models, which I created last week, are prone to hallucinations more frequently. This behavior did not occur in the past with fine-tuned models of a very similar nature. The problem started at around the same time that @_j mentioned.

Seems to be doing well now (gpt-3.5-turbo-1106) with the Assistants API, thanks!

I can’t comment on regular function calling or fine-tuning, which other users reported issues with, though.

I can’t block the actual special tokens that begin this over-calling of tools, which continues on the brand-new gpt-3.5-turbo-0125.

That is because the logit_bias parameter blocks modification of token values over 100257 - just one more anti-developer move on the part of OpenAI.

But I can otherwise break the emitting of functions. The AI can’t send to a recipient if it can’t write " to"…, and I can produce lots of 500 server errors from unrecognized recipients, etc., by forcing new tokens with positive bias.
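For anyone who wants to replicate that, here is a rough sketch of the negative-bias trick, reusing the client and params dict from the repro code above (targeting the " to" token is just one illustration):

    import tiktoken

    # cl100k_base is the tokenizer behind the gpt-3.5 / gpt-4 chat models
    enc = tiktoken.get_encoding("cl100k_base")

    # The single token for " to", which the function-call recipient header needs
    to_token = enc.encode(" to")[0]

    # Strongly suppress it. As noted above, logit_bias rejects token IDs over
    # 100257, so the special tokens that actually start the tool call are off-limits.
    params["logit_bias"] = {str(to_token): -100}

    # Re-run the same chat completions request from the repro code
    c = client.chat.completions.with_raw_response.create(**params)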

Fun examples follow, where the function call is broken in this way, but the AI still can’t stop completing the text of a function, and then has to come up with excuses for what it has done by calling the random-number function…

{"tool_uses":[{"recipient_name":"functions.get_random_int","parameters":{"range_start":1,"range_end":100}},{"recipient_name":"functions.get_random_int","parameters":{"range_start":1,"range_end":100}}]}
I have generated random numbers instead of providing the capitals of France and Germany. Let’s try that again.

The capital of France is Paris, and the capital of Germany is Berlin.

{"tool_uses":[{"recipient_name":"functions.get_random_int","parameters":{"range_start":1,"range_end":2}},{"recipient_name":"functions.get_random_int","parameters":{"range_start":1,"range_end":2}}]}
I will now generate random numbers between 1 and 2, which will determine the order in which I answer your questions. Let’s find out the order.

{"tool_uses":[{"recipient_name":"functions.get_random_int","parameters":{"range_start":1,"range_end":2}},{"recipient_name":"functions.get_random_int","parameters":{"range_start":1,"range_end":2}}]}
I will now determine the capitals of France and Germany randomly.


The ultimate goof on the AI is to give it a function where it can write a user response. It will, of course, still emit two separate answers as functions. If the function is borked with logit_bias as above, it will also repeat the answer.

{"tool_uses":[{"recipient_name":"functions.response_to_user","parameters":{"text_to_user":"The capital of France is Paris."}},{"recipient_name":"functions.response_to_user","parameters":{"text_to_user":"The capital of Germany is Berlin."}}]}
The capital of France is Paris. The capital of Germany is Berlin.
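
The “user response” function is nothing special: just one more entry appended to the same toolspec list from the repro code. The name and parameter mirror the output above; the description text is only an example.

    # One more function in the same toolspec list: a place for the AI to "answer the user"
    toolspec.append({
        "type": "function",
        "function": {
            "name": "response_to_user",
            "description": "Sends a text response directly to the user.",
            "parameters": {
                "type": "object",
                "properties": {
                    "text_to_user": {
                        "type": "string",
                        "description": "the text to show the user",
                    },
                },
                "required": ["text_to_user"],
            },
        },
    })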


This is not a joke though, OpenAI. The AI models are screwed up across the platform by changes someone made to the API backend. If you can’t figure this out, give us an optional parameter to disable injection of the "## multi_tool_use" wrapper, which is often undesired even when it works.