Is it just me, or does GPT-4 consistently cut off its answers before finishing?

Hello everyone, wishing you a blessed and productive day.

The situation is that even though I enter a short input and expect a short answer, it doesn’t finish the answer 90% of the time. In the early days of GPT-4 I was able to go very long on both input and output, but now it seems unable to complete even a simple task.

Is it only me?

5 Likes

It hasn’t happened to me. I receive full responses most of the time. It only cuts off in really long answers.

Are you asking for formatted answers, for instance in tables with columns?

Actually I’m giving very detailed inputs - honestly, I didn’t even change my question style from last week to this week, but the outcome is not the same :confused:

Nope, happens to me ALL THE TIME.

2 Likes

Yeah it’s been a couple of days since I had a complete response from GPT-4 and it seems to be getting worse.

Hella slow too.

Am putting it down to teething problems and staying patient, giving GPT-3.5 another run, but it’s really not on the same level.

Didn’t take me long to become this spoiled! Fingers crossed the OpenAI team figures it out soon. Am sure they’re working like maniacs on this. Just insane demand I imagine.

2 Likes

Oof, especially the slowness… the combination of incomplete answers and the 25-message limit makes it even worse :smiley:

1 Like

Using the GPT-4 API here, the time it takes seems to be a function of the output length. Your input prompt also influences your output length. If you ask it something simple or silly, it will give you a quick answer. If you ask a question that begs for a long-winded answer, it will give one, and this takes longer. By short I mean 5-10 seconds; long is closer to 40-45 seconds.

Here is an example of a silly input (short output) and a serious input (long output), with two runs each, timed. Note that there is no constraint on max_tokens; it defaults to the model’s limit, which in this case is 8k tokens total.

import os
import time
import requests

# Standard auth headers; the API key is read from the environment.
headers = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "Content-Type": "application/json",
}

payload = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a polite honest chatbot, but you think you are MC Hammer and will always say and admit that you are MC Hammer."},
        {"role": "user", "content": "What time is it?"}
    ]
}

TICK = time.time()
response = requests.post('https://api.openai.com/v1/chat/completions', headers=headers, json=payload)
TOCK = time.time()  # stop the clock right after the call so printing isn't timed
r = response.json()
print(r["choices"][0]["message"]["content"])
print(f"Time taken: {TOCK-TICK} seconds.")

# Run #1: (silly input prompt)
# I'm sorry, I'm unable to provide you with the current time as I am not connected to real-time data. By the way, I'm MC Hammer!
# Time taken: 5.085137844085693 seconds.


# Run #2: (silly input prompt)
# As an AI, I do not have the ability to provide real-time information. Please note that despite being MC Hammer, I still function as a chatbot. I suggest checking your device's clock or using an online service to find the current time.
# Time taken: 9.88294506072998 seconds.

payload = {
    "model": "gpt-4",
    "temperature": 1.0,
    "messages": [
        {"role": "user", "content": "How many restaurant inspectors does it take to change a lightbulb?  Answer in the style of the philosopher Ludwig Wittgenstein."}
    ]
}

TICK = time.time()
response = requests.post('https://api.openai.com/v1/chat/completions', headers=headers, json=payload)
TOCK = time.time()  # again, time only the API call itself
r = response.json()
print(r["choices"][0]["message"]["content"])
print(f"Time taken: {TOCK-TICK} seconds.")

# Run #1: (more serious input prompt)
# The question of how many restaurant inspectors it takes to change a lightbulb is not one which can readily be answered within the framework of Wittgenstein's philosophy. Wittgenstein suggests that the meaning of language resides in its use, and as such, questions concerning the number of people required for a particular task must be considered according to their practical significance within the relevant language game.
# To provide an answer in the style of Wittgenstein, one might say: "What we have here is the asking of a question that, when viewed within the language game it attempts to inhabit, loses its sense. The purpose of a restaurant inspector is not to change lightbulbs, but to evaluate the sanitary conditions and overall quality of restaurants. The question you propose cannot be sensibly answered, as it misuses the terms 'restaurant inspector' and 'change a lightbulb' from their respective language games."
# Nevertheless, one could reinterpret the question as a kind of joke, in which case the punchline may hinge upon the unexpected application of a restaurant inspector's primary task. In the spirit of Wittgenstein, such an answer might be: "Only one, but they must first ensure the lightbulb has been thoroughly cleaned and meets all health and safety standards."
# Time taken: 39.4455361366272 seconds.

# Run #2: (more serious input prompt)
# To answer this question in the style of Ludwig Wittgenstein, one must first examine the underlying intentions and meanings attributed to each element of the question. Within the context of Wittgenstein's work, the inquiry is not just about the number of restaurant inspectors needed to change a lightbulb, but rather, about the relationships between language, meaning, reality, and the ways in which we can understand the world around us.
# In Wittgenstein's "Tractatus Logico-Philosophicus," he famously stated, "The limits of my language mean the limits of my world." This implies that our understanding of reality is contingent upon the language we use. In this case, the question is formulated as a riddle or a joke, which brings to light the complex meanings and context behind the words used.
# Similarly, in his later work "Philosophical Investigations," Wittgenstein argued that meaning is found in the use of language within specific contexts, or what he referred to as "language games." The question of restaurant inspectors and light bulbs can be seen as one such language game.
# Given these considerations, a possible Wittgensteinian response to the question would be: "One must engage with the language game in which the question arises to understand the meaning behind it. However, if we are looking to unpack the literal sense of the phrase, we must consider the specific context in which these restaurant inspectors find themselves, the circumstances that led them to change a lightbulb, and the ineffable nuances or qualia that could factor in."
# Time taken: 45.98539185523987 seconds.

3 Likes

Thanks for the response, Curt! It seems like the GPT-4 API is functioning with fewer problems. As someone who uses davinci-003, I also don’t run into any problems, but the web application itself has been causing problems lately.

2 Likes

I believe you. It sure sounds like it’s the web-based version you are using, not the API or the underlying model.

1 Like

Same issue: for 2-3 days answers have been a tenth the length of last week’s, and they often cut off mid-sentence, a lot like GPT-3 did when it was first released and the servers were struggling.
It looks like it’s having some network issue while sending the answer and doesn’t attempt to reconnect. To my knowledge, the model generates the whole answer before it replies, and the word-by-word display is just a visual effect to slow down users, but I’m not sure about that?

1 Like

Happened to me 3 times when generating a response to a simple question about some Linux commands.
It happened each time I regenerated the answer too, though it cut off in different places.

The network tab showed that it connected to an OpenAI backend API called “Moderation”, apparently didn’t get an answer, and just cut off.
I’m guessing that’s the API it uses for additional checks that it isn’t saying anything hazardous, and that when it doesn’t get a response it just stops.
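
For what it’s worth, the public moderation endpoint over the API looks like the sketch below; whether the web app’s “Moderation” call hits the same endpoint is just my guess. It assumes `headers` set up as in the API examples earlier in the thread.

import requests

# Sketch of the public moderations endpoint (assumed, not confirmed, to
# resemble what the web app calls internally).
resp = requests.post(
    "https://api.openai.com/v1/moderations",
    headers=headers,
    json={"input": "Some model output to check"},
)
result = resp.json()["results"][0]
print(result["flagged"])      # True if any category was tripped
print(result["categories"])   # per-category booleans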

I bypassed the situation by telling GPT-4 that this happened and where it cut off, and it continued from there in the next response.
Kudos to it for being able to do that, but since there’s a limit on how many queries one can make per time window, it would be great if it didn’t cut off.
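
On the API side you can automate the same trick: check finish_reason on the choice, and if it’s "length" (the output was truncated by the token cap), feed the partial answer back and ask it to continue. A minimal sketch, assuming the same headers as in the examples above; the prompt is just a placeholder:

import requests

URL = "https://api.openai.com/v1/chat/completions"
messages = [{"role": "user", "content": "Explain these Linux commands in detail: tar, rsync, awk."}]

full_answer = ""
while True:
    resp = requests.post(URL, headers=headers, json={"model": "gpt-4", "messages": messages}).json()
    choice = resp["choices"][0]
    full_answer += choice["message"]["content"]
    if choice["finish_reason"] != "length":  # "stop" means it finished naturally
        break
    # Feed the partial answer back and ask for the rest. Note the conversation
    # grows each round, so this can itself run into the context limit.
    messages.append(choice["message"])
    messages.append({"role": "user", "content": "You were cut off. Please continue exactly where you stopped."})

print(full_answer)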

1 Like

I’ve had the same problem for two or three days. I haven’t been able to use GPT-4 efficiently for two days as I constantly get error messages. It is also super slow (way slower than before), and the responses are not complete; they stop in the middle of sentences :face_with_diagonal_mouth:

1 Like

It got a bit better for me today. Any change for you guys?

I’m having a similar issue with the GPT-4 API. What’s the maximum output length you guys are getting? I haven’t seen anyone mention it across the forum. Mine is consistently in the 500-620ish word range with the GPT-4 API, whereas I’m targeting 1,000 words, which should be well within the 8K limit given my prompt.
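
If you want to see whether the model is hitting a hard cap or just choosing to stop early, the response itself carries the evidence. A quick sketch, assuming headers set up as in the earlier examples (the essay prompt is just an illustration):

import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json={"model": "gpt-4",
          "messages": [{"role": "user", "content": "Write a 1,000-word essay on soil health."}]},
).json()

print(resp["choices"][0]["finish_reason"])  # "stop" = model chose to end; "length" = token cap hit
print(resp["usage"])                        # prompt_tokens / completion_tokens / total_tokens

If finish_reason is "stop", the model simply decided it was done at ~600 words; that’s a prompting issue, not a truncation issue.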

1 Like

Can’t access it AT ALL; will try later tonight. Not entirely ChatGPT’s fault.

lol, it’s fixed. They added a “keep generating” thing.

Hi dude,
It seems you should allow more tokens to solve the problem (e.g. change max_tokens=128 to max_tokens=1024). At least it works well for me, but it costs more lol.
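
For reference, max_tokens is just another field in the request payload. A minimal sketch, assuming headers as in the earlier examples and a placeholder prompt:

import requests

payload = {
    "model": "gpt-4",
    "max_tokens": 1024,  # cap on the completion; prompt + completion must fit the 8k context
    "messages": [{"role": "user", "content": "Summarize the plot of Hamlet in detail."}],
}
resp = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload).json()
print(resp["choices"][0]["message"]["content"])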

It’s happening to me today. The repro steps: it happens when my question has lots of tokens (7,000-8,000) in GPT-4. If I use fewer than 4,000 tokens, it works fine. This is not an issue with max_tokens because there’s no error; the streaming starts correctly but cuts off unexpectedly in the middle of a sentence.
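
For anyone trying to pin down where a streamed response stops, here is a minimal sketch of reading the SSE stream and printing the final finish_reason; it assumes headers as in the examples above, and the prompt is just a stand-in:

import json
import requests

payload = {
    "model": "gpt-4",
    "stream": True,
    "messages": [{"role": "user", "content": "Write a long, detailed answer about TCP congestion control."}],
}
# Stream the response chunk by chunk and watch for the finish_reason on the last one.
with requests.post("https://api.openai.com/v1/chat/completions",
                   headers=headers, json=payload, stream=True) as resp:
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)["choices"][0]
        print(chunk["delta"].get("content", ""), end="", flush=True)
        if chunk["finish_reason"] is not None:
            print(f"\n[finish_reason: {chunk['finish_reason']}]")

A genuine network drop should show up as the stream ending with no finish_reason at all, which would distinguish it from an ordinary "length" cutoff.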

What is the output token length that you’re expecting?