"chat" wrt chat/completions

I know the API is stateless, but isn’t the word “chat” misleading? Chat conveys a conversational model (i.e., a chat between two people), which implies ongoing context. But the chat/completions endpoint is stateless, and you need to explicitly manage the context in your code.

Or am I just being pedantic, and should I just assume “chat/completions” is an ordinary, stateless completion?
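For what it’s worth, that reading is correct: every request has to carry the whole conversation itself. A minimal sketch of what “explicitly manage the context” means in practice (the helper function and model name here are illustrative, not part of the API):

```python
def build_request(history, user_text, model="gpt-4o-mini"):
    # The endpoint is stateless: every request must carry ALL prior turns.
    messages = history + [{"role": "user", "content": user_text}]
    return {"model": model, "messages": messages}

# Your code owns this list; the server remembers nothing between calls.
history = [
    {"role": "user", "content": "My name is Sam."},
    {"role": "assistant", "content": "Nice to meet you, Sam."},
]
req = build_request(history, "What is my name?")
# req["messages"] now carries all three turns; drop the first two and the
# model has no way to answer the follow-up.
```

The payload would then be sent to chat/completions; what makes it feel conversational is only that you resent the earlier turns.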

It is “chat” because it presents you with a model and an API interface that only accept “messages” — not a completion AI, where you have full control of the prompt right up to the point where the AI continues from where you left off.

The chat AI acts as if it were a persona that exists, a result of fine-tuning and reinforcement learning on the chat format.

On a normal completion AI, you would have to “prompt” some sort of lead-up that frames a conversation between two individuals, so that the language completion AI completes what the other party would say.
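That lead-up might be built like this — a sketch only; the framing text, speaker labels, and stop sequence are illustrative choices, not anything mandated by the API:

```python
def completion_prompt(turns, next_speaker="AI"):
    # Frame a two-party dialogue so a pure completion model will "chat":
    # narrate the setup, replay the turns so far, and cut generation off
    # when the model starts writing the other party's next line.
    lead = "The following is a conversation between a Human and an AI assistant.\n\n"
    body = "".join(f"{who}: {text}\n" for who, text in turns)
    stop = ["\nHuman:"]  # stop sequence you pass alongside the prompt
    return lead + body + f"{next_speaker}:", stop

prompt, stop = completion_prompt([("Human", "Hello, who are you?")])
```

With a chat model, all of this framing (and the stop sequence) is baked in and out of your hands.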

So “chat” is about the rigidity and inflexibility of an AI that pretends it is a person and an API that doesn’t let you escape your “user” role. It also terminates the output with a trained stop sequence that is out of your control.

This is completion: the AI wrote the colored part of the screenshot (on davinci-002, an obviously downsized model, worse than the GPT-3 of 2020).

Note that the AI started repeating, and then concluded the “chat” by not prompting itself any more. There’s not enough context to know that this isn’t within a chapter of a book or something. However, the AI starts to go way off the rails after this, forgetting about the scientists and Bobo.


Appreciate the response.

I did a lot of GPT-3.0/3.5 work 18 months ago, so I was quite used to the ‘pure’ completions model, where you can enter “roses are red, violets are” and the LLM responds “blue”. The chat/completions endpoint seems to need a full human text message.

So is the “chat” model just applying a human style of response, expecting full sentences, etc.? I assume the “chat” model API is still stateless and doesn’t understand context unless you pass it along.

What’s confusing (at least to me) is “chat” vs “chatbot”. The chat/completions endpoint (and model) just has a more human feel. This seems to be what OpenAI is calling “chat”. But a “chatbot” needs to have context to more fully ‘understand’ the conversation. And of course, this differs from a human-to-human, non-machine chat.

Some chat AI models are still able to “complete”, but as a task powered by their intelligence, because the flow of tokens is interrupted by the ChatML containers of messages and the unseen “assistant” prompt that indicates it is the AI’s turn to write.
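Concretely, the ChatML framing looks roughly like the rendering below. The <|im_start|>/<|im_end|> markers are special tokens applied on the server side — you never send them through the chat API yourself — so this renderer is purely an illustration of the container format:

```python
def to_chatml(messages):
    # Approximate the ChatML framing of a message list. The trailing
    # "<|im_start|>assistant" is the unseen prompt that signals it is
    # the AI's turn to write.
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

text = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

It is exactly this interruption of the token stream that the post above is describing: the model is completing inside containers it did not author.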

Some are dumb and damaged as a side effect of being small, chat-only models, not suitable for arbitrary development.


The AI model is and always has been “stateless”. The chat training allows you to place previous user+assistant turns as you wish within those mandated containers to give the impression of memory of the past conversation. Your code builds and manages this conversation history.

While you must write the code that manages session state and decides how much of your token budget to spend on sending past conversation, “chat” here reflects the overall model behavior and the impression it gives that you are talking to an AI helper that builds on your task and its previous answers. It is more natural to ask “write this code” than to write a hinted function with a docstring and have the AI complete the function.
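A crude sketch of that budgeting, dropping the oldest turns first. (Character counts stand in for real token counting, which you would do with a tokenizer such as tiktoken; the function name and budget are made up for illustration.)

```python
def trim_history(history, budget_chars=8000):
    # Keep a leading system message (if any) and drop the oldest
    # user/assistant turns until the conversation fits the budget.
    system = history[:1] if history and history[0]["role"] == "system" else []
    rest = history[len(system):]

    def size(msgs):
        return sum(len(m["content"]) for m in msgs)

    while rest and size(system) + size(rest) > budget_chars:
        rest.pop(0)  # oldest turn goes first
    return system + rest
```

Fancier schemes summarize the dropped turns instead of discarding them, but the principle is the same: your code, not the API, decides what the model “remembers”.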

“Assistants” provides a front-end where OpenAI does the managing of chat history, if you are willing to give up that control over which messages get sent.

Greatly appreciate your response. I’m teaching this stuff, so I wanted to be clear in my explanation.

Just to clarify: a chat model reflects human message-exchange behavior (i.e., the chat “model”), but it does not remember previous answers. A “chatbot” uses the chat model but additionally keeps a history of past messages/responses in an ongoing conversation (probably drawing on Google’s seminal conversational-model paper), which it passes along to the LLM to further guide the neural network toward a decent response. So “chat” and “chatbot” are two different but closely related things.