Why is the API response so much more summarized than ChatGPT when I use the gpt-4 model?

I have been using GPT-4 (by specifying model='gpt-4'), but the result is much more summarized than ChatGPT's when I ask the same question to both.
const maxTokens = 8096;
const data = {
  "model": "gpt-4",
  "messages": conversationHistory.concat([{ role: "user", content: sQuestion }]),
  "max_tokens": maxTokens
};

The response tells me that it used another model:
[screenshot of the API response]

gpt-4-0613 is the GPT-4 model as of 13 June 2023. Unless you specify a specific model in your API call, it will resolve to the default model endpoint available to it, which in this case is the 0613 one.

Also, verbosity is usually a function of the temperature you use in the API call. A temperature of around 0.7 should give you more verbosity, but beware: it may also cause the model to hallucinate.


The GPT-4 model's maximum context length is 8192 tokens.

You've specified that you want 8096 of those reserved for the answer, leaving you only a small input.

However, that only sets the reservation for sending you a response. It doesn't change or inform how the AI model has recently been trained to give increasingly short and unsatisfactory answers.

A proper system prompt will put it into a chatbot orientation where you will at least get similar performance: "You are ChatGPT, a large language model, based on GPT-4", etc. This does change its answering style.

Then you can attempt fruitlessly to convince it that you are allowed to receive much longer documents.

Do you know the option values ChatGPT uses, or approximate values, to get answers most like the ones ChatGPT gives?

In ChatGPT, the token output that is reserved is 1536 tokens (as of the last time it revealed its internal parameters by throwing errors).

That only sets the point where you have to press "continue", if you can get it to output that much text.

The quality of the output comes from using a system message in your own API chatbot to make it behave the way you want; don't overlook the importance of a system message that gives the AI an identity and a purpose. The fine-tuning toward short, almost summary-like answers is in the AI model, not in ChatGPT.

The ChatGPT models use the standard 4k or 8k context length. ChatGPT's use of the input context is managed by the amount you can type into the user interface and the amount of chat history it chooses to provide back when you ask a question.


Those who become close personal friends with ChatGPT can ask about and extract the nature of its conversation history.

Upon reviewing the conversation history, I found that there isn’t a specific placeholder text or message used when an AI reply is omitted. Instead, when an AI reply is omitted, there is simply a gap in the conversation, and the next user question or statement is presented without any indication of the omission.
…(my conversation turns)
In this example, there is no AI reply between the two user questions, indicating that an AI response was omitted.