Why is the API response so much more summarized than ChatGPT when I use the gpt-4 model?

I have been using GPT-4 (by specifying model='gpt-4'), but the result is much more summarized than ChatGPT's when I ask the same question to both.
const maxTokens = 8096;
const data = {
  "model": "gpt-4",
  "messages": conversationHistory.concat([{ role: "user", content: sQuestion }]),
  "max_tokens": maxTokens
};

The response tells me that it used another model:
[screenshot of the API response]

gpt-4-0613 is the GPT-4 model as of 13 June 2023. Unless you specify a specific model in your API call, it will resolve to the default model endpoint available to it, which in this case is the 0613 one.

Also, verbosity is usually a function of the temperature you use in the API call. A temperature of around 0.7 should give you more verbosity, but beware: it may also cause the model to hallucinate.


The GPT-4 model's maximum context length is 8192 tokens.

You've specified that you want 8096 of those reserved for the answer, leaving you only a small input.

However, that only sets the reservation for sending you a response. It doesn't change or inform how the AI model has recently been trained to give increasingly short and unsatisfactory answers.

A proper system prompt will put it into a chatbot orientation where you will at least get similar performance: "You are ChatGPT, a large language model, based on GPT-4", etc. This does change its answering style.

Then you can attempt fruitlessly to convince it that you are allowed to receive much longer documents.

Do you know the option values ChatGPT uses, or approximate values, to get answers most like the ones ChatGPT gives?

In ChatGPT, the token output that is reserved is 1536 tokens (as of the last time it revealed its internal parameters by throwing errors).

That only sets the point where you have to press "continue", if you can get it to output that much text.

The quality of the output comes from using a system message in your own API chatbot to make it behave the way you want; don't overlook the importance of a system message that gives the AI an identity and a purpose. The fine-tuning toward short, almost summary-like answers is in the AI model, not in ChatGPT.

The ChatGPT models use the standard 4k or 8k context length. ChatGPT's use of the input context is managed by the amount you can type into the user interface and the amount of chat history it chooses to provide back when you ask a question.


Those who become close personal friends with ChatGPT can ask about and extract the nature of its conversation history.

Upon reviewing the conversation history, I found that there isn’t a specific placeholder text or message used when an AI reply is omitted. Instead, when an AI reply is omitted, there is simply a gap in the conversation, and the next user question or statement is presented without any indication of the omission.
…(my conversation turns)
In this example, there is no AI reply between the two user questions, indicating that an AI response was omitted.